Making Models
This document follows the 2_making_models.ipynb tutorial.
Basic Components
Models can be created by subclassing the ComponentModel
class. Models estimate the energy, area, and leakage power of a component. Each model
defines the following:
component_name: The name of the component. This may also be a list of components if multiple aliases are used.
priority (optional): Determines which model is favored when multiple models support a given query. Higher-priority models are used first. Must be between 0 and 1; defaults to 0.5. Ties are broken by how closely each model matches the query (see get_model()).
A call to super().__init__(area, leak_power, subcomponents). This is used to initialize the model and set the area and leakage power.
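The priority rule can be illustrated with a small standalone sketch. It does not use hwcomponents: the Candidate class, the pick_model function, and the match_score field are hypothetical stand-ins, and the real tie-breaking criterion is whatever get_model() implements.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical record describing one model that supports a query."""

    name: str
    priority: float = 0.5     # must lie in [0, 1]; 0.5 mirrors the documented default
    match_score: float = 0.0  # hypothetical closeness-of-match measure (an assumption)


def pick_model(candidates: list[Candidate]) -> Candidate:
    """Pick the highest-priority candidate, breaking ties by match score."""
    for c in candidates:
        if not 0 <= c.priority <= 1:
            raise ValueError(f"priority of {c.name} must be between 0 and 1")
    # Tuples compare element by element, so priority is compared first and
    # match_score only matters when priorities are equal.
    return max(candidates, key=lambda c: (c.priority, c.match_score))


candidates = [
    Candidate("TernaryMAC", priority=0.3, match_score=1.0),
    Candidate("GenericMAC", priority=0.5, match_score=0.2),
    Candidate("OtherMAC", priority=0.5, match_score=0.8),
]
chosen = pick_model(candidates)
print(chosen.name)  # OtherMAC
```

Here the two models with priority 0.5 beat the one with priority 0.3 even though it matches best, and the tie between them goes to the closer match.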
Models can also have actions. Actions are functions that return the cost of a
specific action. For the TernaryMAC model, we have an action called
mac that returns the energy and latency of a ternary MAC operation. The
action() decorator makes this function visible as an
action. The function should report energy in Joules and latency in seconds; in the
example below, it returns them in an ActionCost, which also carries a throughput.
Models can also be scaled to support a range of different parameters. For example,
the TernaryMAC model can be scaled to support a range of different technology nodes.
This is done by calling the self.scale function in the __init__ method of the
model. The self.scale function takes the following arguments:
parameter_name: The name of the parameter to scale.
parameter_value: The value of the parameter to scale.
reference_value: The reference value of the parameter.
area_scaling_function: The scaling function to use for area. Use None if no scaling should be done.
energy_scaling_function: The scaling function to use for dynamic energy. Use None if no scaling should be done.
latency_scaling_function: The scaling function to use for latency. Use None if no scaling should be done.
leak_scaling_function: The scaling function to use for leakage power. Use None if no scaling should be done.
include_subcomponents: Whether to also scale the subcomponents by the same factors. Defaults to True.
Note: Area, Energy, Latency, and Leak Power are always in alphabetical order in the function arguments.
Many different scaling functions are defined and available in
hwcomponents.scaling.
from hwcomponents import ComponentModel, action, ActionCost
from hwcomponents.scaling import (
    tech_node_area,
    tech_node_energy,
    tech_node_leak,
    tech_node_latency,
    tech_node_throughput,
)
class TernaryMAC(ComponentModel):
    """
    A ternary MAC unit, which multiplies two ternary values and accumulates the result.

    Parameters
    ----------
    accum_n_bits : int
        The width of the accumulator in bits.
    tech_node : float
        The technology node in meters.

    Attributes
    ----------
    accum_n_bits : int
        The width of the accumulator in bits.
    tech_node : float
        The technology node in meters.
    """

    component_name: str | list[str] = "TernaryMAC"
    """ Name of the component. Must be a string or list/tuple of strings. """

    priority = 0.3
    """
    Priority determines which model is used when multiple models are available for a
    given component. Higher priority models are used first. Must be a number between 0
    and 1.
    """

    def __init__(self, accum_n_bits: int, tech_node: float):
        # Provide an area and leakage power for the component. All units are in
        # standard units without any prefixes (Joules, Watts, meters, etc.).
        super().__init__(area=5e-12 * accum_n_bits, leak_power=1e-3 * accum_n_bits)

        # Scale tech_node to the target from the 40nm reference.
        self.tech_node = self.scale(
            "tech_node",
            tech_node,
            40e-9,
            area_scale_function=tech_node_area,
            energy_scale_function=tech_node_energy,
            latency_scale_function=tech_node_latency,
            throughput_scale_function=tech_node_throughput,
            leak_power_scale_function=tech_node_leak,
        )
        self.accum_n_bits = accum_n_bits
        assert (
            4 <= accum_n_bits <= 8
        ), f"Accumulation number of bits {accum_n_bits} outside supported range [4, 8]"

    # The action decorator makes this function visible as an action. Return an
    # ActionCost; throughput defaults to 1/latency if not given.
    @action
    def mac(self, clock_gated: bool = False):
        """
        Returns the cost of one ternary MAC operation.

        Parameters
        ----------
        clock_gated : bool
            Whether the MAC is clock gated during this operation.

        Returns
        -------
        ActionCost
            The cost of this action.
        """
        self.logger.info("TernaryMAC Model is estimating energy for mac.")
        if clock_gated:
            return ActionCost(energy=0.0, throughput=float("inf"), latency=0.0)
        return ActionCost(
            energy=0.002e-12 * (self.accum_n_bits + 0.25),
            throughput=float("inf"),
            latency=0.0,
        )

mac = TernaryMAC(accum_n_bits=8, tech_node=16e-9) # Scale the TernaryMAC to 16nm
cost = mac.mac()
print(
    f"TernaryMAC energy is {cost.energy:.2e}J (throughput {cost.throughput:.2e} actions/s). "
    f"Area is {mac.area:.2e}m^2. Leak power is {mac.leak_power:.2e}W"
)
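To make the role of the reference value concrete, the arithmetic behind this kind of scaling can be sketched without the library. The quadratic function below is an illustrative assumption, not the model used by tech_node_area or any other function in hwcomponents.scaling, and scale_factor is a hypothetical helper rather than part of the API.

```python
from typing import Callable, Optional


def quadratic(target: float, reference: float) -> float:
    """Illustrative scaling function: the multiplier grows with the square of the ratio."""
    return (target / reference) ** 2


def scale_factor(
    target: float,
    reference: float,
    scaling_function: Optional[Callable[[float, float], float]],
) -> float:
    """Return the multiplier for moving from `reference` to `target`.

    Passing None means "do not scale", so the multiplier is 1.
    """
    if scaling_function is None:
        return 1.0
    return scaling_function(target, reference)


reference_node = 40e-9  # the model was characterized at 40nm
target_node = 16e-9     # the node we want estimates for
base_area = 5e-12 * 8   # unscaled TernaryMAC area for accum_n_bits=8, in m^2

area_factor = scale_factor(target_node, reference_node, quadratic)
scaled_area = base_area * area_factor
print(f"Area factor: {area_factor:.3f}, scaled area: {scaled_area:.2e} m^2")
```

Under this assumed quadratic rule, moving from 40nm to 16nm multiplies area by (16/40)^2 = 0.16. The library's own functions may give different factors.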
Scaling by Number of Bits
Some actions may depend on the number of bits being accessed. For example, you may
want to charge for the energy and latency per bit of a DRAM read. To do this, you can
use the bits_per_action argument of the action()
decorator. This decorator takes a string that is the name of the parameter to scale by.
For example, we can scale the energy and latency of a DRAM read by the number of bits
being read. In this example, the DRAM yields width bits per read, so energy and
latency are scaled by bits_per_action / width.
class LPDDR4(ComponentModel):
    """LPDDR4 DRAM energy model."""

    component_name = ["DRAM", "dram"]
    priority = 0.3

    def __init__(self):
        super().__init__(area=100e-3, leak_power=1e-3)
        self.width = 1

    # bits_per_action scales energy and latency by N and throughput by 1/N.
    @action(bits_per_action="width")
    def read(self) -> ActionCost:
        """
        Returns the cost of one read.

        Parameters
        ----------
        bits_per_action : int
            The number of bits to read.

        Returns
        -------
        ActionCost
            The cost of this action.
        """
        return ActionCost(energy=8e-12, throughput=float("inf"), latency=0)

lpddr4 = LPDDR4()
r1 = lpddr4.read(bits_per_action=1)
r50 = lpddr4.read(bits_per_action=50)
print(f"Read energy for one bit: {r1.energy}")
print(f"Read energy for fifty bits: {r50.energy}")
print(f"Read throughput for one bit: {r1.throughput}")
print(f"Read throughput for fifty bits: {r50.throughput}")
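The scaling rule described above (energy and latency multiplied by bits_per_action / width, throughput divided by it) can be checked with plain arithmetic. The sketch below does not use hwcomponents: the Cost class and scale_by_bits function are hypothetical, and the latency and throughput values are made up so that the effect is visible (the LPDDR4 example uses zero latency and infinite throughput, which scaling leaves unchanged).

```python
from dataclasses import dataclass


@dataclass
class Cost:
    """Hypothetical stand-in for an action cost; not the library's ActionCost."""

    energy: float      # Joules
    latency: float     # seconds
    throughput: float  # actions per second


def scale_by_bits(base: Cost, bits_per_action: int, width: int) -> Cost:
    """Scale the cost of one `width`-bit access to `bits_per_action` bits."""
    n = bits_per_action / width
    return Cost(
        energy=base.energy * n,          # more bits cost proportionally more energy
        latency=base.latency * n,        # and take proportionally longer
        throughput=base.throughput / n,  # so fewer actions complete per second
    )


# Energy per read is taken from the LPDDR4 example; the other values are assumed.
base = Cost(energy=8e-12, latency=1e-9, throughput=1e9)
one_bit = scale_by_bits(base, bits_per_action=1, width=1)
fifty_bits = scale_by_bits(base, bits_per_action=50, width=1)
print(f"One bit:    {one_bit}")
print(f"Fifty bits: {fifty_bits}")
```

With width equal to 1, a fifty-bit read costs fifty times the energy of a one-bit read, which is the ratio the two energy prints above should show.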
Compound Models
We can create compound models by combining multiple component models. Here, we’ll show
the SmartBufferSRAM model from the hwcomponents-library package. This is an SRAM
with an address generator that sequentially reads addresses in the SRAM.
We’ll use the following components:
An SRAM buffer
Two registers: one that holds the current address, and one that holds the increment value
An adder that adds the increment value to the current address
One new piece of functionality is used here: the subcomponents argument to the
ComponentModel constructor, which registers the
subcomponents of a model.
The area, energy, latency, and leak power of subcomponents will NOT be scaled by the
component’s area_scale, energy_scale, latency_scale, and
leak_power_scale; if you want to scale the subcomponents, either call self.scale
with include_subcomponents=True, call the scale_area, scale_energy,
scale_latency, or scale_leak_power methods with include_subcomponents=True,
or call any of these on the subcomponents directly.
from hwcomponents_cacti import SRAM
from hwcomponents_library import AladdinAdder, AladdinRegister
from hwcomponents import ComponentModel, action, ActionCost
import math
class SmartBufferSRAM(ComponentModel):
    """
    An SRAM with an address generator that sequentially reads addresses in the SRAM.

    Parameters
    ----------
    tech_node: The technology node in meters.
    width: The width of the read and write ports in bits.
    depth: The number of entries in the SRAM, each `width` bits.
    n_rw_ports: The number of read/write ports.
    n_banks: The number of banks.
    """

    component_name = ["smart_buffer_sram", "smartbuffer_sram", "smartbuffersram"]
    priority = 0.3

    def __init__(
        self,
        tech_node: float,
        width: int,
        depth: int,
        n_rw_ports: int = 1,
        n_banks: int = 1,
    ):
        self.sram: SRAM = SRAM(
            tech_node=tech_node,
            size=width * depth,
            width=width,
            depth=depth,
            n_rw_ports=n_rw_ports,
            n_banks=n_banks,
        )
        self.address_bits = max(math.ceil(math.log2(depth)), 1)
        self.width = width
        self.address_reg = AladdinRegister(width=self.address_bits, tech_node=tech_node)
        self.delta_reg = AladdinRegister(width=self.address_bits, tech_node=tech_node)
        self.adder = AladdinAdder(width=self.address_bits, tech_node=tech_node)

        # If there are subcomponents, we can omit area and leak_power; their cost
        # accumulates from the subcomponents.
        super().__init__(
            subcomponents=[
                self.sram,
                self.address_reg,
                self.delta_reg,
                self.adder,
            ]
        )

    @action(bits_per_action="width")
    def read(self) -> ActionCost:
        """
        Returns the cost of a read operation.

        Parameters
        ----------
        bits_per_action: int
            The number of bits to read.

        Returns
        -------
        ActionCost
            The cost of this action.
        """
        # Subcomponent costs are combined automatically: energy and latency summed,
        # throughput min'd. Returning None is equivalent to zero per-component cost.
        self.sram.read(bits_per_action=self.width)
        self.address_reg.read()
        self.delta_reg.read()
        self.adder.add()

    @action(bits_per_action="width")
    def write(self) -> ActionCost:
        """
        Returns the cost of a write operation.

        Parameters
        ----------
        bits_per_action: int
            The number of bits to write.

        Returns
        -------
        ActionCost
            The cost of this action.
        """
        self.sram.write(bits_per_action=self.width)
        self.address_reg.write()
        self.delta_reg.read()
        self.adder.add()

smartbuffer_sram = SmartBufferSRAM(
    tech_node=16e-9,
    width=32,
    depth=1024,
    n_rw_ports=1,
    n_banks=1,
)
r = smartbuffer_sram.read(bits_per_action=32)
w = smartbuffer_sram.write(bits_per_action=32)
print(f"Read energy: {r.energy} J, throughput: {r.throughput} actions/s")
print(f"Write energy: {w.energy} J, throughput: {w.throughput} actions/s")
print(f"Area: {smartbuffer_sram.area} m^2")
print(f"Leak power: {smartbuffer_sram.leak_power} W")
The latency of subcomponents is generally summed. However, if the subcomponents are
pipelined for a given action, then the pipelined_subcomponents argument to the
action() decorator should be set to True. This will cause
the latency of the action to be the max of the latency returned and all subcomponent
latencies.
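The two combination rules can be sketched in plain Python. This illustrates the rules as described in this document and is not the library's implementation; the Cost class, the combine function, and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Cost:
    """Hypothetical stand-in for an action cost; not the library's ActionCost."""

    energy: float      # Joules
    latency: float     # seconds
    throughput: float  # actions per second


def combine(costs: list[Cost], pipelined: bool = False) -> Cost:
    """Combine the costs of an action and the subcomponent actions it triggers.

    Energy is always summed and throughput is limited by the slowest part.
    Latency is summed by default; if the parts are pipelined, the slowest
    part determines the latency instead.
    """
    latencies = [c.latency for c in costs]
    return Cost(
        energy=sum(c.energy for c in costs),
        latency=max(latencies) if pipelined else sum(latencies),
        throughput=min(c.throughput for c in costs),
    )


# Made-up costs for an SRAM read, a register read, and an add.
parts = [
    Cost(energy=2e-12, latency=1e-9, throughput=1e9),
    Cost(energy=1e-13, latency=2e-10, throughput=5e9),
    Cost(energy=3e-13, latency=4e-10, throughput=2.5e9),
]
serial = combine(parts)
pipelined = combine(parts, pipelined=True)
print(f"Serial latency:    {serial.latency:.2e} s")
print(f"Pipelined latency: {pipelined.latency:.2e} s")
```

Energy and throughput are the same in both cases; only the latency differs, since pipelining hides the shorter stages behind the longest one.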
Installing Models and Making them Globally Visible
An example model is provided in the notebooks/model_example directory, which can be
installed with the following command:
cd notebooks/model_example
pip3 install .
The README.md file in the notebooks/model_example directory contains information
on how to make models installable. Keep the following in mind while you’re changing the
model:
The model name should be prefixed with hwcomponents_. This allows HWComponents to find the model when it is installed.
The __init__.py file should import all Model classes that you’d like to be visible to HWComponents.
If you’re iterating on a model, you can use the pip3 install -e . command to install the model in editable mode. This allows you to make changes to the model without having to reinstall it.