Making Models

This document follows the 2_making_models.ipynb tutorial.

Basic Components

Models can be created by subclassing the ComponentModel class. Models estimate the energy, area, and leakage power of a component. Each model defines the following:

  • component_name: The name of the component. This may also be a list of components if multiple aliases are used.

  • priority (optional): Determines which model is favored when multiple models support a given query. Higher-priority models are used first. Must be between 0 and 1; defaults to 0.5. Ties are broken by how closely each model matches the query (see get_model()).

  • A call to super().__init__(area, leak_power, subcomponents). This is used to initialize the model and set the area and leakage power.
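To make the priority rule concrete, here is a minimal sketch of how a highest-priority-first selection with a tie-breaker could work. It is not the HWComponents implementation: the Candidate class, the match_score field, and pick_model are hypothetical names used only for illustration, and the real tie-breaking logic lives in get_model().

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical stand-in for a model that supports a given query."""

    name: str
    priority: float = 0.5  # same default as described above
    match_score: float = 0.0  # hypothetical measure of how closely the query matches


def pick_model(candidates):
    """Return the highest-priority candidate; ties go to the closer match."""
    for c in candidates:
        if not 0 <= c.priority <= 1:
            raise ValueError(f"{c.name}: priority {c.priority} is outside [0, 1]")
    return max(candidates, key=lambda c: (c.priority, c.match_score))


candidates = [
    Candidate("rough_estimate", priority=0.3, match_score=0.9),
    Candidate("default_a", match_score=0.4),
    Candidate("default_b", match_score=0.7),
]
chosen = pick_model(candidates)
print(chosen.name)
```

In this sketch the two default-priority candidates outrank the 0.3-priority one, and the tie between them is settled by the match score.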

Models can also have actions. Actions are methods that return the cost of a specific operation as an ActionCost, which holds the energy in Joules, the latency in seconds, and optionally the throughput in actions per second. For the TernaryMAC model, we have an action called mac that returns the cost of a ternary MAC operation. The action() decorator makes this method visible as an action.
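As a rough picture of what "visible as an action" can mean, the sketch below tags methods with a decorator and then lists the tagged ones. The names mark_action, list_actions, and ToyMAC are hypothetical, and this is only one way such a decorator could work; it is not a description of how the action() decorator is implemented.

```python
def mark_action(func):
    """Hypothetical decorator: tag a method so it can be discovered later."""
    func.is_action = True
    return func


def list_actions(obj):
    """Return the names of all methods on obj's class that carry the tag."""
    cls = type(obj)
    return [
        name
        for name in dir(cls)
        if getattr(getattr(cls, name), "is_action", False)
    ]


class ToyMAC:
    @mark_action
    def mac(self):
        # Placeholder cost values, chosen only for illustration.
        return {"energy": 1e-14, "latency": 0.0}

    def helper(self):
        # Not decorated, so it is not listed as an action.
        return None


print(list_actions(ToyMAC()))
```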

Models can also be scaled to support a range of different parameters. For example, the TernaryMAC model can be scaled to support a range of different technology nodes. This is done by calling the self.scale function in the __init__ method of the model. The self.scale function takes the following arguments:

  • parameter_name: The name of the parameter to scale.

  • parameter_value: The value of the parameter to scale.

  • reference_value: The reference value of the parameter.

  • area_scale_function: The scaling function to use for area. Use None if no scaling should be done.

  • energy_scale_function: The scaling function to use for dynamic energy. Use None if no scaling should be done.

  • latency_scale_function: The scaling function to use for latency. Use None if no scaling should be done.

  • throughput_scale_function: The scaling function to use for throughput. Use None if no scaling should be done.

  • leak_power_scale_function: The scaling function to use for leakage power. Use None if no scaling should be done.

  • include_subcomponents: Whether to also scale the subcomponents by the same factors. Defaults to True.

Note: Area, Energy, Latency, and Leak Power are always in alphabetical order in the function arguments.

Many different scaling functions are defined and available in hwcomponents.scaling.
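To illustrate the idea of a scaling function, the sketch below assumes that such a function takes the target value and the reference value and returns a multiplier for the reference quantity. The names linear_scale, quadratic_scale, and apply_scale are hypothetical and are not taken from hwcomponents.scaling; check that module for the functions and signatures it actually provides. The quadratic rule is also just an example and is not meant to describe how tech_node_area behaves.

```python
def linear_scale(target: float, reference: float) -> float:
    """Hypothetical scaling function: multiplier grows linearly with the parameter."""
    return target / reference


def quadratic_scale(target: float, reference: float) -> float:
    """Hypothetical scaling function: multiplier grows with the square of the parameter."""
    return (target / reference) ** 2


def apply_scale(value, target, reference, scale_function=None):
    """Scale a reference quantity; None means the quantity is left unchanged."""
    if scale_function is None:
        return value
    return value * scale_function(target, reference)


reference_area = 4e-11  # m^2, characterized at a 40 nm reference
scaled_area = apply_scale(reference_area, 16e-9, 40e-9, quadratic_scale)
unscaled_area = apply_scale(reference_area, 16e-9, 40e-9, None)
print(f"Scaled: {scaled_area:.2e} m^2, unscaled: {unscaled_area:.2e} m^2")
```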

from hwcomponents import ComponentModel, action, ActionCost
from hwcomponents.scaling import (
    tech_node_area,
    tech_node_energy,
    tech_node_leak,
    tech_node_latency,
    tech_node_throughput,
)


class TernaryMAC(ComponentModel):
    """
    A ternary MAC unit, which multiplies two ternary values and accumulates the result.

    Parameters
    ----------
    accum_n_bits : int
        The width of the accumulator in bits.
    tech_node : float
        The technology node in meters.

    Attributes
    ----------
    accum_n_bits : int
        The width of the accumulator in bits.
    tech_node : float
        The technology node in meters.
    """

    component_name: str | list[str] = "TernaryMAC"
    """ Name of the component. Must be a string or list/tuple of strings. """

    priority = 0.3
    """
    Priority determines which model is used when multiple models are available for a
    given component. Higher priority models are used first. Must be a number between 0
    and 1.
    """

    def __init__(self, accum_n_bits: int, tech_node: float):
        # Provide an area and leakage power for the component. All units are in
        # standard units without any prefixes (Joules, Watts, meters, etc.).
        super().__init__(area=5e-12 * accum_n_bits, leak_power=1e-3 * accum_n_bits)

        # Scale tech_node to the target from the 40nm reference.
        self.tech_node = self.scale(
            "tech_node",
            tech_node,
            40e-9,
            area_scale_function=tech_node_area,
            energy_scale_function=tech_node_energy,
            latency_scale_function=tech_node_latency,
            throughput_scale_function=tech_node_throughput,
            leak_power_scale_function=tech_node_leak,
        )
        self.accum_n_bits = accum_n_bits

        assert (
            4 <= accum_n_bits <= 8
        ), f"Accumulation number of bits {accum_n_bits} outside supported range [4, 8]"

    # The action decorator makes this function visible as an action. Return an
    # ActionCost; throughput defaults to 1/latency if not given.
    @action
    def mac(self, clock_gated: bool = False):
        """
        Returns the cost of one ternary MAC operation.

        Parameters
        ----------
        clock_gated : bool
            Whether the MAC is clock gated during this operation.

        Returns
        -------
        ActionCost
            The cost of this action.
        """
        self.logger.info("TernaryMAC Model is estimating energy for mac.")
        if clock_gated:
            return ActionCost(energy=0.0, throughput=float("inf"), latency=0.0)
        return ActionCost(
            energy=0.002e-12 * (self.accum_n_bits + 0.25),
            throughput=float("inf"),
            latency=0.0,
        )


mac = TernaryMAC(accum_n_bits=8, tech_node=16e-9)  # Scale the TernaryMAC to 16nm
cost = mac.mac()
print(
    f"TernaryMAC energy is {cost.energy:.2e}J (throughput {cost.throughput:.2e} actions/s). "
    f"Area is {mac.area:.2e}m^2. Leak power is {mac.leak_power:.2e}W"
)
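As a sanity check on the example above, the reference values that TernaryMAC computes before any technology-node scaling can be worked out by hand from the constants in the model. The printed results of the example will differ from these numbers by whatever factors the tech_node scaling functions apply when moving from 40nm to 16nm; those factors are not reproduced here.

```python
accum_n_bits = 8

# Constants copied from the TernaryMAC model; values are at the 40nm reference.
reference_area = 5e-12 * accum_n_bits  # m^2
reference_leak_power = 1e-3 * accum_n_bits  # W
reference_mac_energy = 0.002e-12 * (accum_n_bits + 0.25)  # J

print(f"Reference area: {reference_area:.2e} m^2")
print(f"Reference leak power: {reference_leak_power:.2e} W")
print(f"Reference mac energy: {reference_mac_energy:.2e} J")
```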

Scaling by Number of Bits

Some actions may depend on the number of bits being accessed. For example, you may want to charge for the energy and latency per bit of a DRAM read. To do this, you can use the bits_per_action argument of the action() decorator. This argument takes a string naming the model attribute that holds the number of bits handled by one action. Callers then pass bits_per_action when invoking the action, and the cost is scaled accordingly. In this example, the DRAM yields width bits per read, so energy and latency are scaled by bits_per_action / width.



class LPDDR4(ComponentModel):
    """LPDDR4 DRAM energy model."""

    component_name = ["DRAM", "dram"]
    priority = 0.3

    def __init__(self):
        super().__init__(area=100e-3, leak_power=1e-3)
        self.width = 1

    # bits_per_action scales energy and latency by N and throughput by 1/N.
    @action(bits_per_action="width")
    def read(self) -> ActionCost:
        """
        Returns the cost of one read.

        Parameters
        ----------
        bits_per_action : int
            The number of bits to read.

        Returns
        -------
        ActionCost
            The cost of this action.
        """
        return ActionCost(energy=8e-12, throughput=float("inf"), latency=0)


lpddr4 = LPDDR4()
r1 = lpddr4.read(bits_per_action=1)
r50 = lpddr4.read(bits_per_action=50)
print(f"Read energy for one bit: {r1.energy}")
print(f"Read energy for fifty bits: {r50.energy}")
print(f"Read throughput for one bit: {r1.throughput}")
print(f"Read throughput for fifty bits: {r50.throughput}")
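The arithmetic behind bits_per_action can be checked by hand. The sketch below applies the rule stated above (energy and latency multiplied by bits_per_action / width, throughput divided by the same factor) in plain Python. The helper scale_by_bits is a hypothetical name and does not exist in HWComponents; the second call uses made-up finite latency and throughput values so that all three quantities visibly change.

```python
def scale_by_bits(energy, latency, throughput, bits_per_action, width):
    """Apply the bits_per_action rule: N = bits_per_action / width."""
    n = bits_per_action / width
    return energy * n, latency * n, throughput / n


# The LPDDR4 example: width is 1 bit and one read costs 8e-12 J.
energy_50, latency_50, throughput_50 = scale_by_bits(
    8e-12, 0.0, float("inf"), bits_per_action=50, width=1
)
print(f"Fifty bits: {energy_50:.2e} J, {latency_50} s, {throughput_50} actions/s")

# Made-up finite values: 64 bits through a 32-bit port takes two reads' worth of cost.
energy_64, latency_64, throughput_64 = scale_by_bits(
    8e-12, 1e-9, 1e9, bits_per_action=64, width=32
)
print(f"64 bits: {energy_64:.2e} J, {latency_64:.2e} s, {throughput_64:.2e} actions/s")
```

Under this rule, reading fifty bits from the one-bit-wide LPDDR4 model costs fifty times the single-read energy, while the zero latency and infinite throughput are unchanged.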

Compound Models

We can create compound models by combining multiple component models. Here, we’ll show the SmartBufferSRAM model from the hwcomponents-library package. This is an SRAM with an address generator that sequentially reads addresses in the SRAM.

We’ll use the following components:

  • An SRAM buffer

  • Two registers: one that holds the current address, and one that holds the increment value

  • An adder that adds the increment value to the current address

One new feature is used here: the subcomponents argument to the ComponentModel constructor, which registers subcomponents.

The area, energy, latency, and leak power of subcomponents will NOT be scaled by the component’s area_scale, energy_scale, latency_scale, and leak_power_scale. If you want to scale the subcomponents, do one of the following: call self.scale with include_subcomponents=True; call the scale_area, scale_energy, scale_latency, or scale_leak_power methods with include_subcomponents=True; or call any of these on the subcomponents directly.


from hwcomponents_cacti import SRAM
from hwcomponents_library import AladdinAdder, AladdinRegister
from hwcomponents import ComponentModel, action, ActionCost
import math


class SmartBufferSRAM(ComponentModel):
    """
    An SRAM with an address generator that sequentially reads addresses in the SRAM.

    Parameters
    ----------
        tech_node: The technology node in meters.
        width: The width of the read and write ports in bits.
        depth: The number of entries in the SRAM, each `width` bits.
        n_rw_ports: The number of read/write ports.
        n_banks: The number of banks.
    """

    component_name = ["smart_buffer_sram", "smartbuffer_sram", "smartbuffersram"]
    priority = 0.3

    def __init__(
        self,
        tech_node: float,
        width: int,
        depth: int,
        n_rw_ports: int = 1,
        n_banks: int = 1,
    ):
        self.sram: SRAM = SRAM(
            tech_node=tech_node,
            size=width * depth,
            width=width,
            depth=depth,
            n_rw_ports=n_rw_ports,
            n_banks=n_banks,
        )
        self.address_bits = max(math.ceil(math.log2(depth)), 1)
        self.width = width

        self.address_reg = AladdinRegister(width=self.address_bits, tech_node=tech_node)
        self.delta_reg = AladdinRegister(width=self.address_bits, tech_node=tech_node)
        self.adder = AladdinAdder(width=self.address_bits, tech_node=tech_node)

        # If there are subcomponents, we can omit area and leak_power; their cost
        # accumulates from the subcomponents.
        super().__init__(
            subcomponents=[
                self.sram,
                self.address_reg,
                self.delta_reg,
                self.adder,
            ]
        )

    @action(bits_per_action="width")
    def read(self) -> ActionCost:
        """
        Returns the cost of a read operation.

        Parameters
        ----------
        bits_per_action: int
            The number of bits to read.

        Returns
        -------
        ActionCost
            The cost of this action.
        """
        # Subcomponent costs are combined automatically: energy and latency summed,
        # throughput min'd. Returning None is equivalent to zero per-component cost.
        self.sram.read(bits_per_action=self.width)
        self.address_reg.read()
        self.delta_reg.read()
        self.adder.add()

    @action(bits_per_action="width")
    def write(self) -> ActionCost:
        """
        Returns the cost of a write operation.

        Parameters
        ----------
        bits_per_action: int
            The number of bits to write.

        Returns
        -------
        ActionCost
            The cost of this action.
        """
        self.sram.write(bits_per_action=self.width)
        self.address_reg.write()
        self.delta_reg.read()
        self.adder.add()


smartbuffer_sram = SmartBufferSRAM(
    tech_node=16e-9,
    width=32,
    depth=1024,
    n_rw_ports=1,
    n_banks=1,
)

r = smartbuffer_sram.read(bits_per_action=32)
w = smartbuffer_sram.write(bits_per_action=32)
print(f"Read energy: {r.energy} J, throughput: {r.throughput} actions/s")
print(f"Write energy: {w.energy} J, throughput: {w.throughput} actions/s")
print(f"Area: {smartbuffer_sram.area} m^2")
print(f"Leak power: {smartbuffer_sram.leak_power} W")

The latency of subcomponents is generally summed. However, if the subcomponents are pipelined for a given action, then the pipelined_subcomponents argument to the action() decorator should be set to True. This will cause the latency of the action to be the max of the latency returned and all subcomponent latencies.
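The difference between the two latency rules can be seen with a few numbers. The sketch below is plain Python and does not call HWComponents. The function combine_latency and its pipelined flag are hypothetical names that mirror the behavior described above, and the latency values are made up.

```python
def combine_latency(own_latency, subcomponent_latencies, pipelined=False):
    """Combine an action's own latency with its subcomponents' latencies."""
    if pipelined:
        # Pipelined: the slowest stage sets the latency.
        return max([own_latency, *subcomponent_latencies])
    # Default: latencies add up.
    return own_latency + sum(subcomponent_latencies)


# Made-up latencies in seconds for an SRAM, a register, and an adder.
sub_latencies = [2e-9, 1e-9, 0.5e-9]

summed = combine_latency(0.0, sub_latencies)
pipelined = combine_latency(0.0, sub_latencies, pipelined=True)
print(f"Summed: {summed:.2e} s, pipelined: {pipelined:.2e} s")
```

With these values, the summed latency is the total of the three stages, while the pipelined latency equals that of the slowest stage alone.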

Installing Models and Making them Globally Visible

An example model is provided in the notebooks/model_example directory, which can be installed with the following command:

cd notebooks/model_example
pip3 install .

The README.md file in the notebooks/model_example directory contains information on how to make models installable. Keep the following in mind while you’re changing the model:

  • The model name should be prefixed with hwcomponents_. This allows HWComponents to find the model when it is installed.

  • The __init__.py file should import all Model classes that you’d like to be visible to HWComponents.

  • If you’re iterating on a model, you can use the pip3 install -e . command to install the model in editable mode. This allows you to make changes to the model without having to reinstall it.
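To check whether an installed package follows the naming convention, one option is to list the importable top-level modules whose names start with the prefix. The sketch below uses only the standard library. The helper find_prefixed_packages is a hypothetical name, and this is not necessarily how HWComponents itself discovers installed models; it only shows that a name prefix is enough to locate such packages.

```python
import pkgutil


def find_prefixed_packages(prefix: str = "hwcomponents_") -> list[str]:
    """List importable top-level modules whose names start with the prefix."""
    return sorted(
        {
            module.name
            for module in pkgutil.iter_modules()
            if module.name.startswith(prefix)
        }
    )


found = find_prefixed_packages()
print(found)
```

If the example model has been installed, its package name should appear in the printed list; an empty list means no package with that prefix is importable from the current environment.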