Accelerator Energy, Area, and Latency#

To calculate energy and latency, we first need to look at the number of actions incurred by each Component in the architecture.

Calculating Number of Actions from A Mapping#

Except for Compute components (whose number of compute actions, barring recomputation, depends only on the workload), the number of actions incurred by a Component depends on the component type, the workload, and the mapping.

For Memory and Toll components, the number of actions depends on the number of accesses to the component. They may be accessed in two ways:

read: The component is read by a lower-level component, or output values are read from it and sent up to a higher-level component.
write: The component is written by a lower-level component, or input values are written to it from a higher-level component.

For each tensor, the number of actions incurred by accesses is equal to the number of values accessed times the bits per value of the tensor (determined by the workload), divided by the bits_per_action attribute. For example, if 1024 values are accessed with a bits per value of 16 bits and bits_per_action is 32, then 1024 * 16 / 32 = 512 actions are incurred.
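The formula above can be sketched in a few lines of Python. The function name actions_for_accesses and its parameter names are hypothetical and chosen only for illustration; they are not part of the library. The text does not say whether fractional action counts are rounded, so this sketch leaves them fractional.

```python
def actions_for_accesses(values_accessed, bits_per_value, bits_per_action=1):
    """Hypothetical helper: actions incurred by accesses to one tensor.

    actions = values_accessed * bits_per_value / bits_per_action
    Rounding behavior is not specified in the text, so none is applied here.
    """
    if bits_per_action <= 0:
        raise ValueError("bits_per_action must be positive")
    return values_accessed * bits_per_value / bits_per_action


# The worked example from the text: 1024 values, 16 bits each, 32 bits per action.
example_actions = actions_for_accesses(1024, 16, bits_per_action=32)

# With bits_per_action left at 1, the count is simply the number of bits accessed.
default_actions = actions_for_accesses(1024, 16)

print(example_actions, default_actions)
```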

Read+Modify+Writes (RMWs) to a component are counted as a read and a write. The first read of output data is skipped because the value has not been written yet.
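As a rough illustration of this rule, the sketch below counts values read and written for an output tensor that is updated several times. It assumes every output value receives the same number of updates and that exactly one read per value (the first) is skipped; the helper name rmw_access_counts is hypothetical and not part of the library.

```python
from typing import Tuple


def rmw_access_counts(num_output_values: int, updates_per_value: int) -> Tuple[int, int]:
    """Hypothetical helper: (values_read, values_written) for RMW updates.

    Assumes each output value is updated `updates_per_value` times. Every
    update writes the value; every update except the first also reads it,
    because nothing has been written before the first update.
    """
    if updates_per_value < 1:
        raise ValueError("each output value must be updated at least once")
    values_written = num_output_values * updates_per_value
    values_read = num_output_values * (updates_per_value - 1)
    return values_read, values_written


# 64 output values, each accumulated into 4 times.
values_read, values_written = rmw_access_counts(64, 4)
print(values_read, values_written)
```

These value counts would then be converted to actions using the bits per value and bits_per_action, as described above.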

By default, the bits_per_action attribute is set to 1, meaning that memory accesses are counted in terms of bits accessed unless bits_per_action is set to a different value.

Calculating Latency from a Pmapping#

The total latency of a component, defined in the class’s total_latency field, is a Python expression that is evaluated using the component’s actions.

The total_latency field is an expression representing the total latency of this component in seconds. It is used to calculate the latency of a given Einsum. The following special variables are available:

- min: The minimum value of all arguments to the expression.
- max: The maximum value of all arguments to the expression.
- sum: The sum of all arguments to the expression.
- X_actions: The number of times action X is performed. For example, read_actions is the number of times the read action is performed.
- X_latency: The total latency of all actions of type X. For example, read_latency is the total latency of all read actions. It is equal to the per-read latency multiplied by the number of read actions.
- action2latency: A dictionary mapping action names to their latency.

Additionally, all component attributes are available as variables, along with all other functions generally available in parsing. Note that this expression is evaluated after other component attributes are evaluated. For example, the following expression calculates latency assuming that each read or write action takes 1 ns: 1e-9 * (read_actions + write_actions).
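To make the variable names concrete, here is a minimal sketch of how such an expression could be evaluated. It is not the library's evaluator: the function evaluate_total_latency and its arguments are hypothetical, Python's built-in min, max, and sum stand in for the special variables of the same names, and the action counts and per-action latencies are made-up inputs.

```python
def evaluate_total_latency(expression, action_counts, action2latency, attributes=None):
    """Hypothetical stand-in for evaluating a total_latency expression.

    Builds a namespace with X_actions, X_latency, action2latency, and any
    component attributes, then evaluates the expression against it.
    """
    namespace = {"min": min, "max": max, "sum": sum}
    namespace.update(attributes or {})
    namespace["action2latency"] = dict(action2latency)
    for action, count in action_counts.items():
        namespace[f"{action}_actions"] = count
        # Total latency of an action type = per-action latency * action count.
        namespace[f"{action}_latency"] = count * action2latency.get(action, 0.0)
    # Builtins are emptied so only the names defined above are visible.
    return eval(expression, {"__builtins__": {}}, namespace)


counts = {"read": 1000, "write": 500}          # made-up action counts
per_action = {"read": 1e-9, "write": 1e-9}     # assumed 1 ns per action

# The expression from the text: reads and writes happen back to back.
serial = evaluate_total_latency("1e-9 * (read_actions + write_actions)", counts, per_action)

# An alternative expression: reads and writes fully overlap.
overlapped = evaluate_total_latency("max(read_latency, write_latency)", counts, per_action)

print(serial, overlapped)
```

The second expression shows why both X_actions and X_latency are useful: a component whose reads and writes proceed in parallel can take the maximum of the per-type latencies instead of their sum.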

Calculating Area and Leak Power#

Once the Component Energy and Area step has completed, area is available through the per_component_total_area and total_area attributes. Similarly, leak power is available through the per_component_total_leak_power and total_leak_power attributes.
