Accelerator Energy, Area, and Latency
To calculate energy and latency, we first need to look at the number of actions incurred
by each Component in the architecture.
Calculating Number of Actions from a Mapping
Compute components are the exception: barring recomputation, their number of
compute actions depends only on the workload. For most other Components, the
number of actions incurred depends on the component type, the workload, and the
mapping.
For Memory and
Toll components, the number of actions
depends on the number of accesses to the component. They may be accessed in two ways:
- read: The component is read from a lower-level component, or output values are read up to a higher-level component.
- write: The component is written to a lower-level component, or input values are written from a higher-level component.
The number of actions incurred by accesses to each tensor is equal to the number of
values accessed times the bits per value of the tensor (determined by the workload),
divided by the bits_per_action
attribute. For example, if 1024 values are accessed at 16 bits per value
and bits_per_action is
32, then 1024 * 16 / 32 = 512 actions are incurred.
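The counting rule above can be sketched in a few lines of Python. The function name count_actions and its parameter names are hypothetical, chosen only for illustration; they are not part of the library's API.

```python
def count_actions(values_accessed, bits_per_value, bits_per_action=1):
    """Return the number of actions for one tensor's accesses to a component.

    Hypothetical helper (not a library function). It applies the rule from
    the text: values accessed * bits per value / bits_per_action.
    """
    if bits_per_action <= 0:
        raise ValueError("bits_per_action must be positive")
    return values_accessed * bits_per_value / bits_per_action


# The example from the text: 1024 values, 16 bits each, 32 bits per action.
actions = count_actions(values_accessed=1024, bits_per_value=16, bits_per_action=32)
```

With these inputs the result matches the 512 actions worked out in the text.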
Read+Modify+Writes (RMWs) to a component are counted as a read and a write. The first read of output data is skipped because the value has not been written yet.
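As a rough illustration of the RMW rule, the sketch below counts value-level reads and writes for an output tensor. It assumes every output value receives the same number of updates at the component; the function name rmw_access_counts and this uniform-update assumption are illustrative and do not come from the library.

```python
def rmw_access_counts(num_output_values, updates_per_value):
    """Count value-level reads and writes for read-modify-write updates.

    Hypothetical helper. Assumes each output value is updated
    `updates_per_value` times. Every update performs a write, and every
    update except the first performs a read, because the first update has
    no previously written value to read.
    """
    writes = num_output_values * updates_per_value
    reads = num_output_values * max(updates_per_value - 1, 0)
    return reads, writes


# 4 output values, each accumulated into 3 times.
reads, writes = rmw_access_counts(num_output_values=4, updates_per_value=3)
```

These are counts of values accessed; converting them to actions still requires multiplying by the bits per value and dividing by bits_per_action.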
By default, the bits_per_action
attribute is set to 1, meaning that memory accesses are counted in terms of bits
accessed unless bits_per_action
is set to a different value.
Calculating Latency from a Pmapping
The total latency of a component, defined in the class’s
total_latency field, is a Python
expression that is evaluated using the component’s actions.
The total_latency field is
an expression representing the total latency of this component in seconds. It is used to calculate the latency of a given Einsum. The following special variables are available:
- min: The minimum value of all arguments to the expression.
- max: The maximum value of all arguments to the expression.
- sum: The sum of all arguments to the expression.
- X_actions: The number of times action X is performed. For example, read_actions is the number of times the read action is performed.
- X_latency: The total latency of all actions of type X. For example, read_latency is the total latency of all read actions. It is equal to the per-read latency multiplied by the number of read actions.
- action2latency: A dictionary of action names to their latency.
Additionally, all component attributes are available as variables, along with all other functions generally available in parsing. Note that this expression is evaluated after other component attributes are evaluated.
For example, the following expression calculates latency assuming that each read or write action takes 1 ns: 1e-9 * (read_actions + write_actions).
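To make the variable names concrete, here is a minimal sketch of how such an expression could be evaluated. It is not the library's actual evaluator: the function evaluate_total_latency, its parameters, and the variadic form of sum are assumptions made for illustration.

```python
def evaluate_total_latency(expression, action_counts, per_action_latency, attributes=None):
    """Evaluate a latency expression against a namespace of documented names.

    Hypothetical helper, not the library's implementation.
    action_counts:      action name -> number of times it is performed
    per_action_latency: action name -> latency of one action, in seconds
    attributes:         optional extra component attributes exposed as variables
    """
    namespace = {
        "min": min,
        "max": max,
        # Assumption: sum accepts its arguments directly, e.g. sum(a, b, c).
        "sum": lambda *args: sum(args),
        "action2latency": dict(per_action_latency),
    }
    for name, count in action_counts.items():
        namespace[f"{name}_actions"] = count
        namespace[f"{name}_latency"] = count * per_action_latency.get(name, 0.0)
    namespace.update(attributes or {})
    # Builtins are emptied so the expression only sees the names above.
    return eval(expression, {"__builtins__": {}}, namespace)


counts = {"read": 512, "write": 256}
latencies = {"read": 1e-9, "write": 1e-9}

# The expression from the text: reads and writes each take 1 ns, serialized.
serial = evaluate_total_latency("1e-9 * (read_actions + write_actions)", counts, latencies)

# An alternative expression where reads and writes overlap fully.
overlapped = evaluate_total_latency("max(read_latency, write_latency)", counts, latencies)
```

The first expression models reads and writes happening one after another, while the max form models them proceeding in parallel, so the slower of the two sets the latency.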
Calculating Area and Leak Power
After Component Energy and Area is completed, we can get area with the
per_component_total_area and
total_area attributes. Similarly, we can get
leak power with the
per_component_total_leak_power and
total_leak_power attributes.