December 16, 2018

Maze Router (Lee's Algorithm)

In this post, let's talk about Maze Routing Algorithm which is a manifestation of Breadth First Search (BFS) Algorithm to find the shortest path between two nodes in a grid.

A crude version of this algorithm is also known as Lee's Algorithm. I will discuss Lee's Algorithm, and few improvements to that to improve the run time and memory.

Here's the problem statement. You need to connect the node S (source) and the node T (target or the destination) with the shortest possible path. These nodes are shown in red. The grids in blue represent a routing blockage, meaning you cannot route over these grids. You'll need to find a way around these to reach the destination node.

Problem Statement: Maze Router (Lee's Algorithm)
VLSI routes are laid orthogonal in X and Y direction. Diagonal (also known as X-routing) is usually forbidden. Let's say you need to start out from node S, you have 4 possible directions in which you can proceed:
4 possible directions from a given node
The number 1 represents the distance traveled from the source node. Once you have traveled a distance 1, here is how your grid looks like:

Grid after traveling a distance of 1 unit
Similarly, after traveling a distance of 2 units, the grid is shown below. 

Grid after 2 iterations of Lee's algorithm

Now you've hit a wall, and it will be apparent that you cannot hop over the wall (or the blockage) from the next figure. After multiple iterations of the Lee's algorithm, the grid would look something this:
Grid after 8 iterations of Lee's algorithm
Continue doing this till you hit the target or the destination node.
Grid after you hit the target node
Now you need to backtrace from the target to the source following successive lower integers to find the shortest path. Note that you may have many possible shortest choices, but all of them are guaranteed to be the shortest. Usually, there's a cost associated with turns (vias in a physical context), so practically, you may assign a weight or parameter to minimize the number of turns to choose among more than 1 possible shortest paths.
Back-tracing to find the shortest path
This embodiment of the Lee's algorithm has high complexity, especially if the grid size is higher than the one shown in the example above. Notice how much wasteful computation we had to perform over to the right. This can be minimized if we initiate the same computation from both the target and the source, and back-trace to the target and the source respectively once the two wavefronts (the one in green and one in yellow) intersect. This results in far less time complexity and much less wasteful computations.

Modification to the Lee's algorithm to start computation from both target and the source

One another possible improvement to the above algorithm is the memory required to save the distance numbers for each node. Imagine a 10x10 grid. The worst distance could be 100, and you'd require 7 bits to store numbers up to 100. That means a worst space complexity of 700 bits for 10x10 grid. For 20x20 grid, worst distance could be 400, requiring 9 bits per box and a total space complexity of 3600 bits. In order to reduce the complexity, it's possible to go only up to 3 while counting, and then counting down to 1, and so on.. Back-tracing is slightly more complicated, but it saves you a ton of space!






April 24, 2018

False Path v/s Case Analysis v/s Disable Timing

Often people have asked me the difference between set_false_path, set_case_analysis and set_disable_timing. While the difference between these three is quite easy, it's the implications that leave many designers stumped.

Let me take a shot at explaining the difference.

1. FALSE PATH: All the timing paths which designers know won't be exercised on the fly, and they don't really need to meet any timing constraints on that path can be marked as false paths.
Tools would compute delays on all arcs on the false-path, would try to meet slopes/max-fanout/max-capacitance targets for all nodes along the path, but these paths would never surface up as timing (setup and hold) violations. However, if designers are too concerned about meeting slope and max cap targets, they usually prefer to mark such paths as set_multicycle_path instead.

Some examples of false path:

Consider the circuit above. The select line of the two multiplexers is complement of each other. STA tool, however, doesn't understand this logic and would treat all nodes as X (either 0 or 1). In practice, there can never be a timing path between 

C -> E -> G
D -> F -> G

And these can be marked as false paths.

2. CASE ANALYSIS: Using set_case_analysis, any node can be constrained to a boolean logic value of 1 or 0. All case values are evaluated and propagated through the design. For example, if one input of an AND gate is 0, 0 being the controlling value, the output of AND gate would also be 0 and this 0 is propagated downstream. The timing arcs for set_case_analysis are not evaluated and they never show up in the timing reports. However, PnR tooks would still fix max transition, max capacitance and max-fanout violations on these nets/pins.

  • Some latest tool versions also support a case value of static which means that the node will always be static (never toggle), and this is used to reduce the pessimism which doing noise analysis.
  • Case analysis is also particularly useful for DFT modes where you would want to set a few configuration registers and drive the chip into a particular DFT mode: like atspeed, shift or stuck-at mode. This acts as an additional level of verification because you'd expect to see only scan chains in the shift mode with scan enable being 1. You'd expect to see functional paths in the atspeed mode with scan enable being X, and you'd expect to see only paths ending at functional register inputs in the stuck-at mode with scan enable being 0.

3. DISABLE TIMING: This disables a particular timing arc, and that timing arc or any timing path through the disabled timing arc is not computed. This tends to be a bit disruptive as compared to false paths or case analysis, but in some cases this is indispensable and the easiest way to achieve the intent. For example if you have a MUX based divider which receives the clock signal at the select line of the multiplexer, and two functional enables at the multiplexer inputs, STA tool would try to propagate the clock to the output of the MUX via the MUX select line to the output. But for a MUX, a select line only controls what gets propagated to the output. In practice, there's no arc between select and output and should be disabled.





Both case analysis and disable timing result in fewer timing paths to be analyzed. False path still tries to fix the design rule (max cap, max transition and max fanout) violations.

April 08, 2018

Leakage Power: Input Vector Dependence

Leakage Power of a standard cell depends on various transistors parameters like the channel length, threshold voltage, substrate or the body bias voltage etc. Apart from these physical parameters, leakage power also depends upon the input vector applied.

Consider a 2-input NAND gate and a 3-input NAND gate. Can you arrange the input combinations: (AB = 00, 01, 10, 11 for a 2-input NAND gate), and (ABC = 000, 001, 010, 011, 100, 101, 110, 111 for a 3-input NAND gate) in increasing order of leakage current, with a word of two about the logical reasoning behind it?

Note that the order of transistors in a stack matters here.

2-input NAND and 3-input NAND Gates