August 22, 2012

Routing: Basics

The routing process determines the precise paths for nets on the chip layout to interconnect the pins on the circuit blocks. Before discussing it further, it would be prudent to see where routing actually fits in the Physical Design flow.

After Synthesis (the conversion of RTL to gate-level netlist), the blocks and the instances are Placed, which, to some extent, is governed by the Floorplan. After Placement, Clock Tree is synthesized followed by Routing of the signal nets. The following flow chart summarizes the Physical Design Flow.



Objectives of the Routing Process:
  • To determine the necessary wiring, e.g., net topologies and specific routing segments, to connect the cells while respecting constraints like design rules.
  • To optimize routing objectives, e.g., minimizing total wirelength and maximizing timing slack.

Routing is further divided into many subtypes:
  • Global Routing: It defines the routing regions and generates a tentative route for each net. Each net is assigned to a set of routing regions. However, it does not specify the actual layout of wires, and it is not sensitive to design rule violations (DRVs).
  • Detailed Routing: For each routing region (defined during Global Routing), each net passing through that region is assigned to particular routing tracks. The actual layout of wires is specified. It also tries to fix all DRVs in the design.
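Routing a single net through a region is classically done with maze routing. Here is a minimal Python sketch of the Lee algorithm (a breadth-first wave expansion) on a small grid — an illustration of the idea only; the grid and function names are my own, not from any tool:

```python
from collections import deque

def lee_route(grid, src, dst):
    """Shortest-path route for one net on a routing grid.

    grid: 2D list where 0 = free track and 1 = blocked (obstacle or
    another net). src/dst are (row, col) pin locations. Returns the
    path as a list of cells, or None if the net cannot be routed here.
    """
    rows, cols = len(grid), len(grid[0])
    prev = {src: None}                     # wave-expansion bookkeeping
    q = deque([src])
    while q:
        cell = q.popleft()
        if cell == dst:                    # retrace the wave to the source
            path = []
            while cell is not None:
                path.append(cell)
                cell = prev[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols \
                    and grid[nr][nc] == 0 and nxt not in prev:
                prev[nxt] = cell
                q.append(nxt)
    return None
```

Because the wave expands one grid step at a time, the first time the target is reached the path is guaranteed shortest — the property the Lee router is known for.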

August 19, 2012

Puzzle: Divide by 3 Counter with 50% DC

It is pretty simple to make a clock divider with an odd division ratio (let's say 3 or 5). But the divided output doesn't have a 50% duty cycle, and some modifications are essential to achieve that. You might argue, why so much fuss about 50%? To give you an insight into it, consider the following divided waveform with 66% DC:

As you can note from the above waveforms: 
  • NEG-TO-POS arc (i.e. any path launching from a negative-edge triggered flop and being captured at a positive-edge triggered flop) would have the least time to meet the setup requirement and hence can be critical.
  • On the other hand, POS-TO-POS and NEG-TO-NEG arcs are much more relaxed.
The same would be true for a divider with 33% duty cycle as well. So, it is preferable to use a divided clock with a 50% duty cycle.
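To put numbers on the bullets above (T is the input clock period; the value below is an assumed example):

```python
T = 10.0                         # input clock period in ns (assumed value)
period_div = 3 * T               # divide-by-3 output clock period

# 66% duty cycle: the divided clock is high for 2T and low for T.
# The NEG-TO-POS arc launches at the falling edge and is captured at the
# next rising edge, so the data gets only the low phase of the divided clock.
neg_to_pos_66 = T

# 50% duty cycle: both half-period arcs get an equal 1.5T
neg_to_pos_50 = period_div / 2

print(neg_to_pos_66, neg_to_pos_50)   # 10.0 15.0
```

With 66% DC one arc gets only T while the full-cycle arcs get 3T; with 50% DC every half-cycle arc gets an equal 1.5T, which is why the balanced duty cycle is preferred.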

Can you design such a circuit which takes a clock signal of frequency f, and outputs another clock signal of frequency f/3 with 50% duty cycle?

Power Gating

Power Gating is yet another effective technique employed in Low Power Designs. Unlike Clock Gating, which saves dynamic power, Power Gating saves leakage power. As we move to deep sub-micron technology nodes (i.e. below 90nm), leakage power dissipation starts to dominate dynamic power dissipation. Power gating is therefore employed very frequently in modern SoCs. We shall talk about the structure of a power gate in this post.

Consider any CMOS digital logic circuit consisting of Pull-Up Network (made from PMOS transistors) and Pull-Down Network (made from NMOS transistors), as shown in the figure. 

At any point of time, if a direct path exists between the power supply (VDD) and the ground (GND), the circuit continues to dissipate leakage power. What is the possible workaround? One can gate the power and ground terminals from the circuit when it is not intended to be used. That's what is accomplished by a power gate! Let's take a look at the circuit.

  • During normal operation, SLEEP = 0. Both the PMOS and NMOS Sleep Transistors (in blue and green respectively) are ON, and we have a Virtual Power Rail and a Virtual Ground which ensure normal circuit operation.
  • However, during periods of low activity, SLEEP = 1. The Sleep Transistors turn OFF, the direct path from the power rail to ground is broken, and hence no leakage power is dissipated through the Pull-up and Pull-down networks.

Note that, during normal operation, the Sleep Transistors contribute some extra leakage power because they are still ON. Although the leakage due to these two transistors is extremely small compared to that of the Pull-up and Pull-down networks, they are nevertheless custom designed with a high Vt (threshold voltage) to reduce any excess leakage during the normal mode of operation.


August 17, 2012

Puzzle: Identify the Issue with Circuit Topology

With the symbols having their usual meaning, identify the issue with this circuit topology.

[Hint]: Think from the timing perspective, and not the functional perspective.

You may answer the following:
  • Issue with the topology.
  • The kind of timing violation in which the issue will manifest itself during timing analysis.
  • Possible modification(s) to solve the issue.

Clock Gating Integrated Cell

In the post Clock Gating, we discussed the need for clock gating for Low Power Design Implementation. Being the highest-frequency toggling signal, the clock contributes the most to the dynamic power consumption in the SoC, even when the flops it feeds are not changing their state. So, it is practical to gate the clock from reaching a set of registers, or maybe some block in a design, to save on dynamic power consumption.

You can relate it to the Standby mode in your PC. In standby mode, only a sub-system of your entire SoC is working. Hence, to save on power consumption, one can employ clock gating (or maybe some other power saving methods, which we will discuss later).

Instead of using a plain AND or OR gate for clock gating, which is vulnerable to a glitchy output, design engineers prefer to use the Clock Gating Integrated Cell (CGIC) to completely obviate the problem. Here's the circuit of a CGIC:


As evident from the above waveforms, if the enable EN of the CGIC is at logic-1, the CGIC passes on the clock at the output without any glitch. And if EN is at logic-0, the output is gated, i.e. there is no clock at the output, saving on the dynamic power consumption in the device.
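To see why the internal latch matters, here is a tiny cycle-based Python sketch (my own simplified model, not the exact CGIC schematic): a plain AND gate is compared against a CGIC modeled as a transparent-low latch on EN followed by an AND gate.

```python
def and_gate_gating(clk, en):
    # naive gating: ANDing EN directly with the clock
    return [c & e for c, e in zip(clk, en)]

def cgic_gating(clk, en):
    # CGIC model: EN passes through a latch that is transparent while CLK
    # is low and holds its value while CLK is high, then feeds an AND gate
    out, q = [], 0
    for c, e in zip(clk, en):
        if c == 0:
            q = e               # latch is transparent
        out.append(c & q)
    return out

clk = [0, 0, 0, 1, 1, 1] * 2     # two clock cycles, three samples per phase
en  = [0] * 4 + [1] * 8          # EN rises in the middle of a high phase

print(and_gate_gating(clk, en))  # runt pulse: [0,0,0,0,1,1,0,0,0,1,1,1]
print(cgic_gating(clk, en))      # clean:      [0,0,0,0,0,0,0,0,0,1,1,1]
```

The naive AND output goes high for only part of a high phase (a runt pulse, i.e. a glitch), while the latch defers the enable until the clock is low, so only full-width pulses come through.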

August 16, 2012

Passing Arguments to a User-Defined Proc in Tcl

Passing arguments to any procedure is a very common scripting style. In this post we will discuss two ways one can pass arguments to any procedure in Tcl.


The conventional way to pass arguments to a user-defined procedure (palindrome, in this case) in Tcl is:

proc palindrome { arg1 arg2 } {
<body of the proc using $arg1 and $arg2>
}

The above proc is then called like: palindrome <arg1> <arg2>

The order of the arguments needs to be maintained.
This approach works fine. However, when the number of arguments is large, one needs to remember the relevance of each argument and its position.

Here's a more versatile way of passing the arguments:

proc palindrome { args } {
    variable args_list {}                  ;# populated by parse_proc_args below
    set valid_switches [list "-description_arg1" "-description_arg2"]
    # Note: parse_proc_args, assoc and cdr are not built-in Tcl commands; they
    # are helper procs (provided by the tool shell or written by the user) that
    # parse the switch/value pairs in $args into the args_list association list.
    parse_proc_args $args $valid_switches palindrome

    set arg1 [cdr [assoc "description_arg1" $args_list]]
    set arg2 [cdr [assoc "description_arg2" $args_list]]
<body of the proc using $arg1 and $arg2>
}

The above proc is called like: palindrome -description_arg1 <arg1> -description_arg2 <arg2>


Here you can always associate the "description_arg1" keyword with arg1 and so on. Moreover, it is not necessary to call arg1 before arg2.
You can also use: palindrome -description_arg2 <arg2> -description_arg1 <arg1>
  All you gotta make sure is that you associate the arguments with the corresponding description switch!

July 28, 2012

Design for Testability: The Need for Modern VLSI Design

DFT is the acronym for Design for Testability. DFT is an important branch of VLSI design, and in crude terms, it involves putting a test structure on the chip itself to later assist in testing the device for various defects before shipping the part to the customer.

Have you ever wondered how the size of electronic devices keeps shrinking? Mobile phones used to be big and heavy with basic minimal features back in the 90s. But nowadays we have sleek phones, lighter in weight and with all sorts of features: camera, bluetooth, music player and, not to forget, faster processors. All that's possible because of the scaling of technology nodes. The technology node refers to the channel length of the transistors which constitute your device. Well, we are moving to reduced channel lengths; some companies are working on technology nodes as small as 18nm. The smaller the channel length, the more difficult it is for the foundries to manufacture, and the greater the chances of manufacturing faults.

Possible manufacturing faults are opens and shorts.
The figure shows two metal lines, one of which got "open" while the other got "shorted". As we move to lower technology nodes, not only is the device size shrinking, but we can also pack more transistors on the same chip, so density is increasing. Manufacturing faults have therefore become unavoidable. DFT techniques enable us to test for these (and other kinds of) faults.
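Test generation for such faults is usually done on a stuck-at model: a short to ground, for instance, behaves as a node stuck at 0, and a test vector is any input that makes the good and faulty circuits disagree. A toy Python sketch (the circuit and fault site are invented for illustration):

```python
from itertools import product

def good(a, b, c):
    # fault-free circuit: y = (a AND b) OR c
    return (a & b) | c

def faulty_and_sa0(a, b, c):
    # same circuit with the AND-gate output stuck at 0 (e.g. shorted to ground)
    return 0 | c

def find_test_vector(good_fn, faulty_fn):
    # a test vector is any input combination where the two circuits differ
    for v in product((0, 1), repeat=3):
        if good_fn(*v) != faulty_fn(*v):
            return v
    return None            # fault is undetectable

print(find_test_vector(good, faulty_and_sa0))  # (1, 1, 0)
```

Only a=b=1 with c=0 exposes this particular fault: any vector with c=1 masks it, since the OR gate forces the output high in both circuits.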

July 27, 2012

Electrostatic Discharge vs Electromigration

Electrostatic Discharge and Electromigration might sound similar, but they refer to different physical phenomena. I will try to explain the difference between the two.

Electrostatic Discharge (ESD) is the large current flow between two points when a large (usually momentary) potential difference is applied across them. In semiconductor terms, let's say that by some means a large potential is applied on the Gate of a MOS device; a large current then tends to flow through the gate, and this in turn may damage the silicon dioxide of the transistor. As you are aware, this silicon dioxide controls important parameters like the threshold voltage (Vt) of the transistor, so any physical damage would render the functionality of the entire device capricious.

To give you a general perspective: 
  • The semiconductor industry incurs losses worth millions of dollars just due to ESD and therefore while shipping the parts, each and every IC is packed with utmost care and insulated from the outside world. 
  • Also, while working in labs in research centers, universities, or companies, care is taken to prevent any excess potential from accumulating on any lab material. There's a separate ground for every device, which may be as small as a metallic needle. Even back in my college days, our professor used to admonish us for touching the pins of any IC with bare hands, because sufficient potential can accumulate on our body, especially our extremities.
Note that ESD is a one-time event. It can occur while shipping, when you begin to use the device, or while you are using it.

Electromigration (EM): Let's say a device operates over a long period of time, and there are certain regions in the device where the current density is pretty high. The electrons there have the propensity to displace the metal atoms, and this might create voids in certain regions and hillocks in others.

July 26, 2012

Sample Problem on Setup and Hold

In the post Timing: Basics, we discussed the basics of setup and hold times: why it is necessary to meet the setup and hold timing requirements, and how frequency affects setup but does not affect hold.

Let us understand the concept with an example:


I hope the above waveforms are self-explanatory.
Setup Slack in the above case (as inferred from the waveforms as well) is:

Setup Slack = Tclk - T(clk-2-q) - Tdata - T(su,FF2)

If this setup slack is positive, we say that the setup time constraint is met. Note that setup slack depends upon the clock period, and hence upon the frequency at which your design is clocked.

Let us consider hold timing:
Hold Slack = Tdata + T(clk-2-q) - T(ho,FF2)

As evident from the above equation, hold slack is independent of the frequency of the design.

Note:
  • Setup is a next-cycle check; we take the setup time T(su,FF2) of FF2 into account while finding the setup slack at the input pin of FF2.
  • Hold is a same-cycle check; we take the hold time T(ho,FF2) of FF2 into account while computing the hold slack at the input pin of FF2.
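Plugging some assumed numbers into the two slack expressions above (values in ns, chosen by me for illustration; they are not from the waveforms):

```python
# assumed example values, in ns
Tclk   = 10.0    # clock period
Tclk2q = 1.2     # clock-to-q delay of the launching flop
Tdata  = 6.0     # combinational (data path) delay
Tsu    = 0.5     # setup time of the capturing flop FF2
Tho    = 0.4     # hold time of the capturing flop FF2

setup_slack = Tclk - Tclk2q - Tdata - Tsu   # next-cycle check
hold_slack  = Tdata + Tclk2q - Tho          # same-cycle check

print(setup_slack, hold_slack)
```

Both slacks come out positive here (2.3 ns and 6.8 ns), so both checks are met for this assumed set of numbers.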
Try and grasp this example. I shall introduce the concept of clock skew next.

July 14, 2012

Clock Gating

The clock signal is the highest-frequency toggling signal in any SoC. As we discussed in the post Need for Low-Power Design Methodology, the capacitive load component of the dynamic power is directly proportional to the switching frequency of the devices. This implies that clock path cells contribute the most to the dynamic power consumption in the SoC.

Power consumption in the clock paths alone contributes more than 50% of the total dynamic power consumed within modern SoCs. Power being a very critical aspect of the design, one needs to make prudent efforts to reduce it. Clock Gating is one such method.

Let's try and build further on this perspective.
The clock feeds the CLOCK pins of all the flip-flops in the design. The clock tree itself comprises clock tree buffers, which are needed to maintain a sharp slew (numerically small) in the clock path. Refer to the post Clock Transition for details.


Consider the above figure. It is not necessary that the output of the flip-flop be switching at all times. Modern devices support various low-power modes in which only a certain part of your SoC is working. This may include some key features pertaining to security or some critical functional aspects of your device. Apart from this, there are some configuration registers in your device which need to be programmed either once or very seldom. So, let's say the above FF will not be switching states for a considerable period of time. If it is used the way it is, what's the problem? Power! The clock is switching incessantly. Clock tree buffers are switching states and hence consuming power. So are the FFs. Remember that an FF is itself made up of latches. So, despite the fact that the input and output of the FF are not switching, some part of the latch is switching and consuming power.

What could be done to alleviate the above problem? Clock Gating is one such solution. Here's how it'll help.


If you place an AND gate in the clock path, then, knowing that you don't need a certain part of your device to receive the clock, you can drive logic '0' on the ENABLE pin. This ensures that all the clock tree buffers and the sink pin of the FF are held at a constant value (0 in this case). Hence these cells would not contribute to dynamic power dissipation. However, they would still consume leakage power.

Similarly, you can place an OR gate and drive one of its inputs to logic 1. Again, you would save on the dynamic power.

However, a word of caution. The output of the AND gate feeding the entire clock path might be glitchy. See the following figure:

Solution: The output won't be glitchy if the enable signal changes only when the CLOCK signal is low. So, all you gotta make sure is that ENABLE is generated by a negative-edge triggered FF. This would ensure that the signal is changing after the fall edge of the CLOCK signal.

Similarly, while using an OR gate, a glitch can be propagated if the ENABLE signal changes while the CLOCK is low. Make sure that ENABLE is generated by a positive-edge triggered FF in order to avoid any glitch being passed on to the FFs.


Why would a glitch be detrimental anyway? The answer is:
A glitch constitutes an edge! An FF might sample data on it, because FFs are edge-triggered. Moreover, all FFs have a minimum pulse-width requirement (also called the pulse-width check), which needs to be fulfilled in order to ensure that they don't go into metastability. And if an unknown state, X, is propagated in a design, the entire functionality of the chip can go haywire!

Some terminologies: 
  • AND/NAND gate based clock gating is referred to as Active-High Clock Gating.
  • OR/NOR gate based clock gating is referred to as Active-Low Clock Gating.
NAND and NOR clock gates work similarly to AND and OR gates respectively.

So, Clock Gating is an efficient solution to save dynamic power consumption in the design. Modern SoCs have many IPs integrated together. Placing clock gates and enabling them in various possible combinations is what gives rise to the different low-power modes in the device.

July 10, 2012

Puzzle: Finite State Machine

I loved solving problems on Finite State Machines back in my college days. Recently, I came across a good problem and thought it would be expedient to share it with you as well!

Q. The ACME Company has recently received an order from a Mr. Wiley E. Coyote for their all-digital Perfectly Perplexing Padlock. The P3 has two buttons ("0" and "1") that when pressed cause the FSM controlling the lock to advance to a new state. In addition to advancing the FSM, each button press is encoded on the B signal (B=0 for button "0", B=1 for button "1"). The padlock unlocks when the FSM sets the UNLOCK output signal to 1, which it does whenever the last N button presses correspond to the N-digit combination.
  1. Unfortunately the design notes for the P3 are incomplete. Using the specification above and clues gleaned from the partially completed diagrams below, fill in the information that is missing from the state transition diagram and its accompanying truth table. When done:
    • Each state in the transition diagram should be assigned a 2-bit state name S1S0 (note that in this design the state name is not derived from the combination that opens the lock),
    • The arcs leaving each state should be mutually exclusive and collectively exhaustive,
    • The value for UNLOCK should be specified for each state, and
    • The truth table should be completed.
  2. What is the combination for the lock?


     
    Source: MIT OpenCourseWare

July 07, 2012

Timing: Basics

In a few earlier posts, we have already mentioned timing. It's time to discuss it formally.
Timing is a constraint that must be met so that the design functions the way it was meant to.

  • What will happen if the timing constraints are met?
    You can be pretty sure that the device will function correctly at the frequency that was intended.
  • What will happen if the timing constraints are not met?
    Device will not function correctly at the intended frequency. And it might or might not function at a slower frequency.
Pretty confusing? Don't worry. Read on.

Consider the following digital circuit: two rising-edge triggered flops, a and b, fed by a clock signal CLK, talking to each other. The output of flop a, after being processed by the combinatorial logic Comb, reaches the input of flop b.

How does the above circuit work? Consider the two waveforms which are the clock signals at flop a and b respectively. Flop a samples the input data IN at rising clock edge 1a and this data is captured by Flop b at the clock edge 2b. Similarly, data sampled and launched by the flop a at clock edge 2a is captured by flop b at 3b. 

As long as this launching and capturing relationship is maintained correctly, our timing constraint is also met and device would function perfectly fine! But the question: What actually is this timing constraint?

The data launched at edge 1a has to undergo the following delays before it reaches the input of flop b:
the clock-to-q delay of flop a, and the delay of the combinatorial logic Comb.
And it should reach the input of flop b at least some time before the edge 2b reaches the clock pin of flop b. This time is called the Setup Time.
Also, we have to make sure that the data launched by flop a at clock edge 1a is not captured by flop b at clock edge 1b (it needs to be captured at 2b). So, the data must reach flop b at least some time after clock edge 1b reaches flop b. This time is called the Hold Time.


Read the above two lines again. 
Same would be the relationship for other edges. Setup checks: 2a-3b; 3a-4b. Hold checks: 2a-2b; 3a-3b and so on.
Setup and Hold are the bread and butter of every backend design engineer. But why should the data reach some time before or after some clock edge? Where do these times come from? What exactly is the origin of setup and hold times? I do not mean any disrespect, but the answer to this question can puzzle even an experienced design engineer and I assure you that we will take this up in detail very soon.

For now, convince yourself that:
  • Setup is a next-cycle check, while hold is a same-cycle check.
  • Setup depends on the period (and hence the frequency) at which your flip-flops are clocked, while hold checks are frequency-independent.
A direct ramification of the above statement is that setup violations can be fixed by lowering the operating frequency of the design. But hold violations cannot be fixed that way! I shall explain the Origin of Setup and Hold times soon. Also, I would like to take up some examples that would corroborate the concepts that I explained in this post.
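A small sketch of this ramification, with assumed delay numbers (all in ns, chosen by me): stretching the clock period rescues setup, while the hold slack does not move.

```python
def setup_slack(Tclk, Tclk2q=1.0, Tcomb=5.0, Tsu=0.5):
    # setup is checked against the NEXT clock edge, so Tclk appears
    return Tclk - Tclk2q - Tcomb - Tsu

def hold_slack(Tclk2q=1.0, Tcomb=5.0, Tho=0.4):
    # hold is checked against the SAME clock edge, so Tclk does not appear
    return Tclk2q + Tcomb - Tho

# slowing the clock from 6 ns to 8 ns fixes the setup violation...
assert setup_slack(6.0) < 0 and setup_slack(8.0) > 0
# ...but leaves the hold slack untouched
assert abs(hold_slack() - 5.6) < 1e-9
```

The hold expression simply contains no Tclk term, which is the whole point: no amount of frequency reduction can repair a hold violation.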


July 06, 2012

Need for Low-Power Design Methodology

Low power is today's need in VLSI. Why? Well, ask yourself! You go to a gadget shop, looking for a new cell-phone. Apart from the price, what are the qualitative things you would be most concerned about?
  • Features including the speed of the processor.
  • Battery back-up
  • Operating System
A good operating system can make efficient use of the system's hardware resources, but it is driven more by the software applications that you wish to run. However, the first two are directly influenced by the design methodology and the technology node that go into designing your device.

You would love to buy a cell-phone with a faster processor, to have your applications run fast and your computations finish quicker. Also, you wouldn't want to charge your cell-phone every hour, or for that matter every day! This translates into a design challenge: your device must consume the least possible power.

Frequency and power go hand-in-hand. You cannot just go on increasing the frequency (assuming that timing is met!), without expecting any hit on power. 

Power, itself has many components. To just give you a glimpse, we'll talk about the components of power in brief.
Power dissipated has two components: Dynamic and Static.

 
Dynamic power constitutes the component of total power which comes into the picture when the devices (the individual transistors) switch their values from 0 to 1 or vice-versa. Dynamic power itself has two components:
  • Capacitive Load Power: Depends on the output load of each transistor switching states. It is given by P = α·C_L·V_DD²·f, where α is the switching activity, C_L the load capacitance, V_DD the supply voltage and f the clock frequency.
  • Short Circuit Power: Depends on the input transition.



Static power is the component which is dissipated when the device is not switching, i.e. it is in standby mode, and it mainly consists of leakage power.

We talked about the fact that Power and speed of the device go hand-in-hand. It is pretty much evident from the above equation. As you tend to increase the frequency of your design (again emphasizing that timing must be met!), the switching rate of the devices would increase and hence capacitive load component of the dynamic power would increase.
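As a numeric sketch, here is the capacitive-load component P = α·C·V²·f evaluated for some assumed values (the numbers are mine, purely for illustration):

```python
def dynamic_power(alpha, c_load, vdd, freq):
    # capacitive-load component of dynamic power: alpha * C * V^2 * f
    return alpha * c_load * vdd ** 2 * freq

# assumed: 10% switching activity, 1 nF total switched cap, 1.2 V, 500 MHz
p1 = dynamic_power(0.1, 1e-9, 1.2, 500e6)
# doubling the frequency doubles the capacitive dynamic power
p2 = dynamic_power(0.1, 1e-9, 1.2, 1e9)

print(p1, p2)
```

The linear dependence on f is exactly why pushing the clock faster, even with timing met, costs power.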

One workaround to reduce power is to reduce the supply voltage at which your devices work. But this, in turn, reduces the signal swing available for the devices to cross the threshold voltage (Vt) and hence engenders myriad design challenges.

Before I conclude this post, I would like to make one last point. Device complexity is increasing every day and device size is shrinking. This ensures that your latest cell-phone is sleek in its looks, but again, the hit is on power!


 The above table shows the trend of ever-increasing power dissipation with scaling down of technology nodes. This has forced the designers to come up with innovative design solutions to deliver the best to you.
In upcoming posts, we will discuss these design-for-low-power solutions in detail.



References:
[1] Low Power Methodology Manual: For System on Chip Design by Michael Keating, David Flynn, Robert Aitken, Alan Gibbons and Kaijian Shi.

Puzzle: Clock Transition

In the post Factors Affecting Delays of Standard Cells, we talked about the clock transition and the way it impacts setup and hold times.

While building our clock tree we ensure that clock transition is as low as possible. 

If clock transition or the slew at clock tree buffers were bad, apart from the penalty on hold time, what other deteriorating impact would it have on the design?

July 03, 2012

Factors Affecting Delays of Standard Cells

In this post, we will talk about the factors that affect the delays of standard cells. Before starting the discussion, it would be prudent to explain what is meant by timing arcs.

Timing Arcs: A timing arc represents the direction of signal flow, usually from an input to an output. Arcs may be combinational or sequential. Combinational arcs represent the signal flow in combinatorial cells like AND, NAND and OR gates. Sequential arcs represent the signal flow in flip-flops, and they usually have a control signal like CLOCK associated with them. A third type, closely related to sequential arcs, comprises the setup and hold arcs. They represent the setup and hold requirements and, in general, do not represent any signal flow.


The information about these timing arcs comes from the timing library (.lib) files.


Let's turn our attention back to delays.

Consider an AND gate. As discussed above, A to Z is a combinational timing arc. The delay of this arc is picked up from the .lib, which is read by the timing tools, and the delay then shows up in the timing reports.

This delay depends primarily on two factors:
1. The input slew or the transition at A pin.
2. The output load or the capacitance at the Z pin.

Note that the output load is the sum total of the input capacitance of the cells connected to the node Z and also the net capacitance of all such nodes.

Output Load = Input Cap of all cells at the fan-out of Z + Total net capacitance of the nets connected to node Z.


Delay increases with both the input transition and the output load.
1. The larger the output cap, the more time the cell requires to charge/discharge that capacitance, and hence the larger the delay.
2. The larger the input transition, the more time the cell requires to change its output after processing the input value.

You would note that the explanation behind delays just boils down to the charging/discharging of capacitors! Once you befriend them, you will be able to deduce half the concepts intuitively.
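In the .lib, this two-dimensional dependence is typically stored as a lookup table indexed by input slew and output load, and the tools interpolate between the characterized points. A simplified Python sketch (the table values are invented for illustration):

```python
# invented NLDM-style table: rows = input slew (ns), cols = output load (pF)
slews = [0.1, 0.5]
loads = [0.01, 0.1]
delay_table = [[0.05, 0.12],    # delays at slew 0.1, in ns
               [0.09, 0.20]]    # delays at slew 0.5, in ns

def cell_delay(slew, load):
    # bilinear interpolation between the four surrounding table entries
    ts = (slew - slews[0]) / (slews[1] - slews[0])
    tl = (load - loads[0]) / (loads[1] - loads[0])
    d0 = delay_table[0][0] * (1 - tl) + delay_table[0][1] * tl
    d1 = delay_table[1][0] * (1 - tl) + delay_table[1][1] * tl
    return d0 * (1 - ts) + d1 * ts
```

Note how the interpolated delay grows with both indices, which matches points 1 and 2 above.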

We are now set to discuss the delays of timing arcs of a flip-flop.

1. Clock-to-Q delay: As expected, it depends upon the clock transition and the load at the output Q. It may sound surprising, but clock-to-q delay does not depend upon the transition at the D input.
2. Setup and Hold times: These depend upon the transition at the clock pin and the transition at the D pin. They do not depend on the output load.

Some surprises might be yet to unfold. Read on.
1. Clock-to-q delay increases with the clock transition and the output cap at Q.
2. Setup time increases with the input transition at D and decreases with the clock transition. Recall the definition of setup time: the longer the clock transition, the more time you are allowing for the input at D to settle before the clock edge.
3. Hold time decreases with the input transition at D and increases with the clock transition. Again, recall the definition of hold time: the longer the clock transition, the greater the possibility that the D input might change within the hold window after the clock edge.
I hope I was able to explain this stuff clearly. In case of any doubts, please feel free to post them here.


July 01, 2012

Reading from and writing to a file in Tcl

File handling operations like reading from and writing to a file are among the most commonly used operations in any programming language.

Writing to a file in Tcl is straightforward. 
In this post, we will discuss two ways to read a file in Tcl:


  1. set in_file [open palindrome_in.csv r]          ## Opens the file palindrome_in.csv in read mode
    set out_file [open palindrome_out.csv w]    ## Opens the file palindrome_out.csv in write mode
    set data [read $in_file]                                   ## "data" now has contents of the input file
    set lines [split $data "\n"]                             ## "lines" now contain collection of lines
    foreach line $lines {                                     ## Reading each "line" from collection of "lines"
    <body of your proc>                                    ## Body of the proc
    }
    puts $out_file "xyz"                                       ## Printing the desired output in palindrome_out.csv
    close $out_file                                              ## Closing the output file
    close $in_file                                                 ## Closing the input file


    Note: Closing both the input and output files is important. If not done, your Tcl shell might return a "too many open files" error. Or, even worse, the output in the output file might get terminated prematurely.
  2. set in_file [open palindrome_in.csv r]            ## Opens the file palindrome_in.csv in read mode
    set out_file [open palindrome_out.csv w]      ## Opens the file palindrome_out.csv in write mode
    while { [gets $in_file line] >= 0 } {                   ## Note that here "line" is not a user-defined variable
    <body of your proc>                                     ## Body of the proc
    }
    puts $out_file "xyz"                                       ## Printing the desired output in palindrome_out.csv
    close $out_file                                              ## Closing the output file
    close $in_file                                                 ## Closing the input file
    What's the difference between the two? Well, not much, if the size of your input file is small. However, if it is a big file (for example, SDF files can be as big as 1 GB!), you might prefer the second method.

    In the first method, the user-defined variable lines holds all the lines of the file as one list. If the file is too big, this list would be too big, and one variable has to hold this fairly big data for as long as your script is working. This might lead to an out-of-memory error.
    This problem is alleviated in the second way where you are reading each line on the go.

June 29, 2012

Puzzle: Multiplexer Trees

Are you comfortable making multiplexer trees? If you are, then try this one:

How many 2:1 MUXes would you require to make an n:1 MUX, where n is any integer greater than or equal to 2?



Answer is n-1.

Solution:

Each 2:1 MUX reduces the number of candidate signals by exactly one, so going from n inputs down to a single output requires exactly (n-1) MUXes. This can be proven formally by the principle of mathematical induction.
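A quick sanity check of the count (the helper name is mine):

```python
def muxes_needed(n):
    # number of 2:1 MUXes to build an n:1 MUX as a tree
    if n == 1:
        return 0            # a 1:1 "MUX" is just a wire
    # pair up the n inputs: n // 2 MUXes produce ceil(n/2) intermediate signals
    return n // 2 + muxes_needed((n + 1) // 2)

# each 2:1 MUX removes one signal, so the total is always n - 1
assert all(muxes_needed(n) == n - 1 for n in range(2, 100))
```

The recursion mirrors the tree construction: a first rank of floor(n/2) MUXes, then the same problem on ceil(n/2) signals.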



PVTs and How They Impact Timing

PVT is an acronym for Process-Voltage-Temperature.

PVT corners model variations in process, voltage and temperature. There's another term, OCV, which refers to On-Chip Variation. PVTs model inter-chip (die-to-die) variations, while OCVs model intra-chip variations.
We'll talk about OCVs in some other post.

Let's talk about PVTs in detail:

1) Process: You must have heard people talking in terms of process values like 90nm, 65nm, 45nm and other technology nodes. These values are characteristic of a technology and represent the channel length between the Source and Drain of a MOS transistor that you might have studied in your under-grad courses. While manufacturing dies, it has been seen that the dies present at the center of the wafer are pretty accurate in their process values, but the ones lying on the periphery tend to deviate. The deviation is not big, but it can have a significant impact on timing.
Recall from your undergrad courses the following formula for current flowing in a MOS transistor:

                                             I_D = (μ·C_ox/2) · (W/L) · (V_GS − V_t)²
L represents the process value (the channel length). For the same temperature and voltage, the current for a 45nm process would be more than the current for a 65nm process.
The more the current, the faster the charging/discharging of capacitors. And this means the delays are smaller.


2) Voltage: The voltage that any semiconductor chip works on is supplied externally. Recall that while working on breadboards in your labs, you used to connect a 5V supply to the Vcc pin of your IC. Modern chips work on much lower voltages, typically around 1V-1.2V.
This voltage is the output of either a DC source or perhaps a voltage regulator, and the regulator's output might not stay constant over time. Let's say you expected your voltage regulator to give 1.2V, but after 4 years its output dropped to 1.08V or increased to 1.32V. So, you gotta make sure your chip works well between 1.08V and 1.32V!
This is where the need to model voltage variations comes into the picture.
From the same equation as above, it can be seen that the higher the voltage, the more the current, and hence the smaller the delays.
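A first-order sketch tying the square-law current to delay, using the 1.08V and 1.32V corners above (all constants normalized; purely illustrative):

```python
def relative_delay(vdd, vt=0.4, k=1.0, c=1.0):
    # delay ~ C * V / I_drive, with square-law drive I_drive ~ k * (V - Vt)^2
    # vt, k and c are assumed, normalized values for illustration only
    return c * vdd / (k * (vdd - vt) ** 2)

# the low-voltage corner is slower than the high-voltage corner
assert relative_delay(1.08) > relative_delay(1.32)
```

The (V − Vt)² term in the denominator grows faster than the V in the numerator, which is why raising the supply speeds the cell up, at the cost of the V² term in dynamic power.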


3) Temperature: The ambient temperature also impacts timing. Let's say you are working on a gadget on the Siachen glacier, where the temperature can drop to -40 degrees centigrade in winter, and you expect your device to work fine. Or maybe you are in the Sahara desert, where the ambient temperature is +50 degrees and your car engine temperature is +150 degrees, and again you expect your chip to work fine. While designing, therefore, STA engineers need to make sure that their chip will function correctly at temperatures between -40 and +150 degrees.

The higher the temperature, the greater the collision (scattering) rate of electrons within the device. This increased scattering impedes the movement of the other electrons. Since electron movement is responsible for the current flowing in the device, the current decreases as temperature increases. Therefore, delays are normally larger at higher temperatures.


For technology nodes below 65nm, there's a phenomenon called TEMPERATURE INVERSION, where delays tend to increase with decreasing temperature. We shall talk about it later; don't get confused by it here.

WORST PVT: Process worst, Voltage min, Temperature max
BEST PVT: Process best, Voltage max, Temperature min
WORST COLD PVT: Process worst, Voltage min, Temperature min
BEST HOT PVT: Process best, Voltage max, Temperature max

STA engineers are responsible for closing timing (i.e. setup and hold) at all these PVT corners.
So, next time you hear an STA engineer cribbing about his timing status across multiple PVTs, please show him some empathy!

Some related topics that we would discuss in upcoming posts:
1) On-Chip Variations and how they differ from PVT.
2) Temperature Inversion.
3) Factors affecting delays of standard cells.

Stay tuned for updates.