May 15, 2013

Common Path Pessimism

Common Path Pessimism is a frequent source of extra pessimism in timing analysis. Before we delve further into this, note that pessimism can be of two types: intended and unwanted. Intended pessimism includes, for example, adding extra uncertainty for clock skew before the CTS (Clock Tree Synthesis) stage, or extra uncertainty for noise before SI (Signal Integrity) analysis. It is often prudent to take this pessimism upfront in your design because it avoids surprises when you move from one stage to the next.

Having said that, into which category do you reckon Common Path Pessimism should fall? Let's define it first and then take an objective look at it.

When a pair of launching and capturing flops shares some portion of the clock path, the difference between the max and min delay of that common clock segment is referred to as Common Path Pessimism. We discussed the rationale behind the use of timing derates briefly in the post: OCV vs PVT. Note that the entire timing analysis revolves around this intended pessimism, where the basic aim is to make the timing paths more critical so as to avoid surprises in silicon. Although EDA tools themselves carry a fair amount of pessimism, it is still prudent for STA engineers to add some uncertainty/pessimism of their own to the timing analysis.

Convince yourself that:
  • Setup check would be most critical when clock reaches the launching flop late and capturing flop early; and the data path takes more delay.
  • Hold check would be most critical when clock reaches the launching flop early, capturing flop late and data path takes less delay.
Consider the following example with no common clock path and note that we have just applied the above principle to add pessimism in timing analysis.


So, while doing setup analysis, the clock tree buffers in the launching path would be derated by +5% and those in the capturing path by -5%. The data path would be derated by +5%.
While doing hold analysis, it would be the opposite. The clock tree buffers in the launching path would be derated by -5% and those in the capturing path by +5%. The data path would be derated by -5%.
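
The derating rules above can be sketched numerically. The following is a minimal Python sketch, assuming a simplified slack formula and hypothetical nominal delays in ns (none of these numbers come from the example figure); the function names are illustrative and not taken from any EDA tool.

```python
DERATE = 0.05  # assumed 5% OCV derate, matching the example above


def setup_slack(launch_clk, capture_clk, data, period, t_setup, derate=DERATE):
    """Setup: launch clock late (+), data path slow (+), capture clock early (-)."""
    arrival = launch_clk * (1 + derate) + data * (1 + derate)
    required = period + capture_clk * (1 - derate) - t_setup
    return required - arrival


def hold_slack(launch_clk, capture_clk, data, t_hold, derate=DERATE):
    """Hold: launch clock early (-), data path fast (-), capture clock late (+)."""
    arrival = launch_clk * (1 - derate) + data * (1 - derate)
    required = capture_clk * (1 + derate) + t_hold
    return arrival - required


# Hypothetical nominal delays in ns (illustrative only).
launch_clk, capture_clk, data = 1.0, 1.0, 2.0
s = setup_slack(launch_clk, capture_clk, data, period=4.0, t_setup=0.1)
h = hold_slack(launch_clk, capture_clk, data, t_hold=0.05)
print(f"setup slack = {s:.3f} ns, hold slack = {h:.3f} ns")
```

Even though the launch and capture clock paths have the same nominal delay here, the opposite derates open up a skew of 10% of that delay, which eats into both the setup and the hold slack.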

How would the situation change when there's a common clock path? Let's take a look.
Ideally speaking, for setup analysis, we would like to take the +5% derated value of the delay of these buffers while considering the launching path and the -5% derated value while considering the capture path. However, here lies the catch! How can the same buffer or set of buffers be derated differently for launch and capture? Recall from the definition of OCV that it is the intra-chip variation in PVT that leads STA engineers to apply derates in the first place.

However, these common buffers sit at the same location on the die, so at any given time they behave in the same manner. It does not make sense to consider different delays for the same buffers. This is the origin of common path pessimism, and it is usually unwanted. What we can do (or rather, what EDA tools tend to do) is perform the calculation as if no part of the clock path were common, and then add back to the slack the difference between the two derated values of the common buffers, which would be 10% of the delay of the three common buffers in this case. This is referred to as Common Path Pessimism Removal.
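
The removal step described above can be sketched as follows. This is a minimal Python sketch, assuming ±5% derates, three hypothetical common buffers of 0.1 ns each, and made-up delays for the rest of the path; real tools compute the credit per path from their own delay models, so treat the names and numbers as illustrative only.

```python
def cppr_credit(common_delays, late=1.05, early=0.95):
    """Pessimism on the common segment: late-derated minus early-derated delay."""
    total = sum(common_delays)
    return total * late - total * early


def setup_slack_with_cppr(common_delays, launch_only, capture_only, data,
                          period, t_setup, late=1.05, early=0.95):
    """Return (raw slack, slack after adding the CPPR credit back)."""
    common = sum(common_delays)
    # First pass: derate the whole clock path as if nothing were shared.
    launch_clk = (common + launch_only) * late
    capture_clk = (common + capture_only) * early
    raw = (period + capture_clk - t_setup) - (launch_clk + data * late)
    # Second pass: give back the pessimism taken on the shared buffers.
    return raw, raw + cppr_credit(common_delays, late, early)


# Three hypothetical common buffers, 0.1 ns nominal delay each.
common = [0.1, 0.1, 0.1]
raw, corrected = setup_slack_with_cppr(
    common, launch_only=0.2, capture_only=0.2, data=2.0,
    period=4.0, t_setup=0.1,
)
print(f"raw = {raw:.3f} ns, after CPPR = {corrected:.3f} ns")
```

With these assumed numbers the credit is 10% of the 0.3 ns common delay, i.e. 0.03 ns, which is exactly the amount by which the corrected slack exceeds the raw one.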

16 comments:

  1. i tried to post my query at "post your query" place but i think its not working.. so i am posting here only...
    ""Let us assume that EN, D1, D2 and CK are the inputs and Q1, Q2 are the outputs for a module.There are two D-flips flops in the circuit.
    Flop1 : CK1 is the clock and D1 is the ‘D’ input.
    Flop2 : CK2 is the clock and D2 is the ‘D’ input.
    If EN=1 then CK1 should get the CK and if EN=0 then CK2 should get the CK.
    Note: The cycle time and duty cycle for all the clocks CK, CK1 and CK2 has to be same""... actually i need verilog code for this!! but if i get design then i can write verilog code...

    Replies
    1. I reckon, you are looking for a circuit which goes like this:
      Flop 1 will have a 2-input AND gate at its clock input CK1, with one input being EN, and other being CK.
      Flop 2 will have another 2-input AND gate at its clock input CK2, with one input being !EN and other being CK.

      Hope I was able to answer it correctly!! :)

    2. i think u missed out "IF" condition.. according to me, there will be a demux with ck input and en as a select signal. that will generate ck1, ck2 ok???
      now my question is how i can get cycle time and duty cycle same??
      Note: The cycle time and duty cycle for all the clocks CK, CK1 and CK2 has to be same"".

    3. You can't simply use a DEMUX here. Because you need to retain the clock signal CK as well. Using the above solution, I strongly feel that the cycle time and duty cycle for all three clocks would be the same.

      Maybe I didn't understand your question correctly. Why do you think the above solution wouldn't work? It might help to know the problem statement better through a figure.

    4. but for selecting CK1 and CK2 i need to use DEMUX right???? because CK is the main clock that is connecting to CK1 and CK2 , by the EN pin.
      i have drawn a design but i am not able to post it here.. can u give me ur email id or some other way to show u the design.??

    5. Please mail me the design at my.personal.log@gmail.com

  2. Hi,

    How about CRPR removal in the case of SI analysis, especially for setup? Do we not want to remove it for SI setup, and how do EDA tools handle it?

    Replies
    1. You brought up an interesting point here which could be a veritable topic for another blog post.

      To summarize, noise is basically the effect of the activity on the nets in the immediate vicinity of the net in question. Since setup is a next-cycle check, launch and capture use two different clock edges, and the noise on the common path may differ between them. Hold is a same-cycle check, so both use the same edge and the noise would be the same. Hence, to eliminate the extra uncertainty due to noise, we remove it for hold.

      Modern EDA tools have switches with possible values as true/false, which the STA engineers can set as per their needs.

      Thanks!

  3. Hi ..
    Please correct me if I am wrong ... for calculating the setup slack the cross-talk value is added in the common path and to the datapath cell delay and cross-talk value is subtracted in the common path and to the clock path cell delay ...

    For SI ..
    Only difference in hold slack calculation is that the CPPR is added ... But CRPR is not considered for the setup slack calculation...

    Please tell me whether I am correct or not ?

  4. Hi Ajay.

    For setup analysis, the cross talk in the common path would be added for launch, and subtracted from capture. And cross-talk would always be added to the data path cell delay.

    For hold analysis, cross-talk in the common clock path is an unwanted pessimism and STA Team can use their discretion on how they wish to use it. But cross-talk will always be subtracted from the data-path cell delay.

    CRPR: Clock Reconvergence Pessimism is dependent on the architecture. If the architecture proscribes such a scenario, we need not consider CRPR at all. However, if such a case is plausible, one need only take it into account for setup. Because setup is a next-cycle check, it is possible that launch is from the longest clock path and capture is across the shortest clock path, or vice versa. However, hold being a same-cycle check, CRPR is never a valid case and one must not take it into account.

    I hope I understood your question correctly. Please feel free to point out if there's still any confusion!

    Thanks!

    Replies
    1. Hi Naman,
      I understood the concept what you told.. Here I want to explain with an example to you about the understanding ..

      If there is a path from ff1 to ff2 and there is a buffer common to this path (having delay of 100ps) and if a cross talk of 20ps delta is affecting that buffer ...

      CRPR = 120 - 80 = 40ps

      For Setup calculation: data_req - data_arrival
      => (clock_period + 80 - setuptime of ff2 + CRPR) - (120 + ff1clk-qdel + datapath delay)
      => (clock_period + 80 - setuptime of ff2 + 40) - (120 + ff1clk-qdel + datapath delay)

      For Hold calculation : data_arrival - data_req
      => (80 + ff1clk-qdel + datapath delay) - ( 120 + hold-time of ff2 + CRPR)

      Here for Hold CRPR factor is not considered ... isn't it ?

      Please let me know whether my understanding is correct or not ?

      thanking you
      Ajay


    2. Hi Ajay,

      I got confounded by the numbers and found it difficult to grasp these numbers in absence of any figure underlying the text. Would it be possible for you to elucidate it more with respect to a figure?

      I'll try and read it again, before getting back to you.

      Thanks!

  5. Hi Naman,
    I tried to put the figure but unable to put the fig.. Its a simple flop to flop ckt with a buffer in a common clock path.. the buffer is in the clock path common to ff1 and ff2..
    Other things are as explained.

    thanking you
    Ajay

    Suppose I have an STA run for a particular corner with me. If I make an incremental change in the data path (some kind of optimization), the clock path is not affected in any way. Can the data path modification affect my CPPR?

  7. why didnt you apply derates for flops?
    my point is that the flops are also cells derived from the library, which means there are slow and fast delays for the flops too. are these to be considered?
