January 04, 2014

The Legend of Synchronizer

A long time ago, there were two warriors: Clock Domain 1 and Clock Domain 2. Each had its own clock, CLK-1 and CLK-2 respectively, and each insisted on being the master of its own will! The frequencies and phases of the two clocks were independent of each other.


In the absence of any such relationship, the two clock edges can land precariously close to each other, which can in turn cause a setup or hold violation at FF-2, the capturing flop in Clock Domain 2. The metastable output of FF-2 could then propagate through the system and corrupt the computation of the entire chip. FF-S steps in to resolve the impasse and urges the two clock domains to have a deterministic relationship between their clocks.


But the two clock domains refuse to budge! Neither of them is willing to curtail the freedom of the clock it so deeply cherishes!

Here FF-S sacrifices its own output and offers a solution to the clock domains.

That's how FF-S ensured the prosperity of the entire SoC by suffering metastability at its own output. And ever since, it has been called The SYNCHRONIZER Flop! Note that the synchronizer flop must be clocked by the clock of the second (receiving) domain.
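
For readers who like to poke at things, here is a minimal behavioral sketch in Python (not RTL) of FF-S followed by FF-2, both clocked by CLK-2. The class and function names are hypothetical, and the model assumes, optimistically, that a metastable FF-S always resolves to either the old or the new value within one CLK-2 period. Real silicon only makes that very likely, not certain.

```python
import random


class TwoFlopSynchronizer:
    """Toy model of FF-S followed by FF-2, both clocked by CLK-2."""

    def __init__(self, rng=None):
        self.m = 0    # output of FF-S (the node that may go metastable)
        self.q2 = 0   # output of FF-2 (what clock domain 2 actually uses)
        self.rng = rng or random.Random()

    def clk2_edge(self, q1, violates_timing=False):
        """Apply one rising edge of CLK-2.

        q1: value driven by clock domain 1 at this edge.
        violates_timing: True if q1 changed inside FF-S's setup/hold window.
        """
        # FF-2 captures the value FF-S held *before* this edge.
        self.q2 = self.m
        if violates_timing:
            # Assumption: metastability resolves within one CLK-2 period,
            # unpredictably, to either the old or the new value.
            self.m = self.rng.choice([self.m, q1])
        else:
            self.m = q1
        return self.q2


def edges_until_seen(rng):
    """Count CLK-2 edges until Q2 shows a 0 -> 1 change on Q1.

    Q1 rises right at the first CLK-2 edge (a timing violation) and is
    then held stable, as in the story above.
    """
    sync = TwoFlopSynchronizer(rng)
    sync.clk2_edge(1, violates_timing=True)
    edges = 1
    while sync.q2 != 1:
        sync.clk2_edge(1)
        edges += 1
    return edges


for seed in range(5):
    print("seed", seed, "-> Q2 updated after",
          edges_until_seen(random.Random(seed)), "CLK-2 edges")
```

In this model, whichever way FF-S resolves, Q2 picks up the new value on the second or third CLK-2 edge, provided Q1 is held stable long enough. The only cost of a "wrong" resolution is one extra cycle of latency.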

Over the years, it has become common practice to use up to three such synchronizer flops in series, depending on the application.

I'll talk shortly about the concept of Mean Time Before Failure (MTBF), which is very closely related to metastability.
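
As a rough preview, a commonly quoted first-order model is MTBF = exp(t_r / tau) / (T0 * f_clk * f_data), where t_r is the time allowed for metastability to resolve, tau and T0 are constants of the flop, f_clk is the receiving clock frequency and f_data is the toggle rate of the incoming signal. The sketch below assumes that model. Every number in it is made up for illustration and does not come from any real library or datasheet, and the function names are hypothetical.

```python
import math


def mtbf_seconds(t_resolve, tau, t_window, f_clk, f_data):
    """First-order model: exp(t_resolve / tau) / (t_window * f_clk * f_data)."""
    return math.exp(t_resolve / tau) / (t_window * f_clk * f_data)


def hours_to_years(hours):
    """Convert hours to years, taking a year as 365 days."""
    return hours / (24 * 365)


# Hypothetical parameters, for illustration only.
F_CLK = 100e6        # CLK-2 frequency in Hz
F_DATA = 10e6        # toggle rate of the asynchronous input in Hz
TAU = 200e-12        # resolution time constant in seconds
T_WINDOW = 100e-12   # metastability window (T0) in seconds
PERIOD = 1 / F_CLK

# Assumption: each synchronizer flop buys about 90% of a CLK-2 period
# of extra resolution time.
for n_sync_flops in (1, 2, 3):
    t_resolve = n_sync_flops * 0.9 * PERIOD
    hours = mtbf_seconds(t_resolve, TAU, T_WINDOW, F_CLK, F_DATA) / 3600
    print(n_sync_flops, "synchronizer flop(s):",
          f"{hours_to_years(hours):.3g}", "years")
```

Because the resolution time sits in the exponent, each extra synchronizer flop multiplies the MTBF by a large factor instead of adding to it. That is why two or three flops are usually enough.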

17 comments:

  1. Hi Naman,
    In the last figure, what if the metastable output (M) from FF-S does not settle to a stable value by the time it is clocked into FF-2? That might cause the output Q2 to go metastable as well.
    Do we need to introduce another synchronizing flop to counter this problem, or is there some other solution?

  2. Hi Deepika,

    You have a valid point. But the use of a synchronizer is all about probability. As I mentioned, the MTBF (Mean Time Before Failure) is the average time before the system could experience an error due to metastability. The required MTBF is governed by the application. In critical applications, designers can choose to use 2 synchronizers (or maybe 3), which reduces the probability of error (increases the MTBF). Theoretically, it is always possible for output Q2 to go metastable, irrespective of the number of synchronizing flops used.

    I'd like to quote a practical example: a typical hard disk using two synchronizing flops has an MTBF (which is a function of the frequencies of the two clocks) of 300,000 hours. This translates into ~34 years!

  3. Hi Naman,

    Even if we use a synchronizer flop, the value its output settles to may not be the same as the expected next state. How is that taken care of?

    Thanks,
    Rahul

    Replies
    1. Rahul,

      Since, in this case with a 2-stage synchronizer, clock domain 2 samples the data on the 3rd clock edge, the assumption is that metastability gets resolved within 2 clock cycles. As I mentioned in the comment above, metastability can theoretically linger on forever. Designers therefore need to decide the number of synchronizer stages based on the application and the MTBF specification. Using a 3-stage synchronizer increases the MTBF and reduces the probability of any spurious data being passed on to clock domain 2.

      I hope I was able to answer your query.

      Thanks,
      Naman

    2. Hi Naman,

      I understand that the two-stage synchronizer allows more time for the data to settle to a stable value. My concern is this: say FF1 outputs a '0' and the two-stage synchronizer settles to a '1' instead after metastability. Isn't that going to be a problem as well?

      Thanks,
      Rahul

    3. Hi Rahul,
      If you look closely at the waveform in the diagram, Q1 is stable for more than two cycles of Clk2. So even if M settles to 0 at the end of the first cycle, FF-S will re-sample the data on the second cycle and go to the desired value. Observe that there is no setup violation on the 2nd cycle since Q1 is stable. So, when FF2 samples the data on the third cycle of Clk2, the data is stable. But this leads to another requirement: if Clk1 were twice as fast as Clk2, we would need Q1 to be held stable for two cycles of Clk2, or we would need a different design.

      --
      Shrikant

    4. I had the same doubt as Rahul. Thanks for the explanation, Shrikant.

    5. This comment has been removed by the author.

    6. Hi Rahul,
      Can you write a post about asynchronous FIFOs?

    7. Hi Shrikant,
      You explained that "Q1 is stable for more than two cycles of Clk2. So even if M settles to 0 at the end of first cycle, it will re-sample the data on second cycle and go to the desired value".

      But what happens at the first sample of FF2? It will sample the wrong '0' and propagate it to the entire circuit.
      Doesn't that mean we have to set a multi-cycle path (of 2 periods) between FF1 and FF2?

  4. Hi Naman,
    It's a wonderful blog. Thanks for sharing so much information.
    But I also have the same question as Shrikant did.
    Please explain asap.
    Regards,
    Harry

    Replies
    1. Thanks, Harry! :)

      I got confused. Please post your question again as I couldn't find Shrikant's doubt.

    2. Hi Shrikant,

      Sorry, I still don't quite get it. Referring to the example you used in responding to Rahul's question, won't FF2 get the incorrect value ('0', the value M settled to at the end of the first cycle) at the 2nd cycle, which will then be passed on to whatever circuit is connected to Q2? I would appreciate it if you could shed some light on this. A diagram with a waveform might help. Thanks.

      Regards,
      Nick

  5. Naman
    What are the timing considerations after inserting the 2-flop synchronizers?
    And what if we have 2 pairs of synchronizers in a single data path... [ 2-flop syncs ------- Comparator -------- 2-flop syncs] ?

  6. CDC never guarantees that data and valid are passed as expected. It only ensures that no metastable value enters the system. We would need additional checkers at both the sender and receiver ends to make sure that what we intend is what we see at both ends. If not, we might have to play with the number of FFs or use another style of synchronization.

    Replies
    1. CDC also cannot ensure that there are no metastable values in the system. It just reduces the probability of metastability propagating. If we increase the number of synchronizers, we're effectively increasing the settling time and increasing the MTBF. That only reduces the probability of failure (or lengthens the average time before it); it never eliminates it.
