VLSI SoC Design: Low Power Synthesis: Insertion of Clock Gating Cells

April 17, 2013

Low Power Synthesis: Insertion of Clock Gating Cells

Power consumption is a growing concern for modern SoCs, and design engineers today face the arduous task of limiting the power dissipation of their SoCs. It would be unfair to think of the backend design cycle as a magical solution to all power problems. However, modern synthesis EDA tools are smart enough to identify some key RTL constructs and synthesize a low-power equivalent of the structure. We will take a look at one such RTL construct and its equivalent implementation for low-power design.

Consider the following behavioral description:

always @(posedge clk)
begin
    // Load the register only when enable is high; otherwise q holds its value.
    if (enable == 1'b1)
        q[15:0] <= d[15:0];
end

One logical implementation and the corresponding low power implementation of the above description would be:
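
A minimal Verilog sketch of the two implementations, under stated assumptions: the module names reg16_mux, cgic_model and reg16_cg are hypothetical, and cgic_model is only a behavioral stand-in for a latch-based library cell (a latch followed by an AND gate is assumed here as a common CGIC structure). A real flow would instantiate the vendor's cell or let the synthesis tool insert it.

```verilog
// Option 1: mux-based (recirculating) implementation.
// The flops see a clock edge every cycle; a mux feeds q back when enable is low.
module reg16_mux (
    input  wire        clk,
    input  wire        enable,
    input  wire [15:0] d,
    output reg  [15:0] q
);
    wire [15:0] d_mux = enable ? d : q;  // recirculation mux

    always @(posedge clk)
        q <= d_mux;
endmodule

// Behavioral model of a latch-based clock gating cell (hypothetical stand-in
// for a library CGIC, for illustration only).
module cgic_model (
    input  wire clk,
    input  wire en,
    output wire gclk
);
    reg en_latched;

    // Latch is transparent while clk is low and holds while clk is high,
    // so a change on en during the high phase cannot glitch gclk.
    always @(clk or en)
        if (!clk)
            en_latched <= en;

    assign gclk = clk & en_latched;
endmodule

// Option 2: clock-gated implementation.
// The flops see a clock edge only in cycles where enable was high.
module reg16_cg (
    input  wire        clk,
    input  wire        enable,
    input  wire [15:0] d,
    output reg  [15:0] q
);
    wire gclk;

    cgic_model u_cgic (.clk(clk), .en(enable), .gclk(gclk));

    always @(posedge gclk)
        q <= d;
endmodule
```

Both register modules are intended to produce the same q sequence for the same clk, enable and d; the difference is that the clock pins of the 16 flops in reg16_cg stay quiet whenever enable is low.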

Synthesis tools find such RTL constructs and try to convert them into the low-power implementation shown above. Note that the clock gating integrated cell (CGIC) also consumes power, so the clock-gated implementation might not be an expedient solution if the enable is mostly high, or if the number of registers in the register set is small. Therefore, one needs to exercise caution while using or implementing such a structure!
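
As a rough first-order illustration of this trade-off, assume the following symbols, which are introduced here only for the estimate and do not come from any tool or library: $N$ is the number of flops sharing one CGIC, $P_{ff}$ is the clock-pin power of one flop that is clocked every cycle, $P_{cg}$ is the power of the CGIC (which itself sees the clock every cycle), and $\alpha$ is the fraction of cycles in which enable is high. Ignoring the recirculation muxes and any clock-tree differences:

```latex
\begin{align*}
P_{\text{ungated}} &\approx N \, P_{ff} \\
P_{\text{gated}}   &\approx \alpha \, N \, P_{ff} + P_{cg} \\
\Delta P = P_{\text{ungated}} - P_{\text{gated}} &\approx N (1 - \alpha) \, P_{ff} - P_{cg}
\end{align*}
```

Under these assumptions gating pays off only when $N(1-\alpha)P_{ff} > P_{cg}$, which is why a mostly-high enable ($\alpha \to 1$) or a small register set (small $N$) can make the gated version the worse choice.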

17 comments:

Anonymous, April 18, 2013 8:01 PM
Hei Palindrome,

Do you see any power saving if we keep the flops under reset and also gate the clock, compared to normal clock gating? Maybe a dumb question :)

=R

Unknown, April 18, 2013 8:33 PM
No question is dumb, my dear anonymous friend! :)
Note that many gates constitute a flop, and despite the fact that the flop is under reset, the clock signal would still force some gates to switch ON/OFF, though without affecting the output of the flop. By clock gating, we intend to save the power spent on those redundant transitions. Having said that, also note that using one clock gate to gate a single flop might result in more power being consumed! The idea is to use one clock gate for multiple flops; only then can you expect some savings in dynamic power.
Well, to be quite honest, I strongly feel that clock gating the flops (many flops in a bunch) under reset would save power, but I would like to verify this with some simulations! Will keep you in the loop on the result! :)

Thanks for asking the question!! :)

Sunny Aggarwal, April 18, 2013 10:23 PM
I guess keeping the flop in reset won't serve any purpose, as we need to retain the state as well as save power. So clock gating without keeping the flops under reset is the thing to do.

Alex Wilson, April 18, 2013 11:48 PM
Hi Palindrome,

Nice blog! I enjoy reading it on a regular basis. While I agree with Sunny that clock gating while reset is asserted makes little sense, conceptually even I feel that clock gating (as you mentioned, many flops) while reset is asserted would save power! I am looking forward to hearing your simulation result! :)

Regards,
Alex

Sunny Aggarwal, April 20, 2013 10:12 PM
Hi Palindrome,
This clock gating concept, used by many EDA tools, is indeed useful, but I have two doubts:
1. By changing the designer's implementation, aren't these tools making life more difficult during debugging?
2. Even if clock gating is implemented, data can continue to toggle, and the first latch of the flop will be active without doing anything useful. So is there any possibility of gating the data path as well while the clock is gated?

Unknown, April 21, 2013 9:52 PM
Hi Sunny,

1. I only see a concern while running LEC (Logical Equivalence Checking). Other than that, we would never need to debug. To handle this, modern EDA tools can treat the MUX-based enable structures described above as equivalent to the CGIC version.
2. Yes, data can continue to toggle, but we can't gate the data path. Consider a situation where a data path feeds two flops: one can be clock gated, but the other cannot. Hence, the data path cannot be gated. Moreover, note that if all the inputs to the data combinational logic are gated, the data path won't toggle anyway!! :)

Thanks!

Anonymous, April 25, 2013 6:49 PM
Thanks Palindrome for answering my question. This site is my learning place for physical design. :-D

Do you have a picture of a FF with reset? If the reset goes to the master stage of the flip-flop, then we will be saving the transitions due to data switching. In that case, will we save a considerable amount of power?

Do you have any suggestions on the fanout of a CGIC, i.e., the number of flops a clock gating cell should drive?

I assume that if we gate the clock, it switches off the clock tree from that point on, so any idea of the power saving from clock tree switching vs. FF gate switching?

Thanks
=R

Anonymous, June 13, 2013 6:18 PM
Hello Palindrome,

I have a few questions here, but I don’t know whether I can explain them clearly :-D

1) What will happen if EN is driven from a flop that has an async reset? Since the reset assertion is asynchronous, EN will change asynchronously during reset, and there is a chance that EN will be delayed more than the clock, so we may end up getting glitches on the gated clock.

2) Because of the async behaviour of the flop, I assume the tool won’t be able to do the timing check (Rst->Q) for us?

3) Also, what will happen if the flop goes into a metastable state and we don’t provide any clock to it for some duration, and then start providing the clock at a later stage? Will the flop still be in a metastable state? In other words, do we need subsequent clock cycles for a flop to come out of a metastable state?

Thanks,
Bond

Nik-hi-tha, February 03, 2014 5:52 AM
Hi Palindrome,

One question regarding CGICs. Can these CGICs (if there are a lot of them in the design) be driven from a different power domain? I mean, the voltage supplied to the CGIC is lower than the voltage level required for EN, and the voltage of EN is later level-shifted up, in the CGIC ON state, to the voltage level required by the design flops.

A lower-Vth CGIC will save static power along with dynamic switching power. (The power consumed in switching the gating cell would also be less.)

Thanks for the nice post.

Regards
-Varun

Anonymous, March 02, 2014 8:49 PM
Dear Naman,

Nice blog with useful posts. My question is: what timing constraints on the CE (clock enable) path of an ICG are needed at the synthesis and placement stages?

Thanks.
-S

Unknown, July 22, 2018 11:22 AM
Hi Naman,
Please correct me if I am wrong in answering my own question:
1. How will the tool choose between the two possible implementations for the code?
Initially, the tool replaces the code with the mux-based design. Then, according to our constraints (such as the minimum number of flops and the maximum fanout), the portions of the design that meet those constraints are replaced with the CGIC, and the remainder is left as it is.

Thanks
Surya
