What s interesting is that AFAICT from my mpw 1 design 1 the open-source-silicon.dev #mpw-one-silicon

tnt

10/18/2021, 6:29 PM

What's interesting is that AFAICT from my mpw-1 design (1) the tree is "balanced" in the sense that there is the same number of buffers in each branch of the hold violations. However, the different parasitics of each branch made it unbalanced delay-wise ... and (2) there was some SPEF extraction, so at least "some" parasitics were extracted from the design.

tnt

10/18/2021, 6:33 PM

For instance :

tnt

10/18/2021, 6:33 PM

Startpoint: _19752_ (rising edge-triggered flip-flop clocked by wb_clk_i)
Endpoint: _19222_ (rising edge-triggered flip-flop clocked by wb_clk_i)
Path Group: wb_clk_i
Path Type: min

  Delay    Time   Description
---------------------------------------------------------
   0.00    0.00   clock wb_clk_i (rise edge)
   0.00    0.00   clock source latency
   0.01    0.01 ^ wb_clk_i (in)
   0.08    0.09 ^ clkbuf_0_wb_clk_i/X (sky130_fd_sc_hd__clkbuf_16)
   0.11    0.19 ^ clkbuf_1_0_0_wb_clk_i/X (sky130_fd_sc_hd__clkbuf_1)
   0.09    0.28 ^ clkbuf_1_0_1_wb_clk_i/X (sky130_fd_sc_hd__clkbuf_1)
   0.08    0.36 ^ clkbuf_1_0_2_wb_clk_i/X (sky130_fd_sc_hd__clkbuf_1)
   0.10    0.46 ^ clkbuf_1_0_3_wb_clk_i/X (sky130_fd_sc_hd__clkbuf_1)
   0.20    0.66 ^ clkbuf_2_0_0_wb_clk_i/X (sky130_fd_sc_hd__clkbuf_1)
   0.09    0.75 ^ clkbuf_3_1_0_wb_clk_i/X (sky130_fd_sc_hd__clkbuf_1)
   0.10    0.85 ^ clkbuf_3_1_1_wb_clk_i/X (sky130_fd_sc_hd__clkbuf_1)
   0.13    0.99 ^ clkbuf_4_3_0_wb_clk_i/X (sky130_fd_sc_hd__clkbuf_1)
   0.09    1.08 ^ clkbuf_5_7_0_wb_clk_i/X (sky130_fd_sc_hd__clkbuf_1)
   0.09    1.17 ^ clkbuf_6_15_0_wb_clk_i/X (sky130_fd_sc_hd__clkbuf_1)
   0.16    1.33 ^ clkbuf_7_31_0_wb_clk_i/X (sky130_fd_sc_hd__clkbuf_1)
   0.00    1.33 ^ _19752_/CLK (sky130_fd_sc_hd__dfxtp_4)
   0.25    1.58 v _19752_/Q (sky130_fd_sc_hd__dfxtp_4)
   0.06    1.64 ^ _11426_/Y (sky130_fd_sc_hd__nor2_4)
   0.09    1.73 ^ _11427_/X (sky130_fd_sc_hd__a211o_4)
   0.02    1.75 v _11428_/Y (sky130_fd_sc_hd__inv_2)
   0.00    1.75 v _19222_/D (sky130_fd_sc_hd__dfxtp_4)
           1.75   data arrival time

   0.00    0.00   clock wb_clk_i (rise edge)
   0.00    0.00   clock source latency
   0.02    0.02 ^ wb_clk_i (in)
   0.20    0.22 ^ clkbuf_0_wb_clk_i/X (sky130_fd_sc_hd__clkbuf_16)
   0.28    0.50 ^ clkbuf_1_1_0_wb_clk_i/X (sky130_fd_sc_hd__clkbuf_1)
   0.27    0.76 ^ clkbuf_1_1_1_wb_clk_i/X (sky130_fd_sc_hd__clkbuf_1)
   0.24    1.00 ^ clkbuf_1_1_2_wb_clk_i/X (sky130_fd_sc_hd__clkbuf_1)
   0.24    1.24 ^ clkbuf_1_1_3_wb_clk_i/X (sky130_fd_sc_hd__clkbuf_1)
   0.49    1.73 ^ clkbuf_2_3_0_wb_clk_i/X (sky130_fd_sc_hd__clkbuf_1)
   0.35    2.08 ^ clkbuf_3_7_0_wb_clk_i/X (sky130_fd_sc_hd__clkbuf_1)
   0.35    2.43 ^ clkbuf_3_7_1_wb_clk_i/X (sky130_fd_sc_hd__clkbuf_1)
   0.40    2.83 ^ clkbuf_4_14_0_wb_clk_i/X (sky130_fd_sc_hd__clkbuf_1)
   0.30    3.13 ^ clkbuf_5_28_0_wb_clk_i/X (sky130_fd_sc_hd__clkbuf_1)
   0.24    3.37 ^ clkbuf_6_56_0_wb_clk_i/X (sky130_fd_sc_hd__clkbuf_1)
   0.90    4.27 ^ clkbuf_7_112_0_wb_clk_i/X (sky130_fd_sc_hd__clkbuf_1)
   0.00    4.28 ^ _19222_/CLK (sky130_fd_sc_hd__dfxtp_4)
  -0.13    4.14   clock reconvergence pessimism
  -0.01    4.14   library hold time
           4.14   data required time
---------------------------------------------------------
           4.14   data required time
          -1.75   data arrival time
---------------------------------------------------------
          -2.39   slack (VIOLATED)
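
For anyone following along, the hold slack in the report above can be reproduced from its own lines. This is only a sketch using the rounded values printed in the report; the function and variable names are made up for illustration and are not part of OpenSTA.

```python
# All values in ns, copied from the rounded figures in the report above.
data_arrival = 1.75           # arrival at _19222_/D (launch side, fast corner)
capture_clock_arrival = 4.28  # arrival at _19222_/CLK (capture side, slow corner)
crpr = -0.13                  # clock reconvergence pessimism line
hold_time = -0.01             # library hold time line, with the sign shown in the report


def hold_slack(arrival, clock_arrival, crpr_adjust, hold):
    """Return (data required time, slack) for a hold check, rounded to 2 decimals."""
    required = clock_arrival + crpr_adjust + hold
    return round(required, 2), round(arrival - required, 2)


required, slack = hold_slack(data_arrival, capture_clock_arrival, crpr, hold_time)
print(f"data required time: {required:.2f} ns")
print(f"slack: {slack:.2f} ns ({'VIOLATED' if slack < 0 else 'MET'})")
```

Almost all of the negative slack comes from the gap between the two clock arrival times (1.33 ns at the launch flop versus 4.28 ns at the capture flop), not from the data path itself.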

tnt

10/18/2021, 6:46 PM

Mmm .. digging a little more into this, the above might be unnecessarily pessimistic.

tnt

10/18/2021, 6:47 PM

Because it uses minimum timing for one path and max timing for the other, but realistically a part of the chip can't be at 100C while the other is at negative 40C ...

tnt

10/18/2021, 6:54 PM

Running it again using only typical timing is ... well, still a violation, but less so (0.4 ns)

mehdi

10/18/2021, 7:28 PM

@tnt what are the .libs you are using in your design?

tnt

10/18/2021, 7:29 PM

This was

sky130_fd_sc_hd__ff_n40C_1v95.lib

sky130_fd_sc_hd__ss_100C_1v60.lib

tnt

10/18/2021, 7:29 PM

(note that I'm running this on my mpw-1 design using the mpw-1 tools, more as an academic exercise of "what should I have noticed a year ago and missed ...")

mehdi

10/18/2021, 7:45 PM

understood and thanks. but you used those 2 corners for 2 different libs? they can't be used for one lib? or am I misunderstanding something

tnt

10/18/2021, 7:47 PM

The default sdc does both a

read_liberty -min

and

read_liberty -max

to load both corners. And then it uses the slow one for setup analysis and the fast one for hold analysis. So far makes sense. However it seems that when doing clock network propagation it uses different corners for the source and destination clocks which sounds needlessly pessimistic.

mehdi

10/18/2021, 7:55 PM

ahh .. can you point me where you see that please!

tnt

10/18/2021, 7:58 PM

Look in the report I posted above.

tnt

10/18/2021, 7:59 PM

The first segment of the source and destination clock paths is the same segment, but it is reported with different delays:

   0.02    0.02 ^ wb_clk_i (in)
   0.20    0.22 ^ clkbuf_0_wb_clk_i/X (sky130_fd_sc_hd__clkbuf_16)

tnt

10/18/2021, 7:59 PM

   0.01    0.01 ^ wb_clk_i (in)
   0.08    0.09 ^ clkbuf_0_wb_clk_i/X (sky130_fd_sc_hd__clkbuf_16)

mehdi

10/18/2021, 9:12 PM

Hmm are you posting two different timing reports? I am actually not following (sorry). If you are just reporting timing (STA) then you'll get a pessimistic timing with 100C and optimistic with 40C (nominal voltage). I think I missed something in your description (I have read it twice though)

tnt

10/18/2021, 9:16 PM

See my post above : https://skywater-pdk.slack.com/archives/C02GTGEB5T3/p1634582035113800?thread_ts=1634581780.113200&cid=C02GTGEB5T3

tnt

10/18/2021, 9:17 PM

This is a single report entry. But it contains the clock tree path delay for both the source FF and the destination FF, and the beginnings of those paths share common segments.

mehdi

10/18/2021, 9:18 PM

Thanks! So you are saying in the same report the reported values are different? And the tool is mixing up both? Have you tried this with a recent version of OR? Can I have access to your testcase?

mehdi

10/18/2021, 9:23 PM

One issue I noticed during my own experiments is that some of the paths weren't correctly annotated and were missed by timing analysis. Those paths weren't critical though. But this is based on OR from August.

Matt Liberty

10/19/2021, 2:21 PM

@Tom Spyrou there is confusion about how corners are handled in sta - I think you could help clarify how multi-corner analysis should be done. Perhaps we even need to add to the docs

Maximo Balestrini

10/22/2021, 11:36 PM

@tnt using different .libs for the part of the path they have in common would be outright wrong rather than just pessimistic, right? (I'm new to all this)

Matt Liberty

10/23/2021, 3:28 AM

STA will handle the common clock portion correctly - the analysis is called clock reconvergence pessimism removal (crpr).
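
As a rough illustration of what CRPR does with the report posted earlier in this thread: the only clock segment shared by the launch and capture paths ends at clkbuf_0_wb_clk_i/X, and the credit is the difference between the late and early arrival at that pin. The sketch below uses the report's rounded values; the variable names are illustrative, not OpenSTA's.

```python
# Arrival times (ns) at the last common clock pin, clkbuf_0_wb_clk_i/X,
# copied from the two clock paths of the report earlier in the thread.
common_pin_early = 0.09  # launch clock path (min timing)
common_pin_late = 0.22   # capture clock path (max timing)

# The same physical buffer cannot be both fast and slow on the same edge,
# so the difference on the shared segment is handed back as a credit.
crpr_credit = round(common_pin_late - common_pin_early, 2)

print(f"CRPR credit: {crpr_credit:.2f} ns")
```

That 0.13 ns matches the "clock reconvergence pessimism" line in the report, which also shows that CRPR only covers the shared root buffer; everything downstream of it is still compared across the two libraries.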

Matt Liberty

10/23/2021, 3:29 AM

The key is to differentiate on-die variation (variation across a single die) from multi-corner analysis.

Matt Liberty

10/23/2021, 3:31 AM

A decent writeup : https://vlsi.pro/common-path-clock-reconvergence-pessimism-removal/

👍 2

Maximo Balestrini

10/23/2021, 7:25 AM

thanks! I was missing that info about crpr.

tnt

10/23/2021, 10:12 AM

@Matt Liberty Yes, I saw the CRPR in OpenSTA, but even for the non-common part it's using the different min/max libs previously loaded. And as you say, I could see how this is useful if you load min/max libs representing on-die variations, but do we have those ? AFAICT the only ones we have are timings for the various corners and using those is overly pessimistic (that's what I was pointing out). I think this also means the various reports should be done in several OpenSTA calls and not in the same script since you need to load different libs. (unless re-issuing a

read_liberty -min

overrides the previous one ?).
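
To put numbers on the non-common part, here is a sketch that sums the per-buffer delays after the shared root buffer, copied from the two clock paths in the report earlier in the thread. The list names are made up for illustration, and the sums are of the rounded values printed in the report.

```python
# Per-stage delays (ns) after clkbuf_0_wb_clk_i/X, in report order.
launch_stage_delays = [0.11, 0.09, 0.08, 0.10, 0.20, 0.09,
                       0.10, 0.13, 0.09, 0.09, 0.16]   # fast (-min) library
capture_stage_delays = [0.28, 0.27, 0.24, 0.24, 0.49, 0.35,
                        0.35, 0.40, 0.30, 0.24, 0.90]  # slow (-max) library

# Same buffer count on both branches: the tree is balanced structurally.
assert len(launch_stage_delays) == len(capture_stage_delays)

launch_total = round(sum(launch_stage_delays), 2)
capture_total = round(sum(capture_stage_delays), 2)
skew = round(capture_total - launch_total, 2)

print(f"launch branch:  {launch_total:.2f} ns over {len(launch_stage_delays)} buffers")
print(f"capture branch: {capture_total:.2f} ns over {len(capture_stage_delays)} buffers")
print(f"skew on the non-common part: {skew:.2f} ns")
```

Both branches have the same number of clkbuf_1 stages, yet the capture branch comes out more than three times slower. How much of that is real parasitic imbalance and how much comes from pairing the ff_n40C and ss_100C libraries cannot be separated from this single report.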

Matt Liberty

10/23/2021, 4:42 PM

Yes there is an on-going discussion about getting OL to properly do multi-corner analysis. The current setup is treating corners as on-die variation which is wrong and gives too much pessimism.

✅ 2
