# verification-be
m
How long does netgen usually run for people on a user_project_wrapper with ~600k instances (including fill and decap)?
t
Netgen run time depends on a lot of things. The core algorithm is O(log(N)) and is extremely fast no matter how big a design is. However, there is a flattening algorithm that is rather slow, so if some internal cell that is used many times fails to match and gets flattened everywhere, netgen can suddenly take ages to run. The read-in of large netlists is also not particularly efficient, but it shouldn't be overly obnoxious at 600k gates.
m
None of my subcells are failing (they are all blackbox). They all match pins between the designs. The slowdown is basically right after that, at the top level:
Contents of circuit 1:  Circuit: 'user_project_wrapper'                                                                         
Circuit user_project_wrapper contains 629357 device instances.                                                                  
  Class: sky130_fd_sc_hd__clkbuf_16 instances: 122                                                    
  Class: sky130_fd_sc_hd__dfxtp_2 instances: 512                                                      
  Class: sky130_fd_sc_hd__or4_2 instances:  40                                                                                                                                                               
  Class: sky130_fd_sc_hd__buf_1 instances: 287                                                                                                                                                               
  Class: sky130_fd_sc_hd__buf_2 instances:   1                                                        
  Class: sky130_sram_4kbyte_1rw1r_32x1024_8 instances:   1                                                                                                                                                   
  Class: sky130_sram_1kbyte_1rw1r_32x256_8 instances:   1                                                                       
  Class: sky130_fd_sc_hd__inv_2 instances:  14                                                        
  Class: sky130_fd_sc_hd__clkbuf_1 instances: 120                                                     
  Class: sram_1rw0r0w_64_512_sky130 instances:   1                                                                                                                                                           
  Class: sky130_fd_sc_hd__or3_2 instances:   1                                                        
  Class: sky130_fd_sc_hd__and2b_2 instances:   1                                                      
  Class: sky130_fd_sc_hd__conb_1 instances: 139                                                       
  Class: sram_1rw0r0w_32_1024_sky130 instances:   1                                                   
  Class: sky130_sram_2kbyte_1rw1r_32x512_8 instances:   1                                                                       
  Class: sky130_sram_1kbyte_1rw1r_8x1024_8 instances:   1                                                                       
  Class: sky130_fd_sc_hd__decap_3 instances: 6693                                                     
  Class: sky130_sram_8kbyte_1rw1r_32x2048_8 instances:   1                                                                      
  Class: sky130_fd_sc_hd__decap_4 instances: 4211                                                     
  Class: sky130_fd_sc_hd__decap_6 instances: 2770                                                     
  Class: sky130_fd_sc_hd__decap_8 instances: 95804                                                    
  Class: sky130_fd_sc_hd__dlygate4sd3_1 instances: 944                                                                                                                                                       
  Class: sram_1rw0r0w_32_512_sky130 instances:   1                                                    
  Class: sky130_fd_sc_hd__or2_2 instances:  15                                                        
  Class: sky130_fd_sc_hd__mux2_1 instances: 193                                                                                                                                                              
  Class: sky130_fd_sc_hd__and2_2 instances: 448                                                                                                                                                              
  Class: sky130_fd_sc_hd__or4b_2 instances:   1                                                                                                                                                              
  Class: sram_1rw0r0w_32_256_sky130 instances:   1                                                                                                                                                           
  Class: sky130_fd_sc_hd__o221a_2 instances: 112                                                      
  Class: sky130_fd_sc_hd__o21bai_2 instances:   2                                                     
  Class: sky130_fd_sc_hd__diode_2 instances: 6062                                                     
  Class: sky130_fd_sc_hd__a221o_2 instances:  32                                                      
  Class: sky130_fd_sc_hd__a22o_2 instances: 160                                                       
  Class: sky130_fd_sc_hd__tapvpwrvgnd_1 instances: 101690                                                                                                                                                    
  Class: sky130_fd_sc_hd__fill_1 instances: 2642                                                                                                                                                             
  Class: sky130_fd_sc_hd__fill_2 instances: 7151                                                      
  Class: sky130_fd_sc_hd__decap_12 instances: 399181                                                                            
Circuit contains 3845 nets, and 305 disconnected pins.                                                                          
Contents of circuit 2:  Circuit: 'user_project_wrapper'                                                                         
Circuit user_project_wrapper contains 629357 device instances.                                                                  
  Class: sky130_fd_sc_hd__clkbuf_16 instances: 122                                                    
  Class: sky130_fd_sc_hd__dfxtp_2 instances: 512                                                      
  Class: sky130_fd_sc_hd__or4_2 instances:  40                                                                                                                                                               
  Class: sky130_fd_sc_hd__buf_1 instances: 287                                                                                                                                                               
  Class: sky130_fd_sc_hd__buf_2 instances:   1                                                        
  Class: sky130_sram_4kbyte_1rw1r_32x1024_8 instances:   1                                                                      
  Class: sky130_sram_1kbyte_1rw1r_32x256_8 instances:   1                                                                                                                                                    
  Class: sky130_fd_sc_hd__inv_2 instances:  14                                                        
  Class: sky130_fd_sc_hd__clkbuf_1 instances: 120                                                     
  Class: sram_1rw0r0w_64_512_sky130 instances:   1                                                                                                                                                           
  Class: sky130_fd_sc_hd__or3_2 instances:   1                                                        
  Class: sky130_fd_sc_hd__conb_1 instances: 139                                                       
  Class: sky130_fd_sc_hd__and2b_2 instances:   1                                                      
  Class: sram_1rw0r0w_32_1024_sky130 instances:   1                                                   
  Class: sky130_sram_2kbyte_1rw1r_32x512_8 instances:   1                                                                       
  Class: sky130_sram_1kbyte_1rw1r_8x1024_8 instances:   1                                                                       
  Class: sky130_fd_sc_hd__decap_3 instances: 6693                                                     
  Class: sky130_fd_sc_hd__decap_4 instances: 4211                                                     
  Class: sky130_sram_8kbyte_1rw1r_32x2048_8 instances:   1                                                                      
  Class: sky130_fd_sc_hd__decap_6 instances: 2770                                                     
  Class: sky130_fd_sc_hd__decap_8 instances: 95804              
  Class: sky130_fd_sc_hd__dlygate4sd3_1 instances: 944                                                                          
  Class: sram_1rw0r0w_32_512_sky130 instances:   1              
  Class: sky130_fd_sc_hd__or2_2 instances:  15                  
  Class: sky130_fd_sc_hd__mux2_1 instances: 193                 
  Class: sky130_fd_sc_hd__and2_2 instances: 448                 
  Class: sky130_fd_sc_hd__or4b_2 instances:   1                 
  Class: sram_1rw0r0w_32_256_sky130 instances:   1              
  Class: sky130_fd_sc_hd__o221a_2 instances: 112                
  Class: sky130_fd_sc_hd__o21bai_2 instances:   2               
  Class: sky130_fd_sc_hd__diode_2 instances: 6062               
  Class: sky130_fd_sc_hd__a221o_2 instances:  32                
  Class: sky130_fd_sc_hd__a22o_2 instances: 160                 
  Class: sky130_fd_sc_hd__tapvpwrvgnd_1 instances: 101690                                                                       
  Class: sky130_fd_sc_hd__fill_1 instances: 2642                
  Class: sky130_fd_sc_hd__fill_2 instances: 7151                
  Class: sky130_fd_sc_hd__decap_12 instances: 399181                                                                            
Circuit contains 3714 nets, and 369 disconnected pins.
The two circuits have a different number of nets, but I'm waiting to see what netgen says about it and it is taking forever. From visual inspection, the significant parts of the netlist are all correct.
t
It's possible that the parallelizing (which would happen on fill, decap, and tap cells) is as slow as the flattening, since it's essentially the same kind of linked-list initialization.
m
Ah, should I disable that since these won't be parallel...
If it is something like that, I'm really surprised nobody else has complained.
t
I could look at the example, or if you want to check what it's doing, my usual way is not to profile it but just to run "gdb program <pid>" to hook gdb to it and then break a few times and see what the stack trace looks like.
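The "break a few times and look at the stack" approach can be scripted. The following is a minimal sketch, assuming gdb is installed where netgen runs and that you are allowed to attach to the process; the helper names `gdb_backtrace_cmd` and `sample_stacks` are made up for illustration. It uses gdb's `-p`, `-batch`, and `-ex` options to attach, print one backtrace, and detach.

```python
from __future__ import annotations

import subprocess
import time


def gdb_backtrace_cmd(pid: int) -> list[str]:
    """Build a gdb command that attaches to pid, prints one backtrace, then exits."""
    return ["gdb", "-p", str(pid), "-batch", "-ex", "bt"]


def sample_stacks(pid: int, samples: int = 5, interval_s: float = 2.0) -> list[str]:
    """Collect several backtraces from a running process, a few seconds apart.

    A function that shows up in most of the samples is probably where the
    time is going.
    """
    traces = []
    for _ in range(samples):
        result = subprocess.run(gdb_backtrace_cmd(pid), capture_output=True, text=True)
        traces.append(result.stdout)
        time.sleep(interval_s)
    return traces


# Usage (hypothetical pid of the running netgen process):
#   for trace in sample_stacks(12345):
#       print(trace)
```

Each sample pauses the process only briefly, so this is usually safe to do on a long-running job.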
m
Yeah. gdb in the docker though
Or can you connect outside? hrm
t
I'm not sure if this LVS is run through OpenLane? But there's an environment variable in the netgen setup file that causes fill, decap, and tap to be ignored.
m
Yeah, it is through OpenLane
t
There's also something wrong here. Your net count differs by over 100 between the netlists.
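For reference, the two summaries above can be compared mechanically rather than by eye. This is a rough sketch that assumes the output format is exactly what is pasted in this log ("Class: ... instances: N" and "Circuit contains N nets, and M disconnected pins."); the function names are made up, and the sample text is a shortened copy of the output above.

```python
import re

CLASS_RE = re.compile(r"Class:\s+(\S+)\s+instances:\s+(\d+)")
NETS_RE = re.compile(r"Circuit contains (\d+) nets, and (\d+) disconnected pins")


def parse_summary(text):
    """Return (class counts, net count, disconnected pin count) for one circuit."""
    classes = {m.group(1): int(m.group(2)) for m in CLASS_RE.finditer(text)}
    m = NETS_RE.search(text)
    nets, pins = (int(m.group(1)), int(m.group(2))) if m else (None, None)
    return classes, nets, pins


def compare(text1, text2):
    """Report per-class count mismatches and net/pin deltas (circuit 1 minus circuit 2)."""
    c1, nets1, pins1 = parse_summary(text1)
    c2, nets2, pins2 = parse_summary(text2)
    class_diffs = {
        name: (c1.get(name, 0), c2.get(name, 0))
        for name in sorted(set(c1) | set(c2))
        if c1.get(name, 0) != c2.get(name, 0)
    }
    return {
        "class_diffs": class_diffs,
        "net_delta": nets1 - nets2,
        "pin_delta": pins1 - pins2,
    }


# Shortened excerpts of the two summaries from the log above.
circuit1 = """\
  Class: sky130_fd_sc_hd__conb_1 instances: 139
  Class: sky130_fd_sc_hd__decap_12 instances: 399181
Circuit contains 3845 nets, and 305 disconnected pins.
"""
circuit2 = """\
  Class: sky130_fd_sc_hd__conb_1 instances: 139
  Class: sky130_fd_sc_hd__decap_12 instances: 399181
Circuit contains 3714 nets, and 369 disconnected pins.
"""

report = compare(circuit1, circuit2)
print(report)
```

On these excerpts the class counts agree, while circuit 1 has 131 more nets and 64 fewer disconnected pins than circuit 2.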
m
I know. I can't get there to debug it 🙂
Visually, nothing is wrong when looking at it
(This has been running all day)
I'm wondering if there's an issue with so many macros that have large numbers of pins
such as in the signature generation algorithm for the nets/pins
t
Even if it is slow, it should eventually finish and then you can look at the output; I can't think of any other way to figure out what's going on if you can't access it directly with gdb. I'm not aware of any issue with pins, but again, dealing with any pin issues is more linked-list manipulation.
m
It passes pretty quickly if I skip the fill, decap and tap 😞
Sorry, I skipped the fill/tap and disabled parallel on the decap
So, if I disable the parallel stuff, it completes pretty quickly and outputs that it has the same number of nets:
Circuit 1 contains 629357 devices, Circuit 2 contains 629357 devices.                                                                                                     
Circuit 1 contains 3714 nets,    Circuit 2 contains 3714 nets.
LVS clean. What does it do between this and the previous output I posted?
netgen issue. I use 1.5.166 on my computer, openlane has 1.5.167 which is broken.
commit 402e1f0f254c6c0cf36c90ff5d6a09134b9196ba
Author: Tim Edwards <tim@opencircuitdesign.com>
Date:   Tue Feb 16 17:12:00 2021 -0500
    Found a chokepoint in FlattenInstancesOf that was unnecessary as it was running through the entire object linked list to find the predecessor of a record that it had already found. Solved by simply keeping track of the predecessor record.
That actually slowed it down considerably. Not sure if I should roll back in the openlane docker or upgrade
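To make the idea in that commit message concrete, here is a toy sketch of removing records from a singly linked list in two ways: rescanning from the head to find each predecessor, versus keeping track of the predecessor while walking. This is not netgen's C code, and it does not explain why 1.5.167 ended up slower in practice; `Node`, `remove_with_rescan`, and `remove_with_tracking` are illustrative names only.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.next = None


def build_list(names):
    """Build a singly linked list from a Python list of names."""
    head = None
    for name in reversed(names):
        node = Node(name)
        node.next = head
        head = node
    return head


def names_of(head):
    out = []
    while head is not None:
        out.append(head.name)
        head = head.next
    return out


def remove_with_rescan(head, should_remove):
    """Unlink matching nodes, rescanning from the head to find each predecessor."""
    steps = 0
    node = head
    while node is not None:
        nxt = node.next
        if should_remove(node):
            if node is head:
                head = nxt
            else:
                prev = head
                while prev.next is not node:  # walk from the head every time
                    prev = prev.next
                    steps += 1
                prev.next = nxt
        node = nxt
        steps += 1
    return head, steps


def remove_with_tracking(head, should_remove):
    """Unlink matching nodes, remembering the predecessor during the walk."""
    steps = 0
    prev = None
    node = head
    while node is not None:
        nxt = node.next
        if should_remove(node):
            if prev is None:
                head = nxt
            else:
                prev.next = nxt
        else:
            prev = node  # only surviving nodes can be a predecessor
        node = nxt
        steps += 1
    return head, steps


def is_decap(node):
    return node.name.startswith("decap")


names = ["decap" if i % 2 else "buf" for i in range(1000)]
slow_head, slow_steps = remove_with_rescan(build_list(names), is_decap)
fast_head, fast_steps = remove_with_tracking(build_list(names), is_decap)
print("rescan steps:", slow_steps, "tracking steps:", fast_steps)
```

Both versions leave the same list behind; the rescanning version does work that grows roughly quadratically with list length, which is why this kind of loop hurts on a design with hundreds of thousands of fill and decap instances.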
t
Neither of those is good. I did destabilize netgen for a few versions, unfortunately. But the current version appears to be working well. That's version 1.5.194 (on opencircuitdesign.com; github is one revision behind).
m
This kind of defeats the purpose of docker :/