#1201 NesterovSolve stuck, eats all available RAM, gets killed

Issue created by Syndace

Description

Hi. I'm in the awkward position of encountering a bug that I can only reproduce with IP that I'm not allowed to share. I'll try to describe everything in as much detail as possible in the hope that someone might have an idea even without being able to reproduce it locally.

First, for context: the research project I'm working on attempts to build a rather simple RISC-V core with a few extensions and some SRAM. It does not use the Skywater PDK, however, but a proprietary PDK that we have added to OpenLane locally. The general PDK configuration seems fine: the flow completes successfully on the project without the SRAM.

The part causing trouble is the SRAM. The SRAM cell was generated by the fab providing the PDK and delivered to us as lib/lef/gds files, behavioral models etc., as usual. Possibly important here is that the SRAM cell is not small: it takes up almost one square millimeter on the chip.

What happens is that as soon as at least one of those SRAM cells is instantiated by the design, the initial global placement step gets stuck at some iteration of NesterovSolve, slowly eats away all of the available memory, and then either gets killed by the OS or just stays stuck like that. When I say "all of the available memory" I really mean it: I added 500GB of swap out of curiosity/to be absolutely sure and let it run for a day; it ate the full 500GB plus the physical RAM I have.

To track down the problem I have created a minimal project which consists of nothing but one of those memory cells (the following is from the synthesis statistics report):
```
Number of cells:                  1
     Name_Of_Memory_Cell      1
```
and even this project gets stuck during the initial global placement (step 5 of the flow):
```
[NesterovSolve] Iter: 1 overflow: 0 HPWL: 550190766
```
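For anyone trying to pin down where the placer stops making progress, a small log parser can help. The following is a minimal sketch, assuming the log line format is exactly the one quoted above (other OpenROAD versions may print it differently); the names `NesterovIter`, `parse_nesterov` and `last_iteration` are made up for this illustration and are not part of OpenLane.

```python
import re
from typing import Iterable, Iterator, NamedTuple, Optional


class NesterovIter(NamedTuple):
    """One parsed [NesterovSolve] progress line (hypothetical helper type)."""
    iteration: int
    overflow: float
    hpwl: int


# Pattern modeled on the single log line quoted in this report.
# Assumption: other tool versions may format this line differently.
_LINE = re.compile(
    r"\[NesterovSolve\]\s+Iter:\s*(\d+)\s+"
    r"overflow:\s*([0-9.eE+-]+)\s+HPWL:\s*(\d+)"
)


def parse_nesterov(lines: Iterable[str]) -> Iterator[NesterovIter]:
    """Yield one record per NesterovSolve line; all other lines are skipped."""
    for line in lines:
        match = _LINE.search(line)
        if match:
            yield NesterovIter(
                int(match.group(1)), float(match.group(2)), int(match.group(3))
            )


def last_iteration(lines: Iterable[str]) -> Optional[NesterovIter]:
    """Return the last iteration that was logged, or None if there was none."""
    last = None
    for record in parse_nesterov(lines):
        last = record
    return last


# The two log lines quoted in this report, used as sample input.
sample = [
    "[NesterovSolve] Iter: 1 overflow: 0 HPWL: 550190766",
    "[INFO RSZ-0058] Using max wire length 2489um.",
]
print(last_iteration(sample))
```

Feeding the step's log file to `last_iteration` (for example via `open(path)`) shows the last iteration that was reported before the process stalled.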
When compiling the full project with RISC-V core and SRAM, I'm able to get past the initial global placement by setting `PL_BASIC_PLACEMENT`. When doing that, the flow gets stuck in the same fashion a few steps later, at step 10: "Running Placement Resizer Design Optimizations". The last log line I get in that case is:
```
[INFO RSZ-0058] Using max wire length 2489um.
```
I have tried to reproduce the issue with memory generated by OpenRAM and the Skywater PDK, but I wasn't able to. Notably, I wasn't able to generate a RAM cell of comparable size to the proprietary one, so there's still a chance that the size plays a role. The lib/lef/gds of the SRAM cell are included in the build using `EXTRA_LIBS`, `EXTRA_LEFS` and `EXTRA_GDS_FILES`.

For completeness: I've tried to build the project with the proprietary SRAM cell using the Skywater PDK, which doesn't make much sense since this mixes PDKs that have different layers, DRCs etc., but the flow got to the global placement anyway and remained stuck there in the same fashion.

I hope this information gives at least a hint of what the cause might be. It's something common to at least the initial global placement and the placement resizer design optimizations. I'll gladly try anything you can think of, run debug builds or anything like that.

Environment
```
Kernel: Linux v5.10.0-15-amd64
Distribution: debian 11
Python: v3.9.2 (OK)
Container Engine: docker v20.10.5+dfsg1 (OK)
OpenLane Git Version: 83b61459abcaef3b0ddcf022e1a67e71c88f6007
pip: INSTALLED
pip:venv: INSTALLED
---
PDK Version Verification Status: OK
---
Git Log (Last 3 Commits)

83b6145 2022-06-26T18:19:11+02:00 Fix #1163 (#1164) - Mohamed Gaber -  (HEAD -> master, tag: 2022.06.27_01.36.21, origin/master, origin/HEAD)
a633b1f 2022-06-23T16:53:24+02:00 Rewrite pin spacing algorithm (#1160) - Mohamed Gaber -  (tag: 2022.06.24_01.37.58)
ebad315 2022-06-21T19:56:21+02:00 Fix Antenna Checkers, Magic Script Enhancements (#1154) - Mohamed Gaber -  (tag: 2022.06.22_01.42.47)
```
Other sections don't apply.

Repository: The-OpenROAD-Project/OpenLane
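Since the suspicion above is that the macro's size plays a role, and the real LEF can't be shared, one thing that can be reported without disclosing the IP is the macro's `CLASS` and `SIZE` from the file passed via `EXTRA_LEFS`. Below is a rough sketch that extracts them, assuming the usual LEF syntax (`MACRO <name>`, `CLASS ... ;`, `SIZE <w> BY <h> ;`, dimensions in microns). The helper names and the sample dimensions are invented for illustration; they are not the real SRAM's values, and this does not claim to identify the cause of the hang.

```python
import re
from typing import Dict, NamedTuple, Optional


class MacroInfo(NamedTuple):
    """CLASS and SIZE of one LEF macro (hypothetical helper type)."""
    macro_class: Optional[str]
    width_um: float
    height_um: float

    @property
    def area_mm2(self) -> float:
        # LEF SIZE is given in microns; 1 mm^2 = 1e6 um^2.
        return self.width_um * self.height_um / 1e6


_MACRO = re.compile(r"^\s*MACRO\s+(\S+)", re.MULTILINE)
_CLASS = re.compile(r"^\s*CLASS\s+([^;]+?)\s*;", re.MULTILINE)
_SIZE = re.compile(r"^\s*SIZE\s+([\d.]+)\s+BY\s+([\d.]+)\s*;", re.MULTILINE)


def read_macros(lef_text: str) -> Dict[str, MacroInfo]:
    """Map each macro name to its CLASS and SIZE; macros without SIZE are skipped."""
    macros = {}
    starts = list(_MACRO.finditer(lef_text))
    for i, start in enumerate(starts):
        # A macro's body runs until the next MACRO statement or end of file.
        end = starts[i + 1].start() if i + 1 < len(starts) else len(lef_text)
        body = lef_text[start.end():end]
        size = _SIZE.search(body)
        if size is None:
            continue
        cls = _CLASS.search(body)
        macros[start.group(1)] = MacroInfo(
            cls.group(1) if cls else None,
            float(size.group(1)),
            float(size.group(2)),
        )
    return macros


# Made-up LEF fragment; the 1000 x 950 um size is an illustrative assumption.
SAMPLE_LEF = """
MACRO Name_Of_Memory_Cell
  CLASS BLOCK ;
  ORIGIN 0 0 ;
  SIZE 1000.0 BY 950.0 ;
END Name_Of_Memory_Cell
"""

for name, info in read_macros(SAMPLE_LEF).items():
    print(name, info.macro_class, f"{info.area_mm2:.2f} mm^2")
```

Comparing the reported area against the die area configured for the design is a quick sanity check that the macro can fit at all.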