Matt Liberty:
fwiw ORFS defaults to the number of cpus from nproc
Tim Edwards:
I think that explains the sudden but very short bursts of all cores being used? Helpful, but it doesn't do much for the overall flow. It would be nice if more tools picked up the number of cores from nproc. Openlane has an environment variable for ROUTING_CORES, but doesn't set it in any examples (case in point: the Caravel user project example (!)), even though it makes a big difference. In the Openlane flow, magic's DRC is one of the lengthier items and there are certainly ways to parallelize that, although the better solution is probably just to migrate to klayout. Has anyone profiled the average runtime of the various tasks during synthesis/place/route and determined where the key chokepoints are?
Matt Liberty:
I think ROUTING_CORES should default to nproc's value rather than a fixed 2
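In Python terms, the suggested default is roughly the sketch below; default_routing_cores is just an illustrative name, not an Openlane API, and the fallback behaviour is an assumption about how a flow script might pick the value.

```python
import os

def default_routing_cores() -> int:
    """Illustrative default for ROUTING_CORES (not Openlane's actual code):
    honor an explicit setting if one is given, otherwise fall back to the
    machine's CPU count (what nproc reports), and only use 2 as a last
    resort when the count cannot be detected."""
    explicit = os.environ.get("ROUTING_CORES")
    if explicit:
        return int(explicit)
    return os.cpu_count() or 2

print(default_routing_cores())
```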
Anton Blanchard:
@Tim Edwards Here's where most of the time is spent when running Microwatt through Openlane. Times are in seconds:
- drc - magic 2580.36
- detailed_routing - openroad 1028.05
- spice extraction - magic 557.69
- gdsii - magic 508.67
- synthesis - yosys 310.38
- global placement - openroad 218.56
- lvs - netgen 144.44
- sta - openroad 70.86
- sta - openroad 61.7
- sta - openroad 58.9
- resizer timing optimizations - openroad 54.07
- cts 50.47
Openlane writes the runtime of each step into runtime.yaml.
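For anyone repeating this on their own runs, a rough sketch of pulling a sorted breakdown out of runtime.yaml; the flat name-to-seconds schema assumed here may not match every Openlane version, so adjust the parsing to whatever your run directory actually contains.

```python
import sys
import yaml  # PyYAML

def report(path: str) -> None:
    """Print flow steps sorted by runtime, longest first.

    Assumes runtime.yaml is a flat mapping of step name -> seconds;
    adapt the loading if your Openlane version writes a different layout.
    """
    with open(path) as f:
        steps = yaml.safe_load(f)
    for name, seconds in sorted(steps.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name:40s} {float(seconds):8.2f}")

if __name__ == "__main__":
    report(sys.argv[1] if len(sys.argv) > 1 else "runtime.yaml")
```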
Tim Edwards:
It definitely shows that magic suffers from not supporting multiprocessing.
Matt Liberty:
@Anton Blanchard how many cpus were used for detailed routing?
Anton Blanchard:
@Matt Liberty 12 core, 24 thread (AMD)
This prompted me to stare at some profiles of detailed routing. There are definitely improvements we can make. Here's one idea which improves detailed routing performance by 8.5% on one of my Microwatt tests: https://github.com/The-OpenROAD-Project/OpenROAD/pull/2230