# openroad

Matthew Guthaus

11/09/2021, 7:41 PM
How can I debug a segfault with re-placement? The flow fails with:
during executing: "openroad -exit /openlane/scripts/openroad/or_replace.tcl |& tee >&@stdout /project/openlane/user_project_wrapper/runs/user_project_wrapper/logs/placement/16-replace.log"
Last 10 lines:
child killed: segmentation violation

Matt Liberty

11/09/2021, 7:42 PM
is it an out of memory situation in docker? child killed sounds like an external agent
large designs can trigger that

Matthew Guthaus

11/09/2021, 7:43 PM
possibly.

Matt Liberty

11/09/2021, 7:43 PM
OL has a DOCKER_MEMORY variable to allow a user defined amount
👍 1

Matthew Guthaus

11/09/2021, 7:46 PM
Hm, seems like it defaults to 64G

Matt Liberty

11/09/2021, 7:48 PM
a big enough design could exceed the limit. What is the tail of 16-replace.log

Matthew Guthaus

11/09/2021, 7:48 PM
This is 10 macros plus a few hundred gates

Matt Liberty

11/09/2021, 7:50 PM
so probably not memory then.
log tail?

Matthew Guthaus

11/09/2021, 7:50 PM
There's a bunch of infos that have always existed:
[INFO GRT-0209] Ignoring an obstruction on layer met5 outside the die area.
[INFO GRT-0209] Ignoring an obstruction on layer met5 outside the die area.
[INFO GRT-0209] Ignoring an obstruction on layer met5 outside the die area.
[INFO GRT-0209] Ignoring an obstruction on layer met5 outside the die area.
[INFO GRT-0209] Ignoring an obstruction on layer met5 outside the die area.
Lots of warnings about pins outside the area (from the caravel harness)

Matt Liberty

11/09/2021, 7:53 PM
It could be in the global router (which placement calls) based on this. Can you try without PL_ROUTABILITY_DRIVEN ? That would narrow it down
Either way I think we'll need a testcase to debug if possible.

Matthew Guthaus

11/09/2021, 7:54 PM
Yeah. I'm re-running my MPW2 design with the new timing updates, so I'll debug a bit more and provide something.
Same error when PL_ROUTABILITY_DRIVEN is set to false

Matt Liberty

11/09/2021, 7:57 PM
what is the log tail then?

Matthew Guthaus

11/09/2021, 7:58 PM
Nothing seems to have changed...

Matt Liberty

11/09/2021, 7:58 PM
That doesn't make sense. You should get no GRT messages in that case

Matthew Guthaus

11/09/2021, 7:59 PM
Just this in my config, right? set ::env(PL_ROUTABILITY_DRIVEN) 0
Oh the default is false anyways

Matt Liberty

11/09/2021, 8:02 PM
or_replace.tcl is a mess. I think it doesn't turn off correctly.
Can you provide a test case for this and open an issue?

Matthew Guthaus

11/10/2021, 12:25 AM
Unfortunately, it seems to not fail any longer and gets to stage 29-opendp but then also segfaults
m

Matt Liberty

11/10/2021, 12:25 AM
what changed?

Matthew Guthaus

11/10/2021, 12:26 AM
I modified some verilog tests. That is all
Completely unrelated

Matt Liberty

11/10/2021, 12:26 AM
that sounds suspicious.... what is the opendp failure?

Matthew Guthaus

11/10/2021, 12:26 AM
I also ran make clean?
The opendp failure is related to escaping some signal names. My clock is io_in[17]. I need to escape the [] in config.tcl but not in base.sdc or else STA won't recognize the clock. However, or_opendp.tcl complains if I don't have it escaped:
invalid command name "17"
    while executing
"17"
    ("uplevel" body line 1)
    invoked from within
"uplevel #0 ${cmd}"
    (procedure "set_log" line 3)
    invoked from within
"set_log ::env($index) $escaped_env_var $::env(GLB_CFG_FILE) 1"
    (procedure "save_state" line 9)
    invoked from within
"save_state"
    (procedure "flow_fail" line 6)
    invoked from within
"flow_fail"
    (procedure "try_catch" line 25)
    invoked from within
"try_catch $::env(OPENROAD_BIN) -exit $::env(SCRIPTS_DIR)/openroad/or_opendp.tcl |& tee $::env(TERMINAL_OUTPUT) [index_file $::env(opendp_log_file_tag)..."
This actually could be what changed for global routing too...

Matt Liberty

11/10/2021, 12:32 AM
opendb doesn't do anything with timing.
opendp rather

Matthew Guthaus

11/10/2021, 12:32 AM
Why did it read my SDC?

Matt Liberty

11/10/2021, 12:34 AM
what is in the opendp log?
I don't see a read_sdc in or_opendp.tcl

Matthew Guthaus

11/10/2021, 12:35 AM
Just a segfault with no real context

Matt Liberty

11/10/2021, 12:35 AM
where is invalid command name "17" coming from?

Matthew Guthaus

11/10/2021, 12:35 AM
That is the text in my signal name: io_in[17]

Matt Liberty

11/10/2021, 12:36 AM
I understand but who is trying to execute that name?

Matthew Guthaus

11/10/2021, 12:37 AM
or_opendp.tcl... Rerunning to get the remainder of the stack trace I didn't paste.

Matt Liberty

11/10/2021, 12:38 AM
Do you see anything in that script that would access the name of your clock? I'm looking at the version in master and I don't see anything

Matthew Guthaus

11/10/2021, 12:38 AM
invalid command name "17"
    while executing
"17"
    ("uplevel" body line 1)
    invoked from within
"uplevel #0 ${cmd}"
    (procedure "set_log" line 3)
    invoked from within
"set_log ::env($index) $escaped_env_var $::env(GLB_CFG_FILE) 1"
    (procedure "save_state" line 9)
    invoked from within
"save_state"
    (procedure "flow_fail" line 6)
    invoked from within
"flow_fail"
    (procedure "try_catch" line 25)
    invoked from within
"try_catch $::env(OPENROAD_BIN) -exit $::env(SCRIPTS_DIR)/openroad/or_opendp.tcl |& tee $::env(TERMINAL_OUTPUT) [index_file $::env(opendp_log_file_tag)..."
    (procedure "detailed_placement_or" line 6)
    invoked from within
"detailed_placement_or"
    (procedure "run_routing" line 32)
    invoked from within
"run_routing"
    (procedure "run_routing_step" line 10)
    invoked from within
"[lindex $step_exe 0] [lindex $step_exe 1] "
    (procedure "run_non_interactive_mode" line 43)
    invoked from within
"run_non_interactive_mode {*}$argv"
    invoked from within
"if { [info exists flags_map(-interactive)] || [info exists flags_map(-it)] } {
        puts_info "Running interactively"
        if { [info exists arg_values(-file)..."
    (file "/openlane/flow.tcl" line 356)
make[1]: *** [Makefile:43: user_project_wrapper] Error 1
make[1]: Leaving directory '/home/mrg/openram_testchip/openlane'
make: *** [Makefile:70: user_project_wrapper] Error 2
The log output is useless.
It's reporting overlaps then segfaults:
[WARNING DPL-0005] Overlap check failed (16972).
repeater448 overlaps ANTENNA_repeater448_A
repeater449 overlaps ANTENNA_repeater449_A
repeater451 overlaps ANTENNA_repeater451_A
[ERROR]: during executing: "openroad -exit /openlane/scripts/openroad/or_opendp.tcl |& tee >&@stdout /project/openlane/user_project_wrapper/runs/user_project_wrapper/logs/placement/29-opendp.log"
[ERROR]: Exit code: 1
[ERROR]: Last 10 lines:
child process exited abnormally
[ERROR]: Please check openroad log file
[ERROR]: Dumping to /project/openlane/user_project_wrapper/runs/user_project_wrapper/error.log

Matt Liberty

11/10/2021, 12:42 AM
I guess the 17 is a red herring. For the crash a test case is best as I can't guess from this what the problem is
you mentioned having macros - are there placement sites in the channels ?
I have seen a case recently where the channel was so narrow no instances could be placed there

Matthew Guthaus

11/10/2021, 12:44 AM
It's big. This successfully routed for MPW2
One dumb question. There is now a config.json in addition to config.tcl with duplicate information. Why are they both there?

Matt Liberty

11/10/2021, 12:45 AM
sorry but I guess I need to look at it
I am not much of an openlane expert, I mostly work on openroad. @User can you explain config.json vs config.tcl?

Matthew Guthaus

11/10/2021, 12:46 AM
Thanks for your help. I'll add a test case and/or debug a bit more

Matt Liberty

11/10/2021, 12:47 AM
np

donn

11/10/2021, 8:00 AM
The .json is there so users can be allowed to customize things on platforms where freely modifiable Tcl would constitute a security concern, for example the efabless platform and the OpenLane cloud runner. It’s just an alternative that you’re free to pick.

Matthew Guthaus

11/10/2021, 11:55 AM
@User what happens if both are there like in the example?

donn

11/10/2021, 11:57 AM
tcl's prioritized
Do note that I mean only Tcl will be loaded. JSON will be ignored entirely. If the tcl config's missing, it will attempt to load a json config. If the json config's missing as well, flow.tcl will throw an error.
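
A minimal sketch of the lookup order described above, assuming both files would sit directly in the design directory. `pick_config` is a hypothetical helper written for illustration; it is not OpenLane's actual code.

```tcl
# Hypothetical helper: returns the config file that would be loaded,
# following the order described in the message above.
proc pick_config {design_dir} {
    set tcl_cfg  [file join $design_dir config.tcl]
    set json_cfg [file join $design_dir config.json]

    # config.tcl wins outright; config.json is not consulted at all.
    if {[file exists $tcl_cfg]} {
        return $tcl_cfg
    }
    # Fall back to JSON only when the Tcl config is missing.
    if {[file exists $json_cfg]} {
        return $json_cfg
    }
    # Neither file exists: report an error.
    error "no config.tcl or config.json found in $design_dir"
}
```

With both files present, `pick_config` returns the `config.tcl` path, which matches the "only Tcl will be loaded" behaviour described.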

Matthew Guthaus

11/10/2021, 5:12 PM
@User yeah, thanks for the clarification.
After wrestling with a fresh install of openlane/pdk, I can reproduce this with or_opendp again:
invalid command name "17"
    while executing
"17"
    ("uplevel" body line 1)
    invoked from within
"uplevel #0 ${cmd}"
    (procedure "set_log" line 3)
    invoked from within
"set_log ::env($index) $escaped_env_var $::env(GLB_CFG_FILE) 1"
    (procedure "save_state" line 9)
    invoked from within
"save_state"
    (procedure "flow_fail" line 6)
    invoked from within
"flow_fail"
    (procedure "try_catch" line 25)
    invoked from within
"try_catch $::env(OPENROAD_BIN) -exit $::env(SCRIPTS_DIR)/openroad/or_opendp.tcl |& tee $::env(TERMINAL_OUTPUT) [index_file $::env(opendp_log_file_tag)..."
    (procedure "detailed_placement_or" line 6)
    invoked from within
"detailed_placement_or"
    (procedure "run_routing" line 32)
    invoked from within
"run_routing"
    (procedure "run_routing_step" line 10)
    invoked from within
"[lindex $step_exe 0] [lindex $step_exe 1] "
    (procedure "run_non_interactive_mode" line 43)
    invoked from within
"run_non_interactive_mode {*}$argv"
    invoked from within
"if { [info exists flags_map(-interactive)] || [info exists flags_map(-it)] } {
        puts_info "Running interactively"
        if { [info exists arg_values(-file)..."
    (file "/openlane/flow.tcl" line 356)
make[1]: *** [Makefile:43: user_project_wrapper] Error 1
make[1]: Leaving directory '/home/mrg/openram_testchip/openlane'
make: *** [Makefile:70: user_project_wrapper] Error 2
The "17" is in the name of my clock in my base.sdc or my config.tcl file:
set ::env(CLOCK_PORT) {io_in[17]}
If I don't use the base.sdc, it still does the above so it must be something with the config.tcl. I have the name escaped there:
set ::env(CLOCK_PORT) {io_in\[17\]}
If I look at the generated SDC files, however, the name is unescaped:
mrg@diode ~/openram_testchip/openlane/user_project_wrapper/runs/user_project_wrapper (main)$ find . -name \*.sdc -exec grep create_clock {} \; -print
create_clock -name io_in[17] -period 30.0000 [get_ports {io_in[17]}]
./results/cts/user_project_wrapper.cts.sdc
create_clock -name io_in[17] -period 30.0000 [get_ports {io_in[17]}]
./tmp/floorplan/4-verilog2def.sdc
create_clock -name io_in[17] -period 30.0000 [get_ports {io_in[17]}]
./tmp/placement/23-resizer_timing.sdc
create_clock -name io_in[17] -period 30.0000 [get_ports {io_in[17]}]
./tmp/placement/21-resizer_timing.sdc
create_clock -name io_in[17] -period 30.0000 [get_ports {io_in[17]}]
./tmp/placement/16-resizer.sdc
So there are two questions:
1. Why isn't write_sdc escaping the name properly?
2. Why is or_opendp using the SDC at all?
@User ^^

Matt Liberty

11/10/2021, 7:52 PM
@User

Matthew Guthaus

11/10/2021, 7:54 PM
@User This may actually be a red herring like @User mentioned before. This looks like it is part of the "save_state" function which is probably trying to write out the SDC (or config.tcl) after an error. opendp probably doesn't use the SDC (or clock at all) but the failure triggers this saving. I'm running it now without any clock defined to see if I can identify and solve the real error.
OH, so this failure is actually during routing when it is trying to legalize the diodes. This is why timing is enabled...
GAH, and it can't legalize the diodes because they are "sprayed" all over the macros and can't be moved outside of the macros.

Matt Liberty

11/10/2021, 11:23 PM
which DIODE_INSERTION_STRATEGY are you using?

Matthew Guthaus

11/10/2021, 11:25 PM
I was using spray because the others caused problems during the MPW2 tool flow
But spray won't work if you have macros now

Matt Liberty

11/10/2021, 11:25 PM
which one is spray?

Matthew Guthaus

11/10/2021, 11:26 PM
1
I'm uncertain how the others would work. If they put a diode on a macro it won't work

Matt Liberty

11/10/2021, 11:26 PM
1. "A diode is inserted for each PIN and connected to it."?

Matthew Guthaus

11/10/2021, 11:26 PM
"Spray diodes"
Specifies the insertion strategy of diodes to be used in the flow. 0 = No diode insertion, 1 = Spray diodes, 2 = insert fake diodes and replace them with real diodes if needed, 3 = use FastRoute Antenna Avoidance flow, 4 = Use Sylvian's Custom Script for diode insertion on design pins and smartly inserting needed diodes inside the design, 5 = a mix of strategy 2 and 4. (Default: 3)

Matthew Guthaus

11/10/2021, 11:26 PM
3 used to not work for some reason

Matt Liberty

11/10/2021, 11:27 PM
maybe you have an older version?
Yes, I had an older version during MPW2 🙂
Looks like those are in conflict with each other. Maybe spray puts them randomly and then tries to connect them to each pin?
I think 3 used to not work because they were using a different router before, if I recall?
TritonRoute vs FastRoute?

Matt Liberty

11/10/2021, 11:31 PM
I'm not sure of the history. @User would you clarify the doc discrepancy between the README and hardening_macros on DIODE_INSERTION_STRATEGY described above
from what I can see in routing.tcl it looks like 1 is closer to "A diode is inserted for each PIN and connected to it." versus spraying but I guess that doesn't match your experience
it looks like it is trying to put a diode on each pin directly and then let detailed placement legalize it
that's potentially a lot of diodes so it might not be possible to legalize them in a dense enough design area
however I would expect that would lead to diodes on the pins not deep inside the macros

Matthew Guthaus

11/10/2021, 11:59 PM
I see. I removed the diodes entirely and it seems to still have issues legalizing clock buffers (and filler?). I need to figure out that red herring save_state bug too though. I'm finally digging more into the openlane/openroad flow so I have a better understanding of things under the hood now.

Matt Liberty

11/10/2021, 11:59 PM
there should be no fillers when you are running detailed placement. They should happen after otherwise there will be no empty sites

Matthew Guthaus

11/11/2021, 12:00 AM
I get lots of:
repeater432 overlaps FILLER_2_3245
 repeater433 overlaps FILLER_406_3269
 repeater435 overlaps FILLER_631_2825
 repeater437 overlaps FILLER_2_4701
before it inelegantly gives up
This is all after routing

Matt Liberty

11/11/2021, 12:00 AM
if you are inserting diodes after routing then you should delay filler insertion to after that
the design will be 100% full after filler insertion and nothing else will fit

Matthew Guthaus

11/11/2021, 12:01 AM
That might be an openlane issue. This is the relevant stack:
invoked from within
"detailed_placement_or"
    (procedure "run_routing" line 32)
    invoked from within
"run_routing"
    (procedure "run_routing_step" line 10)
There are calls to ins_fill_cells after ins_diode_cells but before detailed_placement_or
So it may take up the space and not be able to legalize
Yeah, it runs ins_fill_cells before legalization of the diodes. That is the problem.
@User ^^
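
A toy Tcl model of the ordering problem discussed above. It treats a row as a plain count of free placement sites; `insert_fillers` and `legalize_diodes` are made-up stand-ins for illustration and do not call or reproduce the real `ins_fill_cells` or `detailed_placement_or` procedures.

```tcl
# Toy model only: a row is just a number of free placement sites.

# Stand-in for diode legalization: each diode needs one free site.
proc legalize_diodes {freeVar n} {
    upvar 1 $freeVar free
    if {$n > $free} {
        error "cannot legalize $n diodes: only $free free sites"
    }
    incr free -$n
}

# Stand-in for filler insertion: fillers occupy every remaining site.
proc insert_fillers {freeVar} {
    upvar 1 $freeVar free
    set free 0
}

# Problematic order: fill first, then try to legalize the diodes.
set free 10
insert_fillers free
set bad [catch {legalize_diodes free 3} msg]
puts "fill-then-legalize failed: $bad ($msg)"

# Suggested order: legalize the diodes, then fill what is left.
set free 10
legalize_diodes free 3
insert_fillers free
puts "legalize-then-fill left $free free sites"
```

In this model the first order fails because no sites remain after filling, while the second succeeds; the real flow is of course more involved than a site count.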

Matt Liberty

11/11/2021, 1:07 AM
that sounds worth an issue if you don't hear from @User

Mitch Bailey

11/12/2021, 4:46 PM
@User @User Maybe this has been resolved, but I was able to reproduce the
invalid command name "17"
    while executing
"17"
error and have a workaround. It occurs when there is an existing `<design>/runs/<tag>/config.tcl` file. Deleting this file works for me. When this file is created, the clock port is defined as below, but it looks like the routine that reads this and rewrites it can't handle `]` or `[`.
set ::env(CLOCK_PORT) "io_in\[17\]"
I believe the permanent solution is to patch the `save_state` routine in `scripts/tcl_commands/all.tcl` with the following:
set escaped_env_var [string map {\[ \\\[} $escaped_env_var]
set escaped_env_var [string map {\] \\\]} $escaped_env_var]
I'll submit a PR once I test it.
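
To see why unescaped brackets can produce `invalid command name "17"`, here is a small standalone Tcl sketch. It assumes the failing code builds a command string from the variable's value and evaluates it with `uplevel #0`, as the stack trace suggests. The names `value` and `::demo_port` are made up for the demonstration, and this is not the actual `set_log` or `save_state` code.

```tcl
# Hypothetical stand-in for the value held in ::env(CLOCK_PORT).
set value {io_in[17]}

# Interpolating the raw value into a command string leaves "[17]" in the
# script text, so evaluating it triggers command substitution on "17".
set cmd "set ::demo_port $value"
if {[catch {uplevel #0 $cmd} err]} {
    puts "unescaped: $err"
}

# Escaping both brackets first (the same idea as the string map patch
# above, done in one call) makes the evaluated script treat them literally.
set escaped [string map {\[ \\\[ \] \\\]} $value]
set cmd "set ::demo_port $escaped"
uplevel #0 $cmd
puts "escaped:   $::demo_port"
```

The first evaluation fails with the same `invalid command name "17"` message as in the trace; the second stores the literal name `io_in[17]`.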

Matthew Guthaus

11/12/2021, 4:57 PM
Hi @User I had gotten to that point as well and even had that same fix but I don't see it resolving the issue. Sometimes I feel like the scripts are cached somewhere and don't seem to update when I run though...

Mitch Bailey

11/12/2021, 6:28 PM
@User I found another place, `proc prep` in `all.tcl`, that looks like it's trying to write out the `config.tcl` file. However, the same type of fix doesn't work as expected. I'll dig deeper tomorrow. Incidentally, I've noticed that the `config.tcl` file has a lot of duplicate entries and sometimes the values don't match.
👍 1

Matthew Guthaus

11/12/2021, 7:06 PM
@User Our day is starting so I'll let you know what I find today.