I have a weird spice problem, where changing the t...
# analog-design
m
I have a weird spice problem, where changing the transient simulation from 40us to 400us completely changes the output waveform
and the tt_um_mattvenn_r2r_dac.sim.spice file is here if you want to try
with the transient set to 40us (or smaller) I see the digital block start up and the R2R dac converts the output
message has been deleted
If I want to see more of the signal, I change the transient simulation time to 400us and get this
message has been deleted
has anyone else had a situation where changing transient time changes the start of the simulation like this?
any ideas what is going wrong?
l
The only times I have experienced something similar to what you are seeing were when I had a typo somewhere in my simulation. Typically I would have an error in my PWL voltage source statement.
m
the digital block is just a binary counter, so the 1st output pin should just be toggling with the clock
but check this out
message has been deleted
at 17.5us, it just spikes up, doesn't latch the data
t
@Matt Venn: Your "full_spice_sim" netlist should not have .lib on the first line. The first line is by definition the title line and is ignored in SPICE. ngspice may or may not try to handle that intelligently. Not that that has any impact on your simulation, but it's one thing I noticed immediately (still working on duplicating your results).
m
updated spice
I have fixed a few other things
you can rebuild this sim spice with 'make tt_um_mattvenn_r2r_dac.sim.spice'
also please pull the repo if you are trying to replicate
t
Okay, let me restart by cloning the repository, then.
🙌 1
m
I made a mistake by trying to iterate faster and extracting fewer parasitics, but that resulted in the floating cap issue I mentioned in another thread
t
@Matt Venn: Where would I run make tt_um_mattvenn_r2r_dac.sim.spice? There isn't a Makefile in the sim/ directory, and the top level Makefile has no such recipe.
I see it in mag/.
m
yes in mag
t
Confirmed that it works at 20us transient. It appears to be working the same for a 400us transient (in progress), based on the periodic slowing and speeding of the time steps. Can you post the exact top level netlist that produces the bad behavior?
m
can you try this on the 20us
plot "xtt.r2r_dac_control_0.r2r_out[0]"
see if you can reproduce my flop glitch
as for the exact top level netlist, it's on my work pc
let me try the 400us here
on my most recent extracted netlist
t
I do get this message at the beginning of the simulation:
Warning: Optran step size potentially too large.
which may be relevant.
plot "xtt.r2r_dac_control_0.r2r_out[0]"
produces a normal looking output, no glitching.
@Matt Venn: Other things that you are doing wrong (which should have nothing to do with glitching output): 1. Enabling monte carlo mode with a typical corner (probably just means that it's not doing monte carlo) 2. Enabling mismatch analysis by setting the flag instead of setting the corner type (minor nit) 3. Using the old discrete models (ngspice/sky130.lib.spice) instead of the combined models (combined/sky130.lib.spice).
Also watch how much you save to the output. The 400us transient, as written, dumps a 3.6GB raw file.
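One way to keep the raw file small is to save only the vectors that will actually be plotted. A minimal sketch, assuming hypothetical node names `clk` and `out` and an assumed step size (the real netlist's names and values will differ):

```spice
* Illustrative only: node names and the step size are hypothetical.
* With no .save line, ngspice writes its default set of vectors
* (all node voltages and voltage source currents) to the raw file.
.save v(clk) v(out)
.tran 10n 400u
```

Signals that are not listed cannot be plotted afterwards, so the list needs to include everything required for debugging.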
@Matt Venn: Works fine for me at 400us transient.
m
Ok good. Did you have to change anything?
Can you send the extracted netlist and your ngspice version?
👀 1
t
Literally the only thing I changed was the transient time. Okay, technically untrue. I added a title line at the top and changed the path to the PDK in the .lib line.
🙌 1
ngspice version 43+.
m
message has been deleted
😂
t
I like my version better.
m
not the step time?
I changed the step time and that seemed to help
t
What version of ngspice are you using?
m
42
t
I suggest you update to current version 43+. Apparently something in it really does make a difference.
@Ahmed Reda has similarly seen simulation differences between version 42 and version 43+.
t
ngspice 43 enables the KLU solver by default, maybe it's not enabled in the 42 build Matt's using and that might be one difference.
m
I'm trying a simpler example to iterate faster and work out what the issue is, but now I'm unable to get anything out of my simple example
message has been deleted
it's like the sub circuit and all its outputs are being optimised out for some reason and I can't find it
circuit was extracted for lvs, so it was black boxed 🙂
message has been deleted
t
@tnt: Matt's example is enabling the KLU solver in the .spiceinit file, so as long as ngspice supports KLU, it will be used.
m
now I can't reproduce 😞
OK, I think I have it
message has been deleted
top is ngspice42 and bottom is ngspice43, but crucially, the spiceinit file is
set ngbehavior=hsa     ; set compatibility for reading PDK libs                                                              
set skywaterpdk        ; skip some checks for faster lib loading                                                             
set ng_nomodcheck      ; don't check the model parameters                                                                    
set num_threads=8      ; CPU processor cores available
I'll enable klu solver for ngspice42 and see if that makes the difference
ngspice43 appears to default klu
enabling klu for ngspice42 without 'option noinit' and 'optran' results in a failed simulation
nope, ngspice42 fails with klu too
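For reference, a `.spiceinit` with KLU turned on might look like the sketch below. The first four lines are the file quoted above; the last three are not from Matt's file. They follow the settings commonly circulated for ngspice with the SkyWater PDK, and the `optran` arguments in particular are an assumption that should be checked against the ngspice manual for the installed version.

```spice
set ngbehavior=hsa     ; set compatibility for reading PDK libs
set skywaterpdk        ; skip some checks for faster lib loading
set ng_nomodcheck      ; don't check the model parameters
set num_threads=8      ; CPU processor cores available
option noinit          ; don't print operating point data
option klu             ; select KLU as the matrix solver
optran 0 0 0 100p 2n 0 ; use a transient operating point instead of the dc one
```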
message has been deleted
so now my guess is that the montecarlo switch is introducing some randomness
t
The "PR" switch may or may not produce randomness depending on whether equations involving the switch exist at all in the parameters used by the corner models, or if those parameters only exist in the monte carlo models, which are different. The "MM" switch definitely produces random behavior and is equivalent to selecting, e.g., corner "tt_mm" instead of corner "tt". The switches are not meant to be set by the end user. They are meant to be set as part of the simulation corner definition.
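As a sketch of what setting mismatch "as part of the simulation corner definition" looks like in practice: the corner is chosen in the `.lib` line rather than by toggling switches by hand. The library path is a placeholder, not the path used in the repository; `tt` and `tt_mm` are the corner names mentioned above.

```spice
* Placeholder path: substitute the location of your own PDK install.
* Typical corner, no mismatch:
.lib "/path/to/pdk/combined/sky130.lib.spice" tt
* Typical corner with mismatch (use instead of the line above, not together):
*.lib "/path/to/pdk/combined/sky130.lib.spice" tt_mm
```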
m
yes so with mm switch, I see the problematic waveforms, without it not
message has been deleted
so my guess atm is that I'm right on the edge of too coarse a timestep in the simulation, and the randomness makes it fail in these strange ways
I'll investigate that more later
yeah, reducing the timestep by 10x fixes the issue
t
My conclusion from experience is that ngspice needs work on the implementation of adaptive time stepping. It appears to be very bad at this compared to commercial simulators, at least in certain circumstances. It isn't a show-stopper for simulation, but it forces one to keep reducing the maximum timestep and therefore makes simulations take way more time and memory than they ought to. With a carefully-controlled experiment like this, I think we have enough information to pass it along as a bug report to @holger vogt and @Dietmar Warning.
m
It's definitely a trap for the unwary!
e
@Matt Venn, @Tim Edwards. In SPICE the duration of the transient sim implicitly affects the max timestep in a transient sim, unless you specify a max step. This is a snapshot from p. 341 of The Designer's Guide to SPICE and Spectre by Ken Kundert. This may be why your changing of the transient duration gave different results. Specifying a max timestep and/or tightening reltol, abstol, vntol will also change the simulator's decisions about timestepping. (FWIW, my observation in running NGSPICE sims is that default rel/abs/vntol are definitely too loose to get reliable results.)
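To take the simulation length out of the picture, the maximum step can be given explicitly as the fourth `.tran` argument. Below is a minimal sketch on a stand-in RC circuit (not Matt's netlist; every value is an assumption chosen for illustration). As far as the ngspice manual describes it, when `tmax` is omitted ngspice uses the smaller of `tstep` and `(tstop - tstart)/50`, so in ngspice the duration should only change the limit when `tstep` is larger than that fraction.

```spice
* Explicit tmax demo on a stand-in RC circuit (all values illustrative)
V1 in 0 PULSE(0 1.8 1u 1n 1n 5u 10u)
R1 in out 1k
C1 out 0 100p
* .tran <tstep> <tstop> <tstart> <tmax>
.tran 100n 400u 0 50n
.end
```

With `tmax` fixed at 50n, changing `tstop` from 40u to 400u leaves the step ceiling unchanged, which makes runs of different lengths easier to compare.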
t
@Eric: Are there any "rules of thumb" about how to set reltol and abstol options to useful values?
e
@Tim Edwards Some rules of thumb for setting tolerances from Ken Kundert's book. See "dc analysis practice" p. 37 #14: Absolute tolerances (abstol and vntol) should be set around 1e-6 but no smaller than 1e-9 times the largest similar quantity present in the circuit. For example, in ICs with voltages from 5 to 15 V, vntol is set to 1 uV, and with currents around 1 uA, abstol is set to 1 pA. In the sections on dc and transient accuracy problems, he says "In general, if you would like your simulator to produce a more accurate solution, tighten reltol. Also make sure abstol and vntol are reasonable." (See p. 37 #14.) (For those familiar with Spectre, he explains that "conservative" scales reltol by 0.1x and "liberal" scales reltol by 10x from Spectre's default.) There is a good section in this book on how local truncation error in the computation of the solution interacts with time stepping. He generally recommends tightening reltol first before limiting the max timestep.
🙌 1
t
Absolute tolerances (abstol and vntol) should be set around 1e-6 but no smaller than 1e-9 times the largest similar quantity present in the circuit. For example, in ICs with voltages from 5 to 15 V, vntol is set to 1 uV, and with currents around 1 uA, abstol is set to 1 pA.
So for typical sky130 circuits with voltages up to 1.8 or 3.3V, the default vntol of 1uV should be proper (which may be why I've never seen anyone suggest altering it); for micro-power circuits, the default abstol of 1pA would also be proper, but a high-drive amplifier with a no-load current draw of 1mA would need abstol of 1nA. I think the majority of circuits I've been simulating would be fine with the defaults.
In general, if you would like your simulator to produce a more accurate solution, tighten reltol.
reltol default is 1e-3, so "tightening"/"conservative" means 1e-4 and "loosening"/"liberal" means 1e-2. But I usually find myself tweaking this not because I want more or less accuracy in the simulation, but because the thing is failing to converge and I'm trying to figure out how to get around that. In the cases we've seen recently, having a flip-flop fail to flip or flop seems a bit more problematic than a less-accurate solution. There are also other options like chgtol and trtol which I never mess with and don't know if I should.
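The rule of thumb above translates into an options line like the sketch below. The values are the ones discussed in this thread for a circuit with 1.8 to 3.3 V swings and a no-load current of about 1 mA; a micro-power circuit would keep the default abstol instead.

```spice
* Absolute tolerances at roughly 1e-6 times the largest voltage and current
* vntol=1u is the ngspice default; abstol=1n is loosened from the 1p default
.option vntol=1u abstol=1n
```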
h
I have been running Matt's example file, varying TSTEP from 1n to 500n, without any extra options given (ngspice-44.2 GUI on Windows 10). I absolutely do not get the error of false output pulses. I do see the small oscillations, which are trap ringing. They disappear with .options method=gear.
t
@holger vogt: Any reason to expect that there would be any difference in behavior between ngspice-43+ and ngspice-44.2?
m
@holger vogt can you reproduce with 42?
h
I don't see issues when TSTEP is 500n, TSTOP is 40u and looking at count[0]. However, setting TSTOP=400u, and looking at count[3], I see irregular pulses. Something to think about.
Setting option reltol to 1e-4 (10 x tighter than default) does help here to get rid of the false pulses. TSTEP tested from 10n to 500n, ngspice-44.2. I do not see any difference between ngspice-42 and ngspice-44.2.
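Put together, the two settings reported here would look like this in the netlist. The `.tran` values are taken from the upper end of the tested range; whether both options are needed for a given circuit is something to verify rather than assume.

```spice
* 10x tighter than the 1e-3 default; reported to remove the false pulses
.option reltol=1e-4
* Gear integration; reported to remove the trapezoidal ringing
.options method=gear
* TSTEP=500n, TSTOP=400u, the case that showed irregular pulses on count[3]
.tran 500n 400u
```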