So how does Xyce compare to ngspice? I know the former is a pain to build, but is it much faster?
t
Xyce is designed to multiprocess the computations, so it has the potential to be very fast on systems with lots of CPU cores. ngspice also has a setting for multiple cores. I have never done a head-to-head test. It would be a very useful experiment.
Xyce is not nearly as feature-rich as ngspice. It is missing anything like ngspice's control block and interactive command set. It has no xspice event simulator, although it has some components that are like xspice primitives. One thing it does have is a (documented!) method for co-simulation with iverilog.
s
Performance-wise, results are mixed (talking about transient sims, Ngspice V41+ vs. a serial build of Xyce 7.7+) on a Core i7 Devuan Linux laptop (this is what I have!).
Simulation of an 8kx16bit ROM (14k MOS) on a simple 1.2V CMOS process (not sky130) took:
• 245 sec. with Ngspice
• 337 sec. with Xyce
Simulation of a sky130 testbench with some 256-bit adders (CLA vs ripple carry, 23k MOS) took:
• 1075 sec. with Ngspice
• 750 sec. with Xyce
As for functionality, I fully agree with @Tim Edwards' comments. Doing some analyses is much easier with Ngspice given its (far from optimal, but it works!) Ngspice Control Language, which allows looping over simulation runs, changing parameters, etc.
t
Thanks both for the info! I'll stick with ngspice for now just for ease of install. Just wanted to make sure Xyce wasn't like 10x faster or something 😅
s
Xyce is definitely not 10x faster (talking about jobs on a single machine, of course; I don't have the resources to do a large-scale LSF test on a datacenter cluster). Normally the time difference is within 40%, and Xyce is not always the winner. It seems each simulator has its preferred circuits. The generated outputs seem nearly identical to me.
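For reference, the runtimes quoted earlier in this thread can be turned into relative differences with a few lines of Python. This is only arithmetic on the four reported numbers; the dictionary keys and the function name are made up for illustration.

```python
# Runtimes in seconds, as reported earlier in this thread.
runtimes = {
    "rom_8kx16": {"ngspice": 245, "xyce": 337},
    "sky130_adders": {"ngspice": 1075, "xyce": 750},
}

def slowdown_percent(times):
    """Return (faster simulator, how much longer the slower one took, in %)."""
    fastest = min(times, key=times.get)
    slowest = max(times, key=times.get)
    return fastest, 100.0 * (times[slowest] / times[fastest] - 1.0)

for name, times in runtimes.items():
    winner, pct = slowdown_percent(times)
    print(f"{name}: {winner} wins, the other takes {pct:.0f}% longer")
```

On these two test cases the slower simulator needs roughly 38% and 43% more time, with a different winner each time.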
f
Is setting the 'NUM_THREADS' parameter in the '.spiceinit' file the only configuration needed for ngspice to use multiple CPU cores? When I watch the CPU load during simulations, I notice that only one core is at 100% utilization. Are there any other considerations or settings to keep in mind when attempting multi-threaded simulations, or is this behavior expected?
s
Ngspice can use one core for solving the system of equations (the solver) and one or more threads for calculating the device equations (if 'set num_threads=n' is present in .spiceinit). In this example below ngspice is using 4 threads (358% CPU). Ngspice can also parallelize the solver, as explained in section 19.6 of the manual. This is not a simple setup, however.
f
I set num_threads=4, but in my case CPU% is always less than 100% (around 80%). Why?
t
I'm guessing computing the equations is only done at the beginning. So it might be that for your sim, it goes quick to do that and you don't see it using multiple cores and then the solving part is the time consuming one. Just a theory ...
s
@tnt the device equations (BSIM4 models, for example) need to be calculated at each iteration. Depending on voltages and currents, the capacitances, transconductances, and many other nonlinear parameters are updated. On the next iteration the operating point changes and the equations need to be evaluated again.
@Filippo some device models cannot be parallelized. For example, a design using simpler MOSFET models (like BSIM3) never goes above 100% on my system, while a sky130 design using many (10000+) BSIM4 MOSFETs does split device equation evaluation across different threads and uses up to 4 threads. Setting num_threads above 4 does not give any benefit in any case, as far as I can tell.
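A rough way to see why extra threads stop helping is Amdahl's law: only the device-equation evaluation runs in parallel, while the rest (for example the matrix solve) stays serial. The sketch below is not ngspice code, and the 60% parallel fraction is an invented number for illustration only; the real split depends on the circuit and the device models.

```python
def amdahl_speedup(parallel_fraction, threads):
    """Ideal speedup when only `parallel_fraction` of the runtime scales with threads."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / threads)

# ASSUMPTION: 60% of the runtime is device-equation evaluation (hypothetical value).
PARALLEL_FRACTION = 0.6

for n in (1, 2, 4, 8, 16):
    print(f"num_threads={n:2d}: ideal speedup {amdahl_speedup(PARALLEL_FRACTION, n):.2f}x")
```

With this assumed fraction the gain flattens out quickly and can never exceed 1 / (1 - 0.6) = 2.5x. Thread synchronization overhead, which this simple model ignores, would make the real curve flatter still.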
f
@Stefan Schippers this is my spiceinit file and I am using sky130 components in my design, but I don't understand why my CPU % usage is maximum 100%.
set num_threads=4
set ngbehavior=hsa
set ng_nomodcheck
set enable_noisy_r
I am using the iic-osic-tool docker container. Could this be something that limits the CPU usage?
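One way to check whether the container is the limiting factor is to ask the kernel how many CPUs the process may actually use. Below is a minimal sketch, assuming a Linux container with cgroup v2; the /sys/fs/cgroup/cpu.max path is an assumption about the setup and may not exist elsewhere, and the function names are just illustrative. It says nothing about ngspice itself, only about what the container allows.

```python
import os

def parse_cpu_max(text):
    """Parse a cgroup v2 cpu.max line; return the CPU limit in cores, or None if unlimited."""
    quota, period = text.split()
    if quota == "max":
        return None
    return int(quota) / int(period)

def usable_cpus():
    """Number of CPUs this process may be scheduled on (Linux), else the logical CPU count."""
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1

if __name__ == "__main__":
    print("CPUs visible to this process:", usable_cpus())
    try:
        # Path used by cgroup v2; it may not exist on other setups.
        with open("/sys/fs/cgroup/cpu.max") as f:
            limit = parse_cpu_max(f.read())
        print("cgroup CPU quota (cores):", limit if limit is not None else "unlimited")
    except OSError:
        print("no cgroup v2 cpu.max file found")
```

If this reports a single usable CPU or a quota of about 1.0 when run inside the container, then staying at or below 100% would be expected regardless of num_threads.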
h
On several machines with MS Windows I have found that the optimum n is the number of physical cores.
Current ngspice development is heading towards a further speed-up of the simulation. The major enhancement is the integration of a KLU matrix solver, which speeds up the circuit setup as well as the simulation. When running the venerable c7552_ann test chip (15k transistors BSIM4.5, 48k C, and 33k R) with Skywater130 models, the overall simulation speed-up is a factor of 2. Setup time (loading and parsing the circuit) is reduced by a factor of 1.5, transient simulation time by a factor of 3. The new code also allows simulating circuits with 200k transistors.
If you are willing to compile ngspice yourself, you may try out the code at the git development branch pre-master-42 at https://sourceforge.net/p/ngspice/ngspice/ci/pre-master-42/tree/. On the ngspice application page https://ngspice.sourceforge.io/applic.html you may find the ngspice skywater example, which also includes the comparison data. I would be happy to get some feedback from the user side.
s
Thank you @holger vogt, this seems very interesting. I will certainly do some tests with the new pre-master.
@holger vogt I have updated Ngspice to the latest pre-master-42.
> Simulation of a 8kx16bit ROM (14k MOS) on a simple 1.2V cmos process (not sky130) took:
> • 245 sec. with Ngspice
> • 337 sec. with Xyce
This test case now takes 203 seconds with ngspice (same result in 2 runs).
> Simulation of a sky130 testbench with some 256 bit adders (CLA vs ripple carry, 23k MOS) took:
> • 1075 sec. with Ngspice
> • 750 sec. with Xyce
This testbench is now solved by ngspice in 1200 seconds.
So I see an improvement in one case, not in the other. Doing another run of the last test case --> confirmed, 1203 seconds.
Xyce test run again, done in 650 sec. Simulation results are accurate and correct. All tests done on a Core i7 laptop with Devuan Linux. In the latest test Xyce saved all voltages (I did not want it to):
Ngspice:
Raw file data read: /home/schippes/.xschem/simulations/test_carry_lookahead_ngspice.raw
points=1353, vars=1381, datasets=1 sim_type=tran
Xyce:
Raw file data read: /home/schippes/.xschem/simulations/test_carry_lookahead_xyce.spice.raw
points=1448, vars=10799, datasets=1 sim_type=tran
So I guess Xyce could do somewhat better if it saved only the top-level voltages I am interested in, although I don't think there are I/O bottlenecks in these examples.
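To put a number on how much more data Xyce wrote, the two raw-file summary lines above can be compared directly. This is a small sketch that just multiplies points by vars; the variable and function names are illustrative, and it ignores file format details such as per-value size.

```python
import re

# The two summary lines printed when the raw files were read (copied from above).
summaries = {
    "ngspice": "points=1353, vars=1381, datasets=1 sim_type=tran",
    "xyce": "points=1448, vars=10799, datasets=1 sim_type=tran",
}

def saved_values(summary):
    """Return points * vars, i.e. the number of values stored in the raw file."""
    fields = dict(re.findall(r"(\w+)=(\w+)", summary))
    return int(fields["points"]) * int(fields["vars"])

counts = {sim: saved_values(line) for sim, line in summaries.items()}
ratio = counts["xyce"] / counts["ngspice"]
print(counts, f"Xyce stored about {ratio:.1f}x more values")
```

By this count the Xyce run stored a bit more than 8 times as many values as the ngspice run.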