I'm curious whether people have CUDA capable GPUs ...
# general
m
I'm curious whether people have CUDA-capable GPUs and would be able to take advantage of such features if they were added to OpenROAD. 👍 for Yes 👎 for No
👎 5
👍 12
l
Has OpenCL or another cross-vendor API been considered? 🙂
m
Yes, though you generally get somewhat lower performance with CL, e.g. https://arxiv.org/vc/arxiv/papers/1005/1005.2581v1.pdf
I guess we could ask how many have CL support w/o CUDA
l
@Matt Liberty Yes, unfortunately the performance is somewhat worse than with NVIDIA's proprietary CUDA. Many of the open source OpenCL implementations were missing features and/or not very performant. But there is a lot of great work happening right now: RustiCL (https://docs.mesa3d.org/rusticl.html) in Mesa is a rather recent development that already supports OpenCL 3.0. Together with the newly developed open source Vulkan driver NVK (https://docs.mesa3d.org/drivers/nvk.html) for NVIDIA GPUs, this would mean complete open source OpenCL 3.0 support for all three major GPU vendors (Intel, AMD and NVIDIA) for recent GPUs under Linux. And if you're not running Linux, you can still use the proprietary OpenCL drivers; even NVIDIA now supports OpenCL 3.0: https://developer.nvidia.com/opencl I think all OpenROAD features should be supported on all major platforms, even if it means a small performance penalty (for now). But this also gives a lot more flexibility in terms of what hardware you are running 😃 https://www.phoronix.com/news/Rusticl-OpenCL-3.0-Conformance https://www.phoronix.com/news/Mesa-22.3-Released https://www.phoronix.com/news/Nouveau-NVK-One-Win
✅ 1
m
Kokkos seems like a nice library that can backend onto "CUDA, HIP, SYCL, HPX, OpenMP and C++ threads"
a
Is there a specific part of the flow that appears ripe for GPU acceleration?
✅ 1
m
Global placement, but I imagine many other parts eventually
👍 2
a
Makes sense. What algorithm is used for that currently?
✅ 1
There is already some work on both threads & GPU in progress.
However, the GPU work used CUDA
l
I haven't heard of Kokkos before, but it sounds interesting. It would be interesting to see how it compares to OpenCL. Hopefully it's not lacking in features to be able to map to so many backends 🙂
m
It seems a bit more like SYCL in terms of abstraction but without requiring DPC++
(I only learned of it yesterday)
l
I've taken some more time to look into the various APIs, and to me SYCL looks very promising. From what I have seen, SYCL does not necessarily depend on oneAPI DPC++. There are various implementations of SYCL with multiple backends. For example, AdaptiveCpp (https://github.com/AdaptiveCpp/AdaptiveCpp) seems to be quite a nice one. Any SYCL implementation that can use OpenCL as a backend should then be compatible with RustiCL and thus provide open source support for all three major GPU vendors on Linux. Mind you, the support for that isn't quite there yet, but RustiCL seems to be making nice progress: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9061 SYCL implementations such as AdaptiveCpp already support a range of backends like CUDA, clang HIP, OpenMP and of course any SPIR-V compatible OpenCL devices. Thus, I don't think it is necessary to use another abstraction layer on top such as Kokkos. What do you think? 🙂
m
It's a possibility, but requiring a special compiler makes me nervous (support, platform compatibility, etc.)
l
Hm, yes, that is in part true. AdaptiveCpp allows for a library-only mode, but this is only supported for OpenMP and nvc++. To use the other backends one would need to use the clang-based flow. But I think Kokkos works in the same way. If you want to target a specific backend, you need to use a specific compiler: https://kokkos.org/kokkos-core-wiki/requirements.html CUDA, for example, can be compiled with nvcc/nvc++ or clang. OpenCL, on the other hand, is just a library and can be linked with any compiler. Difficult decisions 😅
m
At least with Kokkos it seems that you don't have to use a special compiler if you just want CPU-based methods. If you want more, then you do have extra dependencies, but at least they can be optional
l
Yes, that is really convenient. Kokkos uses OpenMP or C++ threads for CPU execution if I understand correctly. But I think this should also be possible with SYCL/AdaptiveCpp. The library-only mode allows for OpenMP, which compiles on all common compilers. If you target OpenCL directly, you can query a CL_DEVICE_TYPE_CPU device. In this case one would need to install PoCL or the Intel OpenCL CPU Runtime as a driver, which is less convenient. So with pure OpenCL one would probably need a software fallback. So many options, not easy to say which is the best now and in the future. But I and probably others would appreciate it very much if OpenROAD doesn't target CUDA-only 😃
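To make the "optional GPU dependency with a CPU fallback" idea from the last few messages concrete, here is a minimal C++ sketch. It is not OpenROAD, Kokkos or SYCL code: the macro `EXAMPLE_HAVE_GPU_BACKEND` and the functions `parallel_for`, `parallel_for_cpu`, `example_gpu_parallel_for` and `backend_name` are all hypothetical names made up for illustration. It only assumes a build where the GPU backend is switched on by a compile-time flag and plain C++ threads are used otherwise.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical build flag: a build system would define this only when an
// optional GPU backend (CUDA, SYCL, OpenCL, ...) is available.
// #define EXAMPLE_HAVE_GPU_BACKEND

#ifdef EXAMPLE_HAVE_GPU_BACKEND
// Hypothetical entry point that an optional GPU backend would provide.
void example_gpu_parallel_for(std::size_t n,
                              const std::function<void(std::size_t)>& body);
#endif

// Portable fallback: split [0, n) into contiguous chunks, one per C++ thread.
inline void parallel_for_cpu(std::size_t n,
                             const std::function<void(std::size_t)>& body,
                             unsigned num_threads
                             = std::thread::hardware_concurrency())
{
  if (num_threads == 0) {
    num_threads = 1;  // hardware_concurrency() may return 0 if unknown
  }
  const std::size_t chunk = (n + num_threads - 1) / num_threads;
  std::vector<std::thread> workers;
  for (unsigned t = 0; t < num_threads; ++t) {
    const std::size_t begin = static_cast<std::size_t>(t) * chunk;
    const std::size_t end = std::min(n, begin + chunk);
    if (begin >= end) {
      break;  // no work left for this or any later thread
    }
    workers.emplace_back([begin, end, &body] {
      for (std::size_t i = begin; i < end; ++i) {
        body(i);
      }
    });
  }
  for (auto& w : workers) {
    w.join();
  }
}

// Reports which path this translation unit was compiled with.
inline const char* backend_name()
{
#ifdef EXAMPLE_HAVE_GPU_BACKEND
  return "gpu";
#else
  return "cpu-threads";
#endif
}

// Single call site for users: the GPU path is used only if it was compiled in.
inline void parallel_for(std::size_t n,
                         const std::function<void(std::size_t)>& body)
{
#ifdef EXAMPLE_HAVE_GPU_BACKEND
  example_gpu_parallel_for(n, body);
#else
  parallel_for_cpu(n, body);
#endif
}
```

Calling `parallel_for(out.size(), [&](std::size_t i) { out[i] = 2 * i; })` on a `std::vector` then works on any machine with a standard C++ compiler, and a GPU build changes only what happens behind that one function. Each index must write to its own element, since the sketch does no synchronization between threads.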