Channels

#openpositarithmetic

- a
Art Scott

05/02/2022, 9:50 PM set the channel topic: Discussion channel for the Open Posit Standard Document (2022), Posithub.org - a
Art Scott

05/08/2022, 10:22 PM From John Gustafson -- '...Ananth Kinnal, founder and CEO of CalligoTech in Bangalore, tells me they are making a multicore RISC-V chip in VLSI (not an FPGA) and will have 100 samples by the end of the year. Some of the cores use floats, others use posits, making it quite easy to choose which format you want to experiment with… but the speed should be very comparable to current-generation microprocessors for both formats. https://calligotech.com/products/?tab=activity

Raul Murillo achieved significant reductions in the chip area and total logic delay of posit functional units, presented at CoNGA 2022, by exploiting the simplifications that 2's complement format provides (like Yonemoto did, five years ago). Further optimizations are possible, especially if the operations are not required to execute in constant time....' - a
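The 2's complement simplification mentioned above (negate a negative posit first, then decode only the magnitude, instead of special-casing the sign throughout the datapath) is easy to illustrate in software. Below is a minimal Python sketch of a standard posit decoder, assuming es = 2 as in the 2022 standard; the function and variable names are my own, not taken from Murillo's or Yonemoto's work:

```python
def decode_posit(bits, n=8, es=2):
    """Decode an n-bit standard posit (es=2) to a Python float.

    Sketch only: negative posits are handled by taking the 2's complement
    and decoding the magnitude -- the simplification the message above
    refers to.
    """
    mask = (1 << n) - 1
    bits &= mask
    if bits == 0:
        return 0.0
    if bits == 1 << (n - 1):
        return float('nan')                        # NaR
    negative = bool(bits >> (n - 1))
    if negative:
        bits = (-bits) & mask                      # 2's-complement negation
    body = (bits << 1) & mask                      # shift out the sign bit
    r0 = body >> (n - 1)                           # leading regime bit
    run = 1
    while run < n - 1 and ((body >> (n - 1 - run)) & 1) == r0:
        run += 1
    k = run - 1 if r0 else -run                    # regime value
    rem = (body << (run + 1)) & mask               # drop regime and terminator
    exp = rem >> (n - es)                          # exponent bits (zero-padded)
    frac = rem & ((1 << (n - es)) - 1)
    value = (1 + frac / 2 ** (n - es)) * 2.0 ** ((1 << es) * k + exp)
    return -value if negative else value
```

For example, `decode_posit(0x40)` gives 1.0, `decode_posit(0xC0)` gives -1.0, and `decode_posit(0x7F)` gives the 8-bit maxpos, 2^24.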
Art Scott

05/09/2022, 1:13 PM https://github.com/arunkmv/perc — Posit Enhanced Rocket Chip. This is a fork of the Rocket Chip SoC Generator that provides support for posit arithmetic, enabled by utilizing the F and D standard RISC-V ISA extensions for floating-point arithmetic. The instructions are implemented by the Posit Processing Unit (PPU)/PositFPU, which replaces the in-house IEEE 754-2008 FPU. For more information on posit arithmetic: Posit Hub, Unum-computing Google group. For more detailed information, please refer to our paper. - t
Tim Edwards

05/09/2022, 7:07 PM **@Tim Edwards** has left the channel - a
Art Scott

06/09/2022, 2:42 PM Weather and climate models are based on 64-bit double-precision floating-point arithmetic. Recent studies show that floats with only 32 bits (single precision) or even fewer allow for accurate weather forecasts. However, floats might not be the best bit-wise representation of real numbers in the wider field of computational fluid dynamics. Posit numbers are a recently proposed alternative to floats, and although no standardised posit processor exists yet, we emulate posit arithmetic with a Julia-based emulator on a conventional CPU. A medium-complexity atmosphere/ocean circulation model written in Julia is shown to be as accurate with 16-bit posits as with 64-bit floats. The results are promising for weather and climate models being largely based on 16-bit computations, with a great potential speed-up on specialised hardware in the future. - a
Art Scott

06/27/2022, 3:45 PM The goal of the Universal Numbers Library is to offer applications alternatives to IEEE floating-point that are more efficient and mathematically robust. The motivation to find improvements to IEEE floating-point had been brewing in the HPC community since the late '90s, as most algorithms became memory bound and computational scientists were looking for alternatives that provided more granularity in precision and dynamic range. Even though the inefficiency of IEEE floating-point had been measured and agreed upon in the HPC community, it was the commercial demands of Deep Learning that provided the incentive to replace IEEE-754 with alternatives such as half-floats and bfloats. These alternatives are tailored to the application and yield speed-ups of two to three orders of magnitude, making rapid innovation in AI possible. The Universal library is a ready-to-use header-only library that provides plug-in replacements for native types and a low-friction environment to start exploring alternatives to IEEE floating-point in your own algorithms. https://github.com/stillwater-sc/universal - a
Art Scott

06/28/2022, 4:20 PM [attachment: Screen Shot 2022-06-28 at 9.19.17 AM.png] - a
Art Scott

02/19/2023, 2:30 PM To view this discussion on the web visit https://groups.google.com/d/msgid/unum-computing/7FBE2B07-320B-45B9-AA87-66CF3D62C5E9%40earthlink.net.

- a
Art Scott

02/19/2023, 2:32 PM Subject: Re: Need help figuring out a way to convert IEEE 754 representation to Posit
To: Unum Computing <unum-computing@googlegroups.com>
Note: This is untested and "just a guess".
On Saturday, February 18, 2023 at 4:41:05 PM UTC-6 johngustafson wrote:
My approach in converting a float to a posit is to first separate the sign from the magnitude (note the sign, then set the bit to zero); then check if the magnitude is zero, in which case return posit zero (all 0 bits).
I would separate into S, E, and 1.F (I am a HW guy, so this is the way I think). Access a table[E] containing {pE, pEs, pF}. pE is the posit regime+exponent; pF is a shift applied to 1.F to put it in the proper position in the posit.
pFr = (pF > 0) ? 1.F >> pF /* you could add rounding on the bits falling off the end here */ : 1.F << -pF;
Now:
p = {S<<63 | pE << pEs | pFr};
is the posit when no boundary conditions are encountered. Now we make the table values handle certain special cases:
E = all 1s -> {NaN or Infinity}: pF = 63 (thus the shift will eliminate all 1.F bits without making shifters on various architectures cranky)
E = all 0s -> denormalized {and I don't remember what posits do with denorms}
The table[2048] is easy to construct. Using E directly as an index to the table eliminates having to understand the bias; it is just mapping E to pE<<pEs. If you are going to have to consume (blow) a doubleword for regime+exponent, then you can eliminate the shift (saving an instruction).
Milan Kloewer's use of posits has a very restricted range, so his table[E] will hardly damage his cache's data footprint. He may ignore infinities and underflow conditions due to his restricted range.
In my ISA, this is 14 instructions (less rounding). Best, John Mitch - a
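In the same spirit as the recipe above (separate the sign, encode the magnitude, saturate at the boundaries), here is a hedged Python sketch of a scalar double-to-posit8 conversion with round-to-nearest-even. It builds the regime/exponent/fraction fields directly rather than via the table[E] described in the email, and all names are illustrative, not from any posit library:

```python
import math

def float_to_posit8(x):
    """Convert a Python float to an 8-bit standard posit (es=2).

    Sketch of the sign-magnitude approach described above: note the sign,
    encode |x|, then apply the sign by 2's complement. Saturates to
    maxpos/minpos instead of overflowing to NaR or underflowing to zero.
    """
    if x == 0.0:
        return 0x00
    if math.isnan(x) or math.isinf(x):
        return 0x80                                # NaR
    negative = x < 0.0
    m, ex = math.frexp(abs(x))                     # |x| = m * 2**ex, m in [0.5, 1)
    s = ex - 1                                     # scale: |x| = (2m) * 2**s
    if s > 24:
        mag = 0x7F                                 # saturate to maxpos = 2**24
    elif s < -24:
        mag = 0x01                                 # saturate to minpos = 2**-24
    else:
        k, e = s >> 2, s & 3                       # regime and exponent fields
        if k >= 0:                                 # regime: k+1 ones, then a zero
            reg, reg_len = (1 << (k + 2)) - 2, k + 2
        else:                                      # regime: -k zeros, then a one
            reg, reg_len = 1, 1 - k
        frac = int((2 * m - 1) * (1 << 30))        # 30 guard bits of fraction
        pattern = ((reg << 2 | e) << 30) | frac    # regime | exponent | fraction
        shift = reg_len + 2 + 30 - 7               # round down to 7 bits
        mag, rest = pattern >> shift, pattern & ((1 << shift) - 1)
        half = 1 << (shift - 1)
        if rest > half or (rest == half and mag & 1):
            mag += 1                               # round to nearest, ties to even
        mag = min(mag, 0x7F)                       # never round up into NaR
    return (-mag) & 0xFF if negative else mag
```

For example, `float_to_posit8(1.0)` gives 0x40 and `float_to_posit8(-1.0)` gives 0xC0. The truncation to 30 guard bits can double-round in rare tie cases, which a production converter would handle with a sticky bit.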
Art Scott

03/11/2023, 1:55 PM https://www.comp.nus.edu.sg/~wongwf/papers/CONGA23-Bedot.pdf Posted in #general - a
Art Scott

05/07/2023, 3:11 PM From: **Chao Fang (方超)** <fantasysee@gmail.com>
Date: Sun, May 7, 2023, 2:07 AM
Subject: Open-Source Posit Dot-Product Unit (PDPU) for Deep Learning Applications: A Presentation at ISCAS 2023
To: Unum Computing <unum-computing@googlegroups.com>
Dear colleagues, We are excited to share with you our latest work on "*PDPU: An Open-Source Posit Dot-Product Unit for Deep Learning Applications*", which will be presented at the **2023 IEEE International Symposium on Circuits and Systems (ISCAS)** on **Tuesday, May 23rd**. Our research team at **Nanjing University**, consisting of **Qiong Li, Chao Fang, and Zhongfeng Wang**, will be presenting this exciting project in the **Data Path & Arithmetic Circuits & Systems Session**. You can find our paper and slides through the provided links. PDPU is a highly efficient hardware module that performs the dot-product operation of two input vectors Va and Vb in low-precision format, accumulating the result and the previous output acc into a high-precision value out. This allows for more efficient computation and improved performance for deep learning applications, which often involve a large number of dot-product operations. Our proposed PDPU comes with several features and contributions. **Firstly**, it implements efficient dot-product operations with fused and mixed-precision properties. This leads to significant reductions in area, latency, and power consumption: up to 43%, 64%, and 70% respectively, compared to discrete architectures. **Secondly**, it is equipped with a fine-grained 6-stage pipeline that minimizes the critical path and improves computational efficiency. The structure of PDPU is detailed by breaking down the latency and resources of each stage. **Lastly**, a configurable PDPU generator is developed to enable PDPU to flexibly support various posit data types, dot-product sizes, and alignment widths. We believe that PDPU can make a significant contribution to the field of posit arithmetic units and deep learning.
We encourage you to explore the provided links to learn more about our research and to use and contribute to the open-source code. **If you have any questions or comments, please feel free to reach out to us.** Best regards, Chao Fang - a
Art Scott

06/27/2023, 3:39 PM Yes, multiple posit additions can be done in the *quire*, a wide fixed-point register, with no rounding error until you convert the quire value back into posit form. This restores the associative property of addition, for situations where that is desirable. For 8-bit (standard) posits, the quire is 128 bits long, so you can sum up to about 3.6e16 numbers with no possibility of overflow (and they would all have to be the maximum posit value, 2^24, to overflow). For 16-bit posits, you can sum up to about 1.5e26 numbers with no possibility of overflow. You might be able to hit that maximum with the fastest existing supercomputers and a lot of patience, but otherwise you're safe to consider it impossible to overflow. The quire is 256 bits long for 16-bit posits. For 32-bit posits, you'd have to add the largest posit to a sum over 2.9e45 times before there is any way to overflow. Now we're talking about numbers you can't reach even if every atom on Earth is also a computer. The quire is 512 bits long, the same size as a cache line on many microprocessors. - a
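The quire's exact-accumulation property is easy to emulate in software. Here is a minimal Python sketch, assuming the 16-bit-posit quire layout (a fixed-point register whose least significant bit has weight minpos² = 2⁻¹¹²); the function name and structure are illustrative, not taken from any posit library:

```python
from fractions import Fraction

QUIRE_ULP = Fraction(1, 2 ** 112)    # weight of the lowest quire bit (posit16)

def quire_sum(values):
    """Sum dyadic values exactly in a quire-like integer accumulator.

    acc counts multiples of QUIRE_ULP, so no rounding ever occurs here;
    real hardware would round only when converting back to a posit.
    """
    acc = 0
    for v in values:
        q = Fraction(v) / QUIRE_ULP
        assert q.denominator == 1, "value not representable in this quire"
        acc += q.numerator               # exact integer addition, like the quire
    return acc * QUIRE_ULP

# doubles lose the small term across a 112-binade gap; the quire does not
float_sum = 2.0 ** 56 + 2.0 ** -56 - 2.0 ** 56              # == 0.0
exact_sum = quire_sum([2.0 ** 56, 2.0 ** -56, -2.0 ** 56])  # == 2**-56
```

The point of the demo pair at the bottom: 2⁻⁵⁶ is more than 52 binades below 2⁵⁶, so double-precision addition discards it entirely, while the integer accumulator keeps it exactly.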
Art Scott

08/11/2023, 10:59 AM [PDF] FPPU: Design and Implementation of a Pipelined Full Posit Processing Unit — https://arxiv.org/pdf/2308.03425.pdf. Related titles: "RISC-V customization of instruction set with posit", "Deep Positron: A deep neural network using the posit number system" - a
Art Scott

08/22/2023, 11:02 PM https://arxiv.org/pdf/2305.06946.pdf May 2023 Big-PERCIVAL: Exploring the Native Use of 64-Bit Posit Arithmetic in Scientific Computing David Mallasén, Alberto A. Del Barrio, *Senior Member, IEEE*, and Manuel Prieto-Matias *Abstract*—The accuracy requirements in many scientific computing workloads result in the use of double-precision floating-point arithmetic in the execution kernels. Nevertheless, emerging real-number representations, such as posit arithmetic, show promise in delivering even higher accuracy in such computations. In this work, we explore the native use of 64-bit posits in a series of numerical benchmarks extracted from the PolyBench collection and compare their timing performance, accuracy and hardware cost to IEEE 754 doubles. For this, we extend the PERCIVAL RISC-V core and the Xposit custom RISC-V extension with posit64 and quire operations. Results show that posit64 can execute as fast as doubles, while also obtaining up to 4 orders of magnitude lower mean square error and up to 3 orders of magnitude lower maximum absolute error. However, leveraging the quire accumulator register can limit the order of some operations such as matrix multiplications. Furthermore, detailed FPGA synthesis results highlight the significant hardware cost of 64-bit posit arithmetic and quire. Despite this, the large accuracy improvements achieved with the same memory bandwidth suggest that posit arithmetic may provide a potential alternative representation for scientific computing. *Index Terms*—Arithmetic, Posit, IEEE-754, Floating point, Scientific computing, RISC-V, CPU, Matrix multiplication, PolyBench. - a
Art Scott

08/26/2023, 1:10 PM https://scholar.google.com/scholar_url?url=https://ieeexplore.ieee.org/iel7/8920/43586[…]617074942:AFWwaebCHUTRG4EX3de1okCPOjJs&html=&pos=0&folt=cit [PDF] HUB Meets Posit: Arithmetic Units Implementation — R Murillo, J Hormigo, AA Del Barrio, G Botella, IEEE Transactions on Circuits and Systems II: Express Briefs, 2023. The posit format was introduced in 2017 as an alternative for replacing the widespread IEEE 754. Posit arithmetic provides reproducible results across platforms and possesses tapered accuracy, among other improvements. Nevertheless, despite the advantages provided by such a format, their functional units are not as competitive as the IEEE 754 ones yet. The HUB approach was presented in 2016 to reduce the hardware cost of floating-point units. In this brief, we present HUB posit, a … • Cites: Next generation arithmetic for edge computing - a
Art Scott

09/08/2023, 10:09 AM "...While currently reserved, the RISC-V V extension has encodings for Selected Element Width and Extended Memory Element Width that seem to be intended to eventually allow 128-, 256-, 512-, and 1024-bit elements in the vector registers. As a bonus, the encoding for the extended element widths can have the lower bits be exactly the same as the width of the posits they correspond to. Combine these with the usual integer operations, and most of what you need to support a quire is already available with the existing instructions. A minor kink is that currently the Zve* extensions explicitly require that each vector register be at least as wide as the widest element it supports, but given the extended element widths, that can probably be fudged. That just leaves instructions to convert to and from a quire with an extended element width, maybe even a mul_add_reduce_into_quire instruction, but that seems a bit too ambitious. If the posits themselves are stored in the vector registers as well, then the usual comparisons, negations, and shifts become available. vnclip is perfect for converting wider posits into smaller ones thanks to the fixed-point rounding mode (vxrm) register allowing the round-to-nearest-even behavior, and the opposite direction can use vwsll, but that's only available in Zvbb rather than the usual vector extensions." - a
Art Scott

09/12/2023, 1:54 AM GitHub - HPC-Lab-IITB/Clarinet: A RISC-V processor written in BSV, based on the Flute core. Has support for integrating tightly-coupled accelerators, and for integrating custom functional units like posit arithmetic units. https://github.com/HPC-Lab-IITB/Clarinet - a
Art Scott

09/28/2023, 10:36 AM https://github.com/interplanetary-robot/Verilog.jl
On the Wire type. This is a datatype that represents an indexed array of 3-value logic (1, 0, X). To make your stuff look more verilog-ey, the "v" suffix for unit ranges is provided. Eg:
• `wire [6:0]` is roughly equivalent to `Wire{6:0v}`
• `wire [12:1]` is roughly equivalent to `Wire{12:1v}`
• SingleWire is aliased to `Wire{0:0v}`, roughly equivalent to `wire`
On wire arrays: do the natural thing and use the non-initializing `Array{Wire{<descriptor>}}(n)` constructor from Julia. Note that when transpiling to verilog, the wire arrays will be down-shifted by one to make them one-indexed (this feature may change in the future!). Currently, binary muxes are not supported.
Gotchas:
• Assigning to wire array member partials is not allowed:
my_array = Array{Wire{1:0v},1}(6)
my_array[1] = Wire(0b11,2) # <== this is OK.
my_array[2][1:0v] = Wire(0b11,2) # <== don't do this.
• Assigning wire array member partials with a function is not allowed:
my_array[3][1:0v] = some_verilog_module(some_input) # <== don't do this.

- a
Angelo Bulfone

10/21/2023, 1:28 AM I've managed to perform operations without taking the absolute value of the fraction, but I'm still xoring the exponents, only to xor them back when encoding. Since `a * (1 | -1) + (0 | -1)` = `a xor (0 | -1)` (where `(1 | -1)` and `(0 | -1)` denote the format of the sign for the operation), are there any algebraic manipulations that could be done to eliminate them?

🌍 1 - a
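The relation in the question — that conditionally negating-and-offsetting equals a conditional xor — follows from the two's-complement identity ~a = -a - 1, so with a sign mask s ∈ {0, -1}, `a * (1 | -1) + (0 | -1)` is `a xor s`. A quick brute-force check in Python (my own verification sketch, not from the thread):

```python
# With s = 0 the identity is a == a; with s = -1 it is -a - 1 == a ^ -1,
# which holds because xor with an all-ones mask is bitwise NOT, and
# ~a == -a - 1 in two's complement.
ok = all(
    a * (1 if s == 0 else -1) + s == a ^ s
    for a in range(-256, 256)
    for s in (0, -1)
)
```

Python's arbitrary-precision ints use two's-complement semantics for bitwise operators, so the check carries over directly to fixed-width hardware signals.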
Angelo Bulfone

10/21/2023, 1:34 AM I was able to explode `(exp_x ^ sign_x(0 | -1) + exp_y ^ sign_y(0 | -1) + correction) ^ sign_res(0 | -1)` (from the multiplication logic) to `(exp_x * sign_y(1 | -1)) + (exp_y * sign_x(1 | -1)) + (sign_x(0 | 1) * sign_y(1 | -1)) + (sign_y(0 | 1) * sign_x(1 | -1)) + (correction * sign_res(1 | -1)) + sign_res(0 | -1)`, but I'm not sure if it can be simplified back down to something less than the original.

🌍 1 - m
Mohammed Fayiz Ferosh

10/23/2023, 2:05 AM **@Mohammed Fayiz Ferosh** has left the channel - a
Angelo Bulfone

11/06/2023, 12:54 AM While likely not viable in software, it seems to absolutely be possible to optimize hardware to use fewer xors in computing the exponent of a multiplication. While it doesn't help with carries, addition is largely xors, which are commutative and associative, so `s xor a xor b xor c xor s` can reduce to `a xor b xor c`. I don't know how FPGAs work, but they seem to use LUTs more than primitive logic gates, so maybe this doesn't actually help in that context.

- a
Angelo Bulfone

11/06/2023, 12:56 AM For a single bit of output exponent, it's usually basically `(a xor s) xor (b xor s) xor (c xor s) xor s = a xor b xor c xor s xor s xor s xor s = a xor b xor c`.

- a
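The cancellation above relies only on xor being associative, commutative, and self-inverse: an even number of occurrences of s cancels. A brute-force check over small bit-vectors (my own sketch, a Python stand-in for the hardware logic):

```python
# (a^s) ^ (b^s) ^ (c^s) ^ s == a ^ b ^ c ^ (s^s^s^s) == a ^ b ^ c:
# the four copies of s pair off and cancel.
ok = all(
    (a ^ s) ^ (b ^ s) ^ (c ^ s) ^ s == a ^ b ^ c
    for a in range(16) for b in range(16)
    for c in range(16) for s in range(16)
)
```

Note this only holds when all terms share the same mask s; as the follow-up message observes, it breaks down once each input carries an independent sign.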
Angelo Bulfone

11/06/2023, 2:20 AM Never mind, I forgot that each input has an independent sign. - a
Angelo Bulfone

11/06/2023, 2:59 AM In a multiplier, instead of xoring each exponent before and after the computation, an alternative is to xor with the opposite input, add the correction (also without xoring), and then add 2 if the input signs were different (aka the output number is negative). Doesn't seem to be efficient in software, but it could potentially have uses in hardware. 🌍 1 - a
Art Scott

11/12/2023, 2:10 PM Nathan Waivio, Haskell Posit library, https://github.com/waivio/posit/pull/14 https://hackage.haskell.org/package/posit - a