help-octave

Rounding floats from 64bit to 32bit (double to single) with 0.5 rule


From: hale812
Subject: Rounding floats from 64bit to 32bit (double to single) with 0.5 rule
Date: Sun, 25 Dec 2016 23:25:48 -0800 (PST)

It seems that the single() function converts an IEEE 754 double by
truncation, simply dropping the low-order significand bits that do not fit
into single precision.

This becomes a problem of error accumulation when converting data for a
32-bit DSP with a long computation path.

For better results, the number should first be rounded, in its binary
representation, to the single-precision layout (1 sign bit, 8 exponent bits,
23 significand bits) and only then truncated.

Is there a tool in Octave for rounded conversion to single, or just for
binary rounding (keeping the value as a double, with the bits beyond single
precision set to zero)?
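
To make the difference between the two behaviours concrete, here is a minimal
sketch in Python (not Octave), using only the standard struct module. The
function names are hypothetical and chosen for illustration. It assumes finite
inputs in the normal range of single precision (no overflow, no subnormals),
since the bit-masking variants do not handle those cases. The "nearest"
variant relies on struct's 'f' format, which on typical platforms performs the
IEEE 754 default round-to-nearest, ties-to-even conversion; the "half up"
variant is one possible reading of the 0.5 rule, rounding ties away from zero.

```python
import struct

LOW_BITS = 52 - 23             # significand bits a double has beyond single: 29
LOW_MASK = (1 << LOW_BITS) - 1


def _double_to_bits(x: float) -> int:
    """Return the 64-bit pattern of a double as an unsigned integer."""
    return struct.unpack('<Q', struct.pack('<d', x))[0]


def _bits_to_double(bits: int) -> float:
    """Rebuild a double from its 64-bit pattern."""
    return struct.unpack('<d', struct.pack('<Q', bits))[0]


def to_single_truncated(x: float) -> float:
    """Zero the 29 low significand bits (rounds toward zero)."""
    return _bits_to_double(_double_to_bits(x) & ~LOW_MASK)


def to_single_half_up(x: float) -> float:
    """Add half a single-precision ULP to the magnitude, then truncate.

    Ties are rounded away from zero. A carry out of the significand
    propagates into the exponent field, which is the intended result.
    """
    bits = _double_to_bits(x) + (1 << (LOW_BITS - 1))
    return _bits_to_double(bits & ~LOW_MASK)


def to_single_nearest(x: float) -> float:
    """Round-trip through a 4-byte float (nearest, ties to even)."""
    return struct.unpack('<f', struct.pack('<f', x))[0]


if __name__ == "__main__":
    # Slightly above the midpoint between 1 and the next single, 1 + 2**-23.
    x = 1 + 2**-24 + 2**-30
    for f in (to_single_truncated, to_single_half_up, to_single_nearest):
        print(f.__name__, f(x).hex())
```

All three functions return a double whose bits beyond single precision are
zero, so the results can be compared directly. The same test value can be used
to check what single() actually does in Octave: converting 1 + 2^-24 + 2^-30
and comparing the result against 1 and 1 + 2^-23 shows whether the conversion
truncated or rounded.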





