
efficient disk storage of decimal numbers in C (C89)

Developer https://www.devze.com 2023-04-09 22:17 Source: web

I am writing functions that serialize/deserialize a large data structure for efficient reloading later on. There is a particular set of decimal numbers for which precision is not a huge deal, and I would like to store them in 4 bytes of binary data.

For most, reading the bytes into a buffer and using memcpy to place them into a float is sufficient, and is the most common solution I've found. However, this is not portable, as floats on the systems this software is meant for are not guaranteed to be 4 bytes in size.
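For reference, the memcpy approach described above looks roughly like the sketch below. The function names store_float_raw and load_float_raw are made up for illustration. The size check only catches one of the portability problems: it says nothing about whether the 4 bytes use the same bit layout or byte order on the reading machine.

```c
#include <string.h>  /* memcpy */

/* Copy the native float representation into buf.
   Returns 1 on success, 0 if float is not 4 bytes on this platform. */
int store_float_raw(float value, unsigned char buf[4])
{
    if (sizeof(float) != 4)
        return 0;
    memcpy(buf, &value, 4);
    return 1;
}

/* Copy 4 bytes back into a float. Only meaningful if the bytes were
   written by a platform with the same float format and byte order. */
int load_float_raw(const unsigned char buf[4], float *out)
{
    if (sizeof(float) != 4)
        return 0;
    memcpy(out, buf, 4);
    return 1;
}
```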

What I would like is something very portable (which is one of the reasons I'm limited to C89). I'm not wedded to 4 byte storage, but it is an attractive option to me. I am pretty wholly against storing the numbers as strings. I'm familiar with endianness issues, and such things are already taken into account.

What I am looking for, therefore, is a system-independent way to store and retrieve floating point numbers in a small amount of binary data (preferably around 4 bytes). I, in my foolishness, imagined this would be the easiest part of this task, since it seems like such a common problem, but popular search engines and various reference books have provided no material assistance.


You could store them in 32-bit IEEE float format (or a very close approximation to it; for instance, you might want to restrict denorms and NaNs). Then have each platform adjust as necessary to coerce its own float type to that format and back.

Of course there will be some loss of accuracy, but that's inevitable anyway if you're transferring float values of different precisions from one system to another.

It should be possible to write portable code to find the closest IEEE value to a native float value, and vice-versa, if that's required. You wouldn't really want to use it, though, because it would probably be far less efficient than code that takes advantage of knowing the float format. In the common case where the platform uses an IEEE representation it's a no-op or a simple narrowing/widening conversion. Even in the worst case you're likely to encounter, as long as it's a binary fraction you basically just have to extract the sign, exponent and significand bits and do the right thing with them (discard bits from the significand if it's too big, adjust the bias and possibly the width of the exponent, do the right thing with underflow and overflow).

If you want to avoid losing accuracy in the case where the file is saved and then reloaded on the same system (but that system doesn't use 32-bit IEEE), you could look at storing some data indicating the format in the file (size of each value, number of bits of significand and exponent), then store each value at native precision, so that it only gets rounded if it's ever loaded onto a less-precise system. I don't know whether ASN.1 has a standard to encode floating-point values along these lines, but it's the kind of complicated trickery I'd expect from it.
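One way to picture such a format descriptor is the sketch below, which fills a small record from the C89 <float.h> and <limits.h> macros. The struct and function names are invented for illustration, and the chosen fields are an assumption about what is worth recording. Note that these macros describe size, radix, precision and exponent range only; they do not capture byte order or the exact bit layout, so a real file format would need to pin those down separately.

```c
#include <float.h>   /* FLT_RADIX, FLT_MANT_DIG, FLT_MIN_EXP, FLT_MAX_EXP */
#include <limits.h>  /* CHAR_BIT */

/* Hypothetical descriptor written once at the start of the file. */
struct float_format {
    int size_bytes;   /* sizeof(float) on the writing platform */
    int char_bits;    /* bits per byte */
    int radix;        /* exponent base, 2 on binary machines */
    int mant_digits;  /* significand digits in that base */
    int min_exp;      /* minimum exponent */
    int max_exp;      /* maximum exponent */
};

/* Describe the native float type of the current platform. */
void native_float_format(struct float_format *f)
{
    f->size_bytes = (int)sizeof(float);
    f->char_bits = CHAR_BIT;
    f->radix = FLT_RADIX;
    f->mant_digits = FLT_MANT_DIG;
    f->min_exp = FLT_MIN_EXP;
    f->max_exp = FLT_MAX_EXP;
}

/* Return 1 if both descriptors match, so values stored at native
   precision can be read back without rounding; 0 otherwise. */
int same_float_format(const struct float_format *a,
                      const struct float_format *b)
{
    return a->size_bytes == b->size_bytes
        && a->char_bits == b->char_bits
        && a->radix == b->radix
        && a->mant_digits == b->mant_digits
        && a->min_exp == b->min_exp
        && a->max_exp == b->max_exp;
}
```

A reader would compare the descriptor found in the file against its own: on a match it can load the raw values, otherwise it falls back to a converting path.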


Check this out: http://steve.hollasch.net/cgindex/coding/portfloat.html

They give a routine which is portable and doesn't add too much overhead.
