Floating point linear interpolation

To do a linear interpolation between two variables a and b given a fraction f, I'm currently using this code:

float lerp(float a, float b, float f) 
{
    return (a * (1.0 - f)) + (b * f);
}

I think there's probably a more efficient way of doing it. I'm using a microcontroller without an FPU, so floating point operations are done in software. They are reasonably fast, but it's still something like 100 cycles to add or multiply.

Any suggestions?

n.b. for the sake of clarity in the equation in the code above, we can omit specifying 1.0 as an explicit floating-point literal.


As Jason C points out in the comments, the version you posted is most likely the best choice, due to its superior precision near the edge cases:

float lerp(float a, float b, float f)
{
    return a * (1.0 - f) + (b * f);
}

If we disregard precision for a moment, we can simplify the expression as follows:

    a(1 − f) + bf
 = a − af + bf
 = a + f(b − a)

This means we could write it like this:

float lerp(float a, float b, float f)
{
    return a + f * (b - a);
}

In this version we've gotten rid of one multiplication, but lost some precision.


Presuming floating-point math is available, the OP's algorithm is a good one and is always superior to the alternative a + f * (b - a) due to precision loss when a and b significantly differ in magnitude.

For example:

// OP's algorithm
float lint1 (float a, float b, float f) {
    return (a * (1.0f - f)) + (b * f);
}

// Algebraically simplified algorithm
float lint2 (float a, float b, float f) {
    return a + f * (b - a);
}

In that example, presuming 32-bit floats, lint1(1.0e20, 1.0, 1.0) will correctly return 1.0, whereas lint2 will incorrectly return 0.0.

The majority of precision loss is in the addition and subtraction operators when the operands differ significantly in magnitude. In the above case, the culprits are the subtraction in b - a, and the addition in a + f * (b - a). The OP's algorithm does not suffer from this due to the components being completely multiplied before addition.


For the a=1e20, b=1 case, here is an example of differing results. Test program:

#include <stdio.h>
#include <math.h>

float lint1 (float a, float b, float f) {
    return (a * (1.0f - f)) + (b * f);
}

float lint2 (float a, float b, float f) {
    return a + f * (b - a);
}

int main () {
    const float a = 1.0e20;
    const float b = 1.0;
    int n;
    for (n = 0; n <= 1024; ++ n) {
        float f = (float)n / 1024.0f;
        float p1 = lint1(a, b, f);
        float p2 = lint2(a, b, f);
        if (p1 != p2) {
            printf("%i %.6f %f %f %.6e\n", n, f, p1, p2, p2 - p1);
        }
    }
    return 0;
}

Output, slightly adjusted for formatting:

    f            lint1               lint2             lint2-lint1
0.828125  17187500894208393216  17187499794696765440  -1.099512e+12
0.890625  10937500768952909824  10937499669441282048  -1.099512e+12
0.914062   8593750447104196608   8593749897348382720  -5.497558e+11
0.945312   5468750384476454912   5468749834720641024  -5.497558e+11
0.957031   4296875223552098304   4296874948674191360  -2.748779e+11
0.972656   2734375192238227456   2734374917360320512  -2.748779e+11
0.978516   2148437611776049152   2148437474337095680  -1.374390e+11
0.986328   1367187596119113728   1367187458680160256  -1.374390e+11
0.989258   1074218805888024576   1074218737168547840  -6.871948e+10
0.993164    683593798059556864    683593729340080128  -6.871948e+10
1.000000                     1                     0  -1.000000e+00


If you are on a microcontroller without an FPU, then floating point is going to be very expensive; it could easily be twenty times slower per floating-point operation. The fastest solution is to just do all the math using integers.

The number of places after the fixed binary point (see http://blog.credland.net/2013/09/binary-fixed-point-explanation.html?q=fixed+binary+point) is given by XY_TABLE_FRAC_BITS.

Here's a function I use:

inline uint16_t unsignedInterpolate(uint16_t a, uint16_t b, uint16_t position) {
    uint32_t r1;
    uint16_t r2;

    /*
     * position is the fraction scaled by 2^XY_TABLE_FRAC_BITS, and b must be
     * >= a since the subtraction is unsigned.
     *
     * Only one multiply, and one divide/shift right.  Shame about having to
     * cast to long int and back again.
     */

    r1 = (uint32_t) position * (b - a);
    r2 = (r1 >> XY_TABLE_FRAC_BITS) + a;
    return r2;
}

With the function inlined it should be approx. 10-20 cycles.

If you've got a 32-bit microcontroller, you'll be able to use bigger integers and get larger numbers or more accuracy without compromising performance. This function was used on a 16-bit system.
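
As a usage sketch (XY_TABLE_FRAC_BITS isn't given in the answer, so 8 is assumed here), the fraction is passed pre-scaled by 2^XY_TABLE_FRAC_BITS, so 0.5 becomes 128:

#include <stdint.h>
#include <stdio.h>

#define XY_TABLE_FRAC_BITS 8   /* assumed value for this example */

/* unsignedInterpolate() from above, repeated so the example compiles on its own */
static inline uint16_t unsignedInterpolate(uint16_t a, uint16_t b, uint16_t position) {
    uint32_t r1 = (uint32_t) position * (b - a);
    return (uint16_t)((r1 >> XY_TABLE_FRAC_BITS) + a);
}

int main(void) {
    /* 0.5 in this format is 0.5 * 2^8 = 128, and 0.25 is 64 */
    printf("%u\n", (unsigned) unsignedInterpolate(1000, 2000, 128)); /* 1500 */
    printf("%u\n", (unsigned) unsignedInterpolate(1000, 2000, 64));  /* 1250 */
    return 0;
}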


If you're coding for a microcontroller without floating-point operations, then it's better not to use floating-point numbers at all, and to use fixed-point arithmetic instead.
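
As a minimal sketch of that idea (the Q16.16 format and the fix16_lerp name are just illustrative, not from this answer), a signed fixed-point lerp also works when b < a:

#include <stdint.h>

/* Illustrative Q16.16 signed fixed point: 16 integer bits, 16 fractional bits. */
typedef int32_t fix16_t;
#define FIX16_ONE (1 << 16)

/* Interpolate entirely in integer math; f is the fraction in Q16.16 (0 .. FIX16_ONE).
   The 64-bit intermediate avoids overflow in the Q16.16 * Q16.16 product
   (assumes b - a itself still fits in 32 bits). */
static inline fix16_t fix16_lerp(fix16_t a, fix16_t b, fix16_t f) {
    return a + (fix16_t)(((int64_t)(b - a) * f) >> 16);
}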


Since C++20 you can use std::lerp(), which is likely to be the best possible implementation for your target.
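
A quick usage sketch:

#include <cmath>   // std::lerp, C++20
#include <cstdio>

int main() {
    double a = 1.0e20, b = 1.0;
    // The standard guarantees std::lerp(a, b, 0) == a and std::lerp(a, b, 1) == b.
    std::printf("%g %g %g\n", std::lerp(a, b, 0.0), std::lerp(a, b, 0.5), std::lerp(a, b, 1.0));
}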


It is worth noting that the standard linear interpolation formulas f1(t) = a + t(b-a), f2(t) = b - (b-a)(1-t), and f3(t) = a(1-t) + bt are not guaranteed to be well behaved under floating-point arithmetic. Namely, if a != b, it is not guaranteed that f1(1.0) == b or that f2(0.0) == a, while for a == b, f3(t) is not guaranteed to equal a when 0 < t < 1.

This function has worked for me on processors that support IEEE754 floating point when I need the results to behave well and to hit the endpoints exactly (I use it with double precision, but float should work as well):

double lerp(double a, double b, double t) 
{
    if (t <= 0.5)
        return a+(b-a)*t;
    else
        return b-(b-a)*(1.0-t);
}
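
For example, with the a = 1.0e20, b = 1.0 values used earlier, both endpoints come out exactly (the function is repeated here so the check is self-contained):

#include <stdio.h>

double lerp(double a, double b, double t)
{
    if (t <= 0.5)
        return a+(b-a)*t;
    else
        return b-(b-a)*(1.0-t);
}

int main(void)
{
    /* t = 0 evaluates a + (b-a)*0 and t = 1 evaluates b - (b-a)*0, so the
       endpoints are returned exactly even for wildly different magnitudes. */
    printf("%g %g\n", lerp(1.0e20, 1.0, 0.0), lerp(1.0e20, 1.0, 1.0)); /* 1e+20 1 */
    return 0;
}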


If you want the final result to be an integer, it might be faster to use integers for the input as well.

int lerp_int(int a, int b, float f)
{
    //float diff = (float)(b-a);
    //float frac = f*diff;
    //return a + (int)frac;
    return a + (int)(f * (float)(b-a));
}

This does two casts and one float multiply. If a cast is faster than a float add/subtract on your platform, and if an integer answer is useful to you, this might be a reasonable alternative.
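
For illustration (repeating lerp_int so this compiles on its own), note that the (int) cast truncates toward zero, so the rounding direction depends on the sign of b - a:

#include <stdio.h>

int lerp_int(int a, int b, float f)
{
    return a + (int)(f * (float)(b-a));
}

int main(void)
{
    printf("%d\n", lerp_int(10, 20, 0.25f)); /* f*(b-a) =  2.5f, truncated to  2 -> 12 */
    printf("%d\n", lerp_int(20, 10, 0.25f)); /* f*(b-a) = -2.5f, truncated to -2 -> 18 */
    return 0;
}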
