i'm currently working on an approximation of the cosine. Since the final target device is a self-developement working with 32 bit floating point ALU / LU and there is a specialized compiler for C, I am not able to use the c library math functions (cosf,...). I'm aiming to code various methods that differ in terms of accuracy and number of instructions / cycles.
I've already tried a lot of different approximation algorithms, starting from fdlibm, taylor expansion, pade approximation, remez algorithm using maple and so on....
But as soon as I implement them using only float precision, there is a significant loss of precision. And be sure: I know that with double precision, a much higher precision is no problem at all...
Right now, i have some approximations which are exact up to a few thousand ulp around pi/2 (the range where the largest errors occur), and i feel that i am limited by the single precision conversions.
To address the topic argument reduction: input is in radian. i assume that an argument reduction will cause even more precision loss due to divisions / multiplications.... since my overall input range is only 0..pi, i decided to reduce the argument to 0..pi/2.
Therefore my question is: Does anybody know a single precision approximation to cosine function with high accuracy (and in the best case high efficiency)? Are there any algorithms that optimize approximations for single precision? Do you know whether the built-in cosf function computes the values with single oder double precision internally? ~
float ua_cos_v2(float x)
{
float output;
float myPi = 3.1415927410125732421875f;
if (x < 0) x = -x;
int quad = (int32_t)(x*0.63661977236f);//quad = x/(pi/2) = x*2/pi
if (x<1.58f && x> 1.57f) //exclude approximation around pi/2
{
output = -(x - 1.57079637050628662109375f) - 2.0e-12f*(x - 1.57079637050628662109375f)*(x - 1.57079637050628662109375f) + 0.16666667163372039794921875f*(x - 1.57079637050628662109375f)*(x - 1.57079637050628662109375f)*(x - 1.57079637050628662109375f) + 2.0e-13f*(x - 1.57079637050628662109375f)*(x - 1.57079637050628662109375f)*(x - 1.57079637050628662109375f)*(x - 1.57079637050628662109375f)+ 0.000198412701138295233249664306640625f*(x - 1.57079637050628662109375f)*(x - 1.57079637050628662109375f)*(x - 1.57079637050628662109375f)*(x - 1.57079637050628662109375f)*(x - 1.57079637050628662109375f)*(x - 1.57079637050628662109375f)*(x - 1.57079637050628662109375f);
output -= 4.37E-08f;
}
else {
float param_x;
int param_quad = -1;
switch (quad)
{
case 0:
param_x = x;
break;
case 1:
param_x = myPi - x;
param_quad = 1;
break;
case 2:
param_x = x - myPi;
break;
case 3:
param_x = 2 * myPi - x;
break;
}
float c1 = 1.0f,
c2 = -0.5f,
c3 = 0.0416666679084300994873046875f,
c4 = -0.001388888922519981861114501953125f,
c5 = 0.00002480158218531869351863861083984375f,
c6 = -2.75569362884198199026286602020263671875E-7f,
c7 = 2.08583283978214240050874650478363037109375E-9f,
c8 = -1.10807162057025010426514199934899806976318359375E-11f;
float _x2 = param_x * param_x;
output = c1 + _x2*(c2 + _x2*(c3 + _x2*(c4 + _x2*(c5 + _x2*(c6 + _x2*(c7
+ _x2* c8))))));
if (param_quad == 1 || param_quad == 0)
output = -output;
}
return output;
}
~
if I have forgotten any information, please do not hesitate to ask!
Thanks in advance