Automatically setting a higher-precision floating-point type in C


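I would like delta_prime to automatically get a higher precision than the typedef t_float, to avoid cancellation problems, so that the user can still change t_float when needed. Maybe I could try to query the floating-point precision, but I don't know how to do that properly.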

In my typedefs.h:

typedef double t_float;

In some .c file:

t_float solve_quadratic(const t_float a, const t_float b_prime, const t_float c)
{
    long double delta_prime = b_prime * b_prime - a * c;

    if (delta_prime < 0.0)
        return INF;

    return (b_prime + sqrt(delta_prime)) / a;
}

You can build a "type map" with templates. Boost's MPL has a "map" for this, or you can write your own:

template <typename Numeric>
struct precision_enlarger;

template <>
struct precision_enlarger<float> { typedef double type; };

template <>
struct precision_enlarger<double> { typedef long double type; };

template <>
struct precision_enlarger<long double> { typedef long double type; };

Then, in your code:

typename precision_enlarger<t_float>::type delta_prime = // ...
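
Since the question is about C rather than C++, the same type-map idea can be approximated with the preprocessor. The following is only a sketch under assumptions: the selector macro T_FLOAT_KIND and the type name t_float_wide are hypothetical names introduced here, and it presumes the user chooses t_float by setting the macro instead of editing the typedef directly.

```c
/* typedefs.h (sketch): choose t_float and a wider companion type together.
 * T_FLOAT_KIND and t_float_wide are illustrative names, not from the question. */
#ifndef T_FLOAT_KIND
#define T_FLOAT_KIND 2            /* 1 = float, 2 = double, 3 = long double */
#endif

#if T_FLOAT_KIND == 1
typedef float       t_float;
typedef double      t_float_wide;
#elif T_FLOAT_KIND == 2
typedef double      t_float;
typedef long double t_float_wide;
#elif T_FLOAT_KIND == 3
typedef long double t_float;
typedef long double t_float_wide; /* nothing wider in standard C */
#else
#error "T_FLOAT_KIND must be 1, 2 or 3"
#endif

/* C11 compile-time check: the wide type is never smaller than t_float. */
_Static_assert(sizeof(t_float_wide) >= sizeof(t_float),
               "t_float_wide is narrower than t_float");
```

With this in place, delta_prime could be declared as t_float_wide. Note that on some platforms long double has the same precision as double; comparing LDBL_MANT_DIG with DBL_MANT_DIG from <float.h> shows whether the wider type actually gains anything.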

To avoid excessive precision loss when a higher precision is not available, the calculation can be rearranged:

t_float solve_quadratic(const t_float a, const t_float b_prime, const t_float c)
{
  // Cancellation is only a concern when a*c is positive; when it is
  // negative, the subtraction below adds two non-negative terms.
  //long double delta_prime = b_prime * b_prime - a * c;

  t_float ac = a*c;
  t_float Discriminant;
  t_float Discriminant_sr;
  if (ac <= 0.0) {
    Discriminant = b_prime * b_prime - ac;
    // or Discriminant_sr = hypot(b_prime, sqrt(-ac));
  }
  else {
    ac = sqrt(ac);
    // When the result approaches 0.0, this loses about half as much
    // precision as b_prime * b_prime - a*c
    Discriminant = (b_prime - ac) * (b_prime + ac);
    // Note: Discriminant can only be negative via this path
  }

  // Assume the + and - roots are equally valid, so use the one that does
  // not cause cancellation, instead of always (b_prime + sqrt(Discriminant)) / a.
  Discriminant_sr = sqrt(Discriminant);
  t_float Root1;
  if (b_prime >= 0.0) {
    Root1 = (b_prime + Discriminant_sr) / a;
  }
  else {
    Root1 = (b_prime - Discriminant_sr) / a;
  }
  return Root1;
}

// If the other root is needed, it can likely be calculated from something like
// Root2 = c / (a * Root1);   (the product of the two roots is c/a)

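This does not answer my question. In my case I want to avoid cancellation when b^2 ≈ ac…

Have you considered whether a similar option would be a viable choice?

long double delta_prime may help sqrt(delta_prime) if you use sqrtl, but it will not improve the cancellation unless you also convert each operand to long double.

Thanks for your answer, I will look at MPFR.

Thanks for sqrtl, I was thinking in C++.

I think it helps, according to the end of p. 18. Am I missing something?

Thanks. I will keep the C++ code. I am writing in C.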