Most efficient way to convert a vector of floats to a vector of uint32?

assembly floating-point

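This is a follow-up question. Now I want to convert in the opposite direction, float --> unsigned int. What is the best and exact vector sequence for the following scalar operation?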

float x = ...
unsigned int res = (unsigned int)x;

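Not the same question! I want to convert to an unsigned integer.

What do you want to do with negative values?

What do you mean by "vector sequence"? SSE on x86? Intrinsics?

Either SSE assembly or SSE intrinsics would be fine.

This is based on an example from the old but useful Apple AltiVec/SSE migration documentation, which unfortunately is no longer available at its original location: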

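#include <emmintrin.h>  // SSE2 intrinsics

// f is passed by value and modified below, so it must not be const
static inline __m128i _mm_ctu_ps(__m128 f)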
{
    const __m128 two31 = _mm_set1_ps(0x1.0p31f);
    const __m128 two32 = _mm_add_ps(two31, two31);
    const __m128 zero = _mm_setzero_ps();

    // check for overflow before conversion to int
    const __m128 overflow = _mm_cmpge_ps(f, two31);
    const __m128 overflow2 = _mm_cmpge_ps(f, two32);
    const __m128 subval = _mm_and_ps(overflow, two31);
    // all-ones mask shifted left by 31 gives 0x80000000 in each overflowing lane
    const __m128i addval = _mm_slli_epi32(_mm_castps_si128(overflow), 31);
    __m128i result;

    // bias the value to signed space if it is >= 2**31
    f = _mm_sub_ps(f, subval);

    // clip at zero
    f = _mm_max_ps(f, zero);

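    // convert to int using the current MXCSR rounding mode (round to nearest by default);
    // lanes still outside the int32 range come back as 0x80000000, they do not saturate.
    // Note: the C cast truncates; _mm_cvttps_epi32 would truncate here as well.
    result = _mm_cvtps_epi32(f);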

    // unbias
    result = _mm_add_epi32(result, addval);

    // patch up the overflow case: lanes >= 2**32 are forced to 0xFFFFFFFF
    result = _mm_or_si128(result, _mm_castps_si128(overflow2));

    return result;
}
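
One way to sanity-check a routine like this is to compare it lane by lane against a scalar reference that spells out the same policy: NaN and negative inputs go to 0, inputs at or above 2**32 go to 0xFFFFFFFF, and everything else is rounded with the current rounding mode. The sketch below is only an assumption about the intended semantics and is not taken from the Apple document. The names `ref_ctu`, `ctu4` and `ctu_array` are hypothetical; `ctu4` restates the routine above so that the block compiles on its own.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>     /* size_t */
#include <stdint.h>     /* uint32_t */

/* Hypothetical scalar reference for one lane. */
static uint32_t ref_ctu(float x)
{
    if (!(x > 0.0f))                 /* NaN, negatives and zero */
        return 0u;
    if (x >= 4294967296.0f)          /* >= 2**32, including +inf */
        return 0xFFFFFFFFu;
    if (x >= 2147483648.0f)          /* floats this large are already integers */
        return (uint32_t)x;
    /* below 2**31: round using the current MXCSR rounding mode */
    return (uint32_t)_mm_cvtss_si32(_mm_set_ss(x));
}

/* Same steps as the routine above, written with _mm_cast* intrinsics. */
static __m128i ctu4(__m128 f)
{
    const __m128 two31 = _mm_set1_ps(2147483648.0f);
    const __m128 two32 = _mm_set1_ps(4294967296.0f);
    const __m128 ge31 = _mm_cmpge_ps(f, two31);   /* lanes >= 2**31 */
    const __m128 ge32 = _mm_cmpge_ps(f, two32);   /* lanes >= 2**32 */
    __m128i r;

    f = _mm_sub_ps(f, _mm_and_ps(ge31, two31));   /* bias into signed range */
    f = _mm_max_ps(f, _mm_setzero_ps());          /* negatives and NaN -> 0 */
    r = _mm_cvtps_epi32(f);                       /* current rounding mode */
    r = _mm_add_epi32(r, _mm_slli_epi32(_mm_castps_si128(ge31), 31)); /* unbias */
    return _mm_or_si128(r, _mm_castps_si128(ge32)); /* >= 2**32 -> 0xFFFFFFFF */
}

/* Convert n floats; full groups of four use SSE2, the tail uses the reference. */
static void ctu_array(const float *src, uint32_t *dst, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        _mm_storeu_si128((__m128i *)(dst + i), ctu4(_mm_loadu_ps(src + i)));
    for (; i < n; ++i)
        dst[i] = ref_ctu(src[i]);
}
```

With the default rounding mode, 1.5f and 2.5f both convert to 2 (round to nearest even), whereas `(unsigned int)x` would give 1 and 2. If the truncating behaviour of the C cast is what is wanted, replacing `_mm_cvtps_epi32` with `_mm_cvttps_epi32` in `ctu4` (and `_mm_cvtss_si32` with `_mm_cvttss_si32` in `ref_ctu`) should be enough, since every other step stays the same.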