C++ SSE: fast load of the first half of each array element

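Suppose I have an array of size 8 filled with unsigned ints:

unsigned int t[8];

Now I want to load the first 16 bits (the low half) of each element into a 128-bit register:

__m128i to_fill;

Is there a fast way to do this, rather than looping over the array and masking the bits of each element?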
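
For reference, the loop-and-mask version that the question wants to avoid might look like the sketch below. The function name pack_low16_scalar is hypothetical, and the code assumes that "first 16 bits" means the low 16 bits of each 32-bit element.

```cpp
#include <cstdint>
#include <emmintrin.h>  // SSE2: __m128i, _mm_loadu_si128

// Hypothetical helper (not from the original post): scalar baseline that
// keeps the low 16 bits of each of the eight 32-bit elements and returns
// them as one vector of 8 x 16-bit integers.
inline __m128i pack_low16_scalar(const unsigned int t[8]) {
    std::uint16_t tmp[8];
    for (int i = 0; i < 8; ++i) {
        // Mask one element at a time and narrow it to 16 bits.
        tmp[i] = static_cast<std::uint16_t>(t[i] & 0xffffu);
    }
    return _mm_loadu_si128(reinterpret_cast<const __m128i*>(tmp));
}
```

A vectorized version should produce exactly the same eight 16-bit lanes, which makes this loop a convenient check.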

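You need to load two vectors of 4 x 32-bit integers, mask off the high 16 bits of each element, and then pack them into a single vector of 8 x 16-bit integers. The pack has to use unsigned saturation (_mm_packus_epi32, which requires SSE4.1), because the signed _mm_packs_epi32 would clamp every masked value above 0x7fff to 0x7fff.

__m128i v_lo = _mm_loadu_si128((const __m128i *)&t[0]);  // t[0..3]
__m128i v_hi = _mm_loadu_si128((const __m128i *)&t[4]);  // t[4..7]
v_lo = _mm_and_si128(v_lo, _mm_set1_epi32(0xffff));      // keep the low 16 bits of each lane
v_hi = _mm_and_si128(v_hi, _mm_set1_epi32(0xffff));
__m128i v = _mm_packus_epi32(v_lo, v_hi);                // SSE4.1, declared in <smmintrin.h>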
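
The signed pack _mm_packs_epi32 saturates to the range [-32768, 32767], so a masked value such as 0xffff would not survive it unchanged. One possible way to stay within SSE2 is to sign-extend the low 16 bits of each lane with a shift pair before packing, so that the saturation never triggers. This is a sketch rather than part of the original answer, and pack_low16_sse2 is a hypothetical name.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics

// Hypothetical helper (not from the original answer): SSE2-only variant.
// Shifting left by 16 and then arithmetically right by 16 sign-extends the
// low 16 bits of each 32-bit lane, so every lane lies in [-32768, 32767].
// The signed saturation in _mm_packs_epi32 therefore never clamps, and the
// low 16 bits of each element are preserved bit for bit.
inline __m128i pack_low16_sse2(const unsigned int t[8]) {
    __m128i v_lo = _mm_loadu_si128(reinterpret_cast<const __m128i*>(&t[0]));
    __m128i v_hi = _mm_loadu_si128(reinterpret_cast<const __m128i*>(&t[4]));
    v_lo = _mm_srai_epi32(_mm_slli_epi32(v_lo, 16), 16);
    v_hi = _mm_srai_epi32(_mm_slli_epi32(v_hi, 16), 16);
    return _mm_packs_epi32(v_lo, v_hi);  // lanes 0..3 from v_lo, 4..7 from v_hi
}
```

Storing the result with _mm_storeu_si128 into an array of eight 16-bit unsigned integers should give t[i] & 0xffff for each i. The shift pair replaces the mask, so the instruction count stays the same as in the masked version.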
