C: Duplicating 8 bits into 32 bits

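I am trying to duplicate an 8-bit value into 32 bits, and I would like to ask whether it is possible to write a one-line algorithm that duplicates the bit values.

For example: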

1100 1011 -> 1111 1111 0000 0000 1111 0000 1111 1111

If possible, I would like to understand the logic behind it.

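There are only 256 possible 8-bit values, so a simple lookup table would take 1 KB of space (256 entries of 4 bytes each), and the lookup itself is trivial. It is hard to believe that any bit hack would perform noticeably better.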
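
To make the lookup-table idea concrete, here is a minimal sketch. The names `nibble_lut`, `init_nibble_lut` and `expand_lut` are made up for illustration, and filling the table at start-up is an assumption; on a small microcontroller the 256 constants would more likely be generated offline and kept in ROM.

```c
#include <stdint.h>

/* Hypothetical names, not taken from the original answers. */
static uint32_t nibble_lut[256];   /* 256 entries * 4 bytes = 1 KB */

/* Fill the table once, e.g. at start-up (an assumption of this sketch). */
void init_nibble_lut(void)
{
    for (unsigned v = 0; v < 256; v++) {
        uint32_t out = 0;
        for (unsigned bit = 0; bit < 8; bit++) {
            if (v & (1u << bit)) {
                /* input bit k fills output nibble k */
                out |= (uint32_t)0xF << (4 * bit);
            }
        }
        nibble_lut[v] = out;
    }
}

/* After initialisation the expansion is a single indexed load. */
uint32_t expand_lut(uint8_t a)
{
    return nibble_lut[a];
}
```

After calling `init_nibble_lut()` once, `expand_lut(0xCB)` yields `0xFF00F0FF`, which matches the example in the question.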

This would work:

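#include <stdint.h>

uint32_t eToTW (unsigned char a) {
    uint32_t output = 0;

    /* Each input bit selects one nibble of the result. The casts keep
       the shifts well-defined even where int is only 16 bits wide. */
    output |= a & 0x80 ? (uint32_t) 0xf << 28 : 0x0;
    output |= a & 0x40 ? (uint32_t) 0xf << 24 : 0x0;
    output |= a & 0x20 ? (uint32_t) 0xf << 20 : 0x0;
    output |= a & 0x10 ? (uint32_t) 0xf << 16 : 0x0;

    output |= a & 0x8 ? (uint32_t) 0xf << 12 : 0x0;
    output |= a & 0x4 ? (uint32_t) 0xf << 8 : 0x0;
    output |= a & 0x2 ? (uint32_t) 0xf << 4 : 0x0;
    output |= a & 0x1 ? 0xf : 0x0;

    return output;
}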
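The lookup table suggested earlier will provide the highest performance on most platforms. If you prefer a bit-twiddling approach, the best solution depends on the hardware capabilities of your processor: how fast are shifts, does it have three-input logic operations (as my GPU does), and how many integer instructions can it execute in parallel? One solution is to move each bit to the lsb of its target nibble, then fill each nibble with its lsb value in a second step (hat tip for the suggestion to use the lsb rather than the msb).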
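
The two steps described above can be sketched as follows. This is an illustration rather than the answer's own listing, and the name `expand_spread_fill` is hypothetical. It also shows why the lsb is convenient: once every nibble holds 0 or 1, multiplying by 15 fills the nibbles without any carry between them.

```c
#include <stdint.h>

/* Hypothetical helper name; a sketch of the two-step idea only. */
uint32_t expand_spread_fill(uint8_t x)
{
    uint32_t r = 0;

    /* Step 1: move input bit k (position k) to the lsb of nibble k
       (position 4*k), which is a left shift by 3*k. */
    for (unsigned k = 0; k < 8; k++) {
        r |= ((uint32_t)x << (3 * k)) & ((uint32_t)1 << (4 * k));
    }

    /* Step 2: every nibble now holds 0 or 1. Multiplying by 15,
       written here as (r << 4) - r, turns each 1 into 0xF. */
    r = (r << 4) - r;
    return r;
}
```

A compiler will usually unroll the loop into eight shift-and-mask terms; whether that beats a table lookup depends on the target, as the answer notes.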

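#include <stdint.h>
uint32_t expand_bits_to_nibbles (uint8_t x)
{
    uint32_t r;
    /* expand bits to the lsb in each nibble */

    r = ((uint32_t)x [...]

It is simple: solve the simplest problem first, then move on to the more complex ones.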

Case 1: duplicating 1 bit into a 4-bit value (the simplest case)

This can be done with a simple set of shifts:

x = (x << 0) | (x << 1) | (x << 2) | (x << 3);

Case 3: duplicating 4 bits into a 16-bit value. How? Just move 2 bits into the upper half, which reduces the problem to Case 1. Divide and conquer!

+---+---------+---------+---------+---------+
| 0 | _ _ _ _ | _ _ _ _ | _ _ _ _ | A B C D |
+---+---------+---------+---------+---------+
| 1 | _ _ _ _ | _ _ A B | _ _ _ _ | _ _ C D |
+---+---------+---------+---------+---------+
| 2 | _ _ _ A | _ _ _ B | _ _ _ C | _ _ _ D |
+---+---------+---------+---------+---------+
| 3 | A A A A | B B B B | C C C C | D D D D |
+---+---------+---------+---------+---------+
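
The rows of the table above map onto code fairly directly. The following is a sketch under that reading; the function name `expand4to16` is hypothetical, and the masks are the 16-bit halves of the ones used in the 32-bit version.

```c
#include <stdint.h>

/* Hypothetical name; each statement corresponds to one table row. */
uint16_t expand4to16(uint8_t value)
{
    uint32_t x = value & 0x0Fu;      /* row 0: ____ ____ ____ ABCD */
    x = (x | (x << 6)) & 0x0303u;    /* row 1: ____ __AB ____ __CD */
    x = (x | (x << 3)) & 0x1111u;    /* row 2: ___A ___B ___C ___D */
    x = (x << 4) - x;                /* row 3: AAAA BBBB CCCC DDDD */
    return (uint16_t)x;
}
```

For instance, `expand4to16(0xB)` (binary 1011) gives `0xF0FF`.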

Case 4: duplicating 8 bits into a 32-bit value (the original problem)

This can be achieved with the following code:

uint32_t interleave(uint8_t value)
{
    uint32_t x = value;
    x = (x | (x << 12)) /* & 0x000F000F */; // GCC is not able to remove redundant & here
    x = (x | (x <<  6)) & 0x03030303;
    x = (x | (x <<  3)) & 0x11111111;
    x = (x << 4) - x;
    return x;
}

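In other words, your goal is to turn each bit of an 8-bit byte into a nibble in O(1)?
The answer is yes. The logic behind that solution is simply not to put a newline in it.
Is _pdep_u32 available? Can you rely on any processor architecture (e.g. x86) or instruction set (e.g. BMI2)?
I am trying to write this for a Microchip PIC18 (PIC18F46J50).
@harold That is not possible.
That would be a very long one-liner :D
[...] wise way (read the question correctly the second time around). Or it could be reduced to a 16-entry lookup table that works byte-wise. There would be more "code", of course.
@eugenesh: Since the target device seems to have only 8-bit data paths and no 32-bit registers of any kind, I think a 4-entry LUT with a 2-bit index might be appropriate.
Improved the last step (inspired by @njuffa's answer).
Clever code! It seems the last step can be simplified further: x = (x [...]
@StaceyGirl The improved version boils down to 12 instructions when compiled for Pascal-family GPUs, because the compiler is able to use multiply-add in two places:
LOP32I.AND R0,R4,0xff;SHL R3,R0,0xc;LOP.OR R0,R3,R0;LOP32I.AND R3,R0,0xc000c;SHL R3.LUT R0,R3,0x30003,R0,0xf8;SHL R3,R0,0x3;LOP3.LUT R0,R3,c[0x0][0x0],R0,0xc8;XMAD R5,R0.reuse,0x7,RZ;SHL R3,R0.reuse,0x3;XMAD.PSL R0,R0.H1,0x7,R5;LOP.OR R4,R0,R3;
@chqrlie The modification reduces the code on Pascal-family GPUs to ten instructions:
LOP32I.AND R0,R4,0xff;SHL R3,R0,0xc;LOP.OR R0,R3,R3,R3,R0,R0,R0,R0;LOP32I.AND R3,R3,R3,R3,0x0,R3f8;SHL R3,R0,0x3;LOP3.LUT R0,R3,c[0x0][0x0],R0,0xc8;XMAD R3,R0.reuse,0xf,RZ;XMAD.PSL R4,R0.H1,0xf,R3;
@StaceyGirl: After further analysis, it seems [...] the mask in x = (x [...], so bit twiddling will be worse than using a lookup table, as [...]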

Case 1 can also be written with one shift and one subtraction, since (x << 4) - x equals 15 * x:

x = (x << 4) - x;

Case 2: duplicating 2 bits into an 8-bit value

+---+---------+---------+
| 0 | _ _ _ _ | _ _ A B |
+---+---------+---------+
| 1 | _ _ _ A | _ _ _ B |
+---+---------+---------+
| 2 | A A A A | B B B B |
+---+---------+---------+

The steps for Case 4 (8 bits into 32 bits):

+---+---------+---------+---------+---------+---------+---------+---------+---------+
| 0 | _ _ _ _ | _ _ _ _ | _ _ _ _ | _ _ _ _ | _ _ _ _ | _ _ _ _ | A B C D | E F G H |
+---+---------+---------+---------+---------+---------+---------+---------+---------+
| 1 | _ _ _ _ | _ _ _ _ | _ _ _ _ | A B C D | _ _ _ _ | _ _ _ _ | _ _ _ _ | E F G H |
+---+---------+---------+---------+---------+---------+---------+---------+---------+
| 2 | _ _ _ _ | _ _ A B | _ _ _ _ | _ _ C D | _ _ _ _ | _ _ E F | _ _ _ _ | _ _ G H |
+---+---------+---------+---------+---------+---------+---------+---------+---------+
| 3 | _ _ _ A | _ _ _ B | _ _ _ C | _ _ _ D | _ _ _ E | _ _ _ F | _ _ _ G | _ _ _ H |
+---+---------+---------+---------+---------+---------+---------+---------+---------+
| 4 | A A A A | B B B B | C C C C | D D D D | E E E E | F F F F | G G G G | H H H H |
+---+---------+---------+---------+---------+---------+---------+---------+---------+

Test cases for interleave (Google Test):

TEST_F(test, interleave)
{
    EXPECT_EQ(interleave(0x00), 0x00000000);
    EXPECT_EQ(interleave(0x11), 0x000F000F);
    EXPECT_EQ(interleave(0x22), 0x00F000F0);
    EXPECT_EQ(interleave(0x33), 0x00FF00FF);
    EXPECT_EQ(interleave(0x44), 0x0F000F00);
    EXPECT_EQ(interleave(0x55), 0x0F0F0F0F);
    EXPECT_EQ(interleave(0x66), 0x0FF00FF0);
    EXPECT_EQ(interleave(0x77), 0x0FFF0FFF);
    EXPECT_EQ(interleave(0x88), 0xF000F000);
    EXPECT_EQ(interleave(0x99), 0xF00FF00F);
    EXPECT_EQ(interleave(0xAA), 0xF0F0F0F0);
    EXPECT_EQ(interleave(0xBB), 0xF0FFF0FF);
    EXPECT_EQ(interleave(0xCC), 0xFF00FF00);
    EXPECT_EQ(interleave(0xDD), 0xFF0FFF0F);
    EXPECT_EQ(interleave(0xEE), 0xFFF0FFF0);
    EXPECT_EQ(interleave(0xFF), 0xFFFFFFFF);

    EXPECT_EQ(interleave(0x01), 0x0000000F);
    EXPECT_EQ(interleave(0x23), 0x00F000FF);
    EXPECT_EQ(interleave(0x45), 0x0F000F0F);
    EXPECT_EQ(interleave(0x67), 0x0FF00FFF);
    EXPECT_EQ(interleave(0x89), 0xF000F00F);
    EXPECT_EQ(interleave(0xAB), 0xF0F0F0FF);
    EXPECT_EQ(interleave(0xCD), 0xFF00FF0F);
    EXPECT_EQ(interleave(0xEF), 0xFFF0FFFF);
}
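
With only 256 possible inputs, an exhaustive comparison against a bit-by-bit reference is also cheap. The sketch below assumes nothing beyond standard C; `reference` and `count_mismatches` are hypothetical names, and the body of `interleave` is repeated from the answer so that the block compiles on its own.

```c
#include <stdint.h>

/* Shift-and-mask version, repeated from the answer above. */
static uint32_t interleave(uint8_t value)
{
    uint32_t x = value;
    x = (x | (x << 12));
    x = (x | (x <<  6)) & 0x03030303;
    x = (x | (x <<  3)) & 0x11111111;
    x = (x << 4) - x;
    return x;
}

/* Straightforward reference: input bit k fills output nibble k. */
static uint32_t reference(uint8_t value)
{
    uint32_t out = 0;
    for (unsigned k = 0; k < 8; k++) {
        if (value & (1u << k)) {
            out |= (uint32_t)0xF << (4 * k);
        }
    }
    return out;
}

/* Hypothetical helper: number of inputs on which the two disagree. */
unsigned count_mismatches(void)
{
    unsigned mismatches = 0;
    for (unsigned v = 0; v < 256; v++) {
        if (interleave((uint8_t)v) != reference((uint8_t)v)) {
            mismatches++;
        }
    }
    return mismatches;
}
```

A result of zero from `count_mismatches()` means the two versions agree on every 8-bit input.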