GCC extended asm: returning a result (GCC, inline assembly)


How can I write the result to the target of a pointer instead of returning a value from the function, as in:

/*no return function*/
static inline void mac(int64_t *output, int32_t a, int32_t b){
    __asm__ ("madd %0,%0,%1,%2"       
             //madd E[c],E[d],D[a],D[b] -> E[c]=E[d]+(D[a]*D[b])

            /*OUTPUTS*/
            : "+m" (*output)          
            /*------------------------------------------------------------------
            m as "output" is a pointer to some memory?
            what constraint needs to be here? 
            extended register pair E[c],E[d] is needed in the madd instruction 
            -> c+=a*b is translated to madd %%e6,%%e6,%dX,%dY with X,Y some data
            registers....
            ------------------------------------------------------------------*/

            /*INPUTS*/
            : "r" (a), "r" (b)
            );
}

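I could not find this in the online documentation.

There is also a problem when I try to debug: the function does not seem to be compiled at all (I cannot step into it). Perhaps the compiler thinks the function does nothing and simply skips it?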

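Since this writes to memory, it should work. I'm confused, though, because you only mention registers. In that case, use register output operands and store them to memory with plain C after the asm block.

Your code looks correct. Note that gcc already knows how to optimize *c += a * (uint64_t)b into madd.u, and I assume *c += a * (int64_t)b into the signed madd. Why do you still want inline asm for this? Is there a missed optimization in your real code?

Re: debugging: if the result is unused, the whole thing gets optimized away. You can write volatile int64_t sink = tmp; or print the result to stop the compiler from removing it. (Don't use asm volatile; you want the compiler to be able to optimize in the general case.)

This is only an example. I actually want to use madd.q, but somehow I could not get the compiler to emit that special instruction, so I used madd as a stand-in. Thanks for the debugging tip!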
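The two suggestions from the comments (write the multiply-accumulate in plain C, and keep the result alive with a volatile object while debugging) can be sketched roughly as follows. This is only a minimal illustration: the helper name mac_debug is hypothetical, and whether the compiler actually emits madd for this code depends on the target and optimization level, which is not verified here.

```c
#include <stdint.h>

/* Multiply-accumulate in plain C: *acc += a * b.
   Casting one operand to int64_t before the multiplication keeps the
   full 64-bit product instead of a truncated 32-bit one. */
static inline void mac(int64_t *acc, int32_t a, int32_t b)
{
    *acc += (int64_t)a * b;
}

/* Hypothetical debugging helper (not from the original question).
   Copying the result into a volatile object forces the compiler to
   perform the store, so the computation cannot be discarded as unused. */
int64_t mac_debug(int64_t start, int32_t a, int32_t b)
{
    int64_t tmp = start;
    mac(&tmp, a, b);
    volatile int64_t sink = tmp;  /* keeps the result observable */
    return sink;
}
```

Because the accumulator is passed by pointer, the function needs no return value, which matches the shape asked for in the question. The volatile copy is meant for debug builds only; in normal code the compiler should stay free to optimize.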