Is there a standard way to reconstruct lowered struct function arguments?

c clang llvm

I have a structure type:

typedef struct boundptr {
  uint8_t *ptr;
  size_t size;
} boundptr;

I want to catch all function arguments of that type. For example, in this function:

boundptr sample_function_stub(boundptr lp, boundptr lp2);

On my 64-bit machine, Clang translates that signature to:

define { i8*, i64 } @sample_function_stub(i8* %lp.coerce0, i64 %lp.coerce1, i8* %lp2.coerce0, i64 %lp2.coerce1) #0 {

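Question: is there a better way to reconstruct such arguments?

Is it possible to forbid this kind of argument lowering while keeping the same ABI for external calls?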

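Some more context: in LLVM IR, presumably following the platform ABI, the compiler breaks the structure down into separate fields (which is not the worst case). Incidentally, it reconstructs the original two parameters lp and lp2 later in the function body.

For my analysis, I want to recover those two parameters, lp and lp2, in full from the four lowered ones (lp.coerce0, lp.coerce1, lp2.coerce0 and lp2.coerce1). In this case I could probably rely on the names (.coerce0 means the first field, .coerce1 the second).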

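I do not like this approach:

I am not sure that Clang will keep this naming convention in later versions.
It certainly depends on the ABI, so the breakdown may be different on another platform.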
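
To make the name-based idea concrete, here is a minimal sketch that groups lowered argument names by their .coerceN suffix. It works on plain strings rather than on the LLVM API, and the helper names parseCoerceName and groupLoweredArgs are made up for illustration. It inherits both weaknesses listed above: it assumes Clang keeps this naming scheme and that the ABI splits the struct field by field.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <string>
#include <vector>

// Splits a name such as "lp.coerce0" into base "lp" and index 0.
// Returns false when the name does not end in ".coerce<digits>".
bool parseCoerceName(const std::string &name, std::string &base,
                     unsigned &index) {
  const std::string marker = ".coerce";
  const std::size_t pos = name.rfind(marker);
  if (pos == std::string::npos || pos == 0)
    return false;
  const std::string digits = name.substr(pos + marker.size());
  if (digits.empty())
    return false;
  for (std::size_t i = 0; i < digits.size(); ++i)
    if (!std::isdigit(static_cast<unsigned char>(digits[i])))
      return false;
  base = name.substr(0, pos);
  index = static_cast<unsigned>(std::strtoul(digits.c_str(), nullptr, 10));
  return true;
}

// Groups lowered argument names by the source-level parameter they appear
// to come from: parameter name -> (field index -> lowered argument name).
// Names without a ".coerceN" suffix are ignored.
std::map<std::string, std::map<unsigned, std::string> >
groupLoweredArgs(const std::vector<std::string> &argNames) {
  std::map<std::string, std::map<unsigned, std::string> > groups;
  for (std::size_t i = 0; i < argNames.size(); ++i) {
    std::string base;
    unsigned index = 0;
    if (parseCoerceName(argNames[i], base, index))
      groups[base][index] = argNames[i];
  }
  return groups;
}
```

Feeding it the four argument names from sample_function_stub yields two groups, lp and lp2, each with fields 0 and 1.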

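On the other hand, I cannot rely on the reconstruction code at the beginning of the function, because I might confuse it with user code that initializes a local variable.

I use Clang 3.4.2, based on LLVM, targeting x86_64-pc-linux-gnu.

Here is an example showing how badly Clang can mangle function arguments.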

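I assume you are not compiling at O0. When you do not optimize the code, Clang reassembles the original type. Clang breaks your structures down in order to pass them to the called function in registers (at least on x86). As you said, this depends on the ABI in use.

Here is a dummy example based on your use case: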

#include <cstddef>

typedef struct boundptr {
  void *ptr;
  size_t size;
} boundptr;

boundptr foo(boundptr ptr1, boundptr ptr2) { return {ptr1.ptr, ptr2.size}; }

int main() {
  boundptr p1, p2;
  boundptr p3 = foo(p1, p2);
  return 0;
}

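In the unoptimized IR for foo, listed at the end of this page, boundptr is reconstructed on the callee's stack (this also depends on the calling convention in use).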

Now, to find out which of the boundptr values are your parameters, you can do the following:

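Visit each alloca instruction in your pass and follow its users.
Follow the casts of the alloca, as well as the GEP instructions, to find the store instructions that write to the boundptr.
Check the values being stored. If they are your function arguments and match in type and name, you have found a reassembled boundptr.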

Of course, you can also do it the other way around, starting from the function arguments.
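
The steps above can be sketched on a toy model of the IR. This is not the LLVM API: Op, Inst, Origin and findReassembledArgs are hypothetical names. Instead of walking use lists, the sketch makes one forward pass over a straight-line function body and remembers which alloca each bitcast or GEP result was derived from. It skips the type and name check from the last step and only tests whether the stored value is a function argument.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for LLVM IR instructions; not the real LLVM API.
enum class Op { Alloca, Bitcast, GEP, Store };

struct Inst {
  Op op;
  std::string result;   // name defined by the instruction ("" for Store)
  std::string pointer;  // Bitcast/GEP: source pointer; Store: destination
  std::string value;    // Store only: the value being stored
  unsigned field;       // GEP only: struct field index
};

struct Origin {
  std::string root;  // the alloca this pointer is derived from
  int field;         // field selected by a GEP, or -1 if none yet
};

// Returns, for each alloca, the function arguments stored into its fields:
// alloca name -> (field index -> argument name).
std::map<std::string, std::map<unsigned, std::string> >
findReassembledArgs(const std::vector<Inst> &body,
                    const std::set<std::string> &args) {
  std::map<std::string, Origin> origin;
  std::map<std::string, std::map<unsigned, std::string> > found;
  for (const Inst &inst : body) {
    switch (inst.op) {
    case Op::Alloca:
      origin[inst.result] = Origin{inst.result, -1};
      break;
    case Op::Bitcast:
    case Op::GEP: {
      auto it = origin.find(inst.pointer);
      if (it == origin.end())
        break;  // not derived from a tracked alloca
      Origin derived = it->second;
      if (inst.op == Op::GEP)
        derived.field = static_cast<int>(inst.field);
      origin[inst.result] = derived;
      break;
    }
    case Op::Store: {
      auto it = origin.find(inst.pointer);
      // Only record stores of function arguments into a known field.
      if (it != origin.end() && it->second.field >= 0 &&
          args.count(inst.value) != 0)
        found[it->second.root][static_cast<unsigned>(it->second.field)] =
            inst.value;
      break;
    }
    }
  }
  return found;
}
```

A real pass would work on llvm::Function and its instructions and would also have to handle control flow; the point here is only the shape of the alloca, cast, GEP, store chain.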

Is this future-proof? No, definitely not. Clang/LLVM is not designed to maintain backward compatibility. For compatibility, the ABI is what matters.

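Drawback: you have to run very early in the optimizer, right after code generation. Even a low optimization level removes these stack allocations of boundptr. So you have to modify your clang to run your pass as part of its optimization pipeline, and you cannot make it a standalone pass (e.g. for use with opt).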

Better solution:

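Since clang has to be modified in some way anyway, you can add metadata that identifies parameters of type boundptr. That lets you "pack" the fragments of a boundptr together and identify them as one boundptr. This would survive the optimizer.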
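
As a rough illustration of the metadata idea, assume a modified Clang attaches a (parameter index, field index) tag to every lowered fragment. The tag layout and the names FragmentTag, LoweredArg and regroupByTag are assumptions of this sketch, not an existing Clang or LLVM feature. With such tags, regrouping no longer depends on argument names:

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical per-argument tag that a modified Clang could attach.
struct FragmentTag {
  unsigned param;  // index of the source-level parameter
  unsigned field;  // index of the struct field this fragment carries
};

struct LoweredArg {
  std::string name;  // IR name; not used for grouping
  bool tagged;       // false for arguments that are not boundptr fragments
  FragmentTag tag;   // meaningful only when tagged is true
};

// Rebuilds the source-level parameters from the tags:
// parameter index -> fragment names ordered by field index.
std::map<unsigned, std::vector<std::string> >
regroupByTag(const std::vector<LoweredArg> &args) {
  // First sort the fragments of each parameter by field index.
  std::map<unsigned, std::map<unsigned, std::string> > byField;
  for (std::size_t i = 0; i < args.size(); ++i)
    if (args[i].tagged)
      byField[args[i].tag.param][args[i].tag.field] = args[i].name;

  // Then flatten each parameter's fields into an ordered list.
  std::map<unsigned, std::vector<std::string> > out;
  for (const auto &param : byField)
    for (const auto &field : param.second)
      out[param.first].push_back(field.second);
  return out;
}
```

Because the grouping uses only the tags, it still works if the arguments are renamed or reordered, as long as the tags themselves are preserved.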

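What is your actual problem?

@Olaf, I need to dump the values stored at this address during LLVM IR interpretation, and I need to know the size of the value when dealing with "void* polymorphism".

What do you mean by "void* polymorphism"? Sorry, but it is still unclear why you need this. This looks a bit like an XY problem. By the way, IR does not preserve all information about compound data types. Why should it, given that it is essentially a more abstract assembly language?

I mean: I have a family of functions that operate on pointers to unspecified types. I interpret LLVM IR and need to dump the values (memory blocks) they are passed. IR does preserve the data type definition, which in this case is: %struct.boundptr = type { i8*, i64 }

... One way to know the size is to systematically pass it along with every pointer. That is what the boundptr struct is for, but Clang now breaks it apart. I want to solve that problem.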
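
For reference, the size-carrying pointer idea described in the last comment might look like this on the C++ side. dumpHex is a hypothetical helper, not part of the original code; it renders the bytes covered by a boundptr as hex and assumes ptr points to at least size readable bytes.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <string>

typedef struct boundptr {
  uint8_t *ptr;
  size_t size;
} boundptr;

// Renders the bytes covered by a boundptr as lowercase hex digits.
// Assumes p.ptr points to at least p.size readable bytes.
std::string dumpHex(boundptr p) {
  static const char digits[] = "0123456789abcdef";
  std::string out;
  out.reserve(p.size * 2);
  for (size_t i = 0; i < p.size; ++i) {
    out.push_back(digits[p.ptr[i] >> 4]);    // high nibble
    out.push_back(digits[p.ptr[i] & 0x0f]);  // low nibble
  }
  return out;
}
```

Since the size travels with the pointer, the dump needs no knowledge of the pointee type.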

define { i8*, i64 } @_Z3foo8boundptrS_(i8* %ptr1.coerce0, i64 %ptr1.coerce1, i8* %ptr2.coerce0, i64 %ptr2.coerce1) #0 {
  %1 = alloca %struct.boundptr, align 8
  %ptr1 = alloca %struct.boundptr, align 8
  %ptr2 = alloca %struct.boundptr, align 8
  %2 = bitcast %struct.boundptr* %ptr1 to { i8*, i64 }*
  %3 = getelementptr { i8*, i64 }, { i8*, i64 }* %2, i32 0, i32 0
  store i8* %ptr1.coerce0, i8** %3
  %4 = getelementptr { i8*, i64 }, { i8*, i64 }* %2, i32 0, i32 1
  store i64 %ptr1.coerce1, i64* %4
  %5 = bitcast %struct.boundptr* %ptr2 to { i8*, i64 }*
  %6 = getelementptr { i8*, i64 }, { i8*, i64 }* %5, i32 0, i32 0
  store i8* %ptr2.coerce0, i8** %6
  %7 = getelementptr { i8*, i64 }, { i8*, i64 }* %5, i32 0, i32 1
  store i64 %ptr2.coerce1, i64* %7
  %8 = getelementptr inbounds %struct.boundptr, %struct.boundptr* %1, i32 0, i32 0
  %9 = getelementptr inbounds %struct.boundptr, %struct.boundptr* %ptr1, i32 0, i32 0
  %10 = load i8*, i8** %9, align 8
  store i8* %10, i8** %8, align 8
  %11 = getelementptr inbounds %struct.boundptr, %struct.boundptr* %1, i32 0, i32 1
  %12 = getelementptr inbounds %struct.boundptr, %struct.boundptr* %ptr2, i32 0, i32 1
  %13 = load i64, i64* %12, align 8
  store i64 %13, i64* %11, align 8
  %14 = bitcast %struct.boundptr* %1 to { i8*, i64 }*
  %15 = load { i8*, i64 }, { i8*, i64 }* %14, align 8
  ret { i8*, i64 } %15
}