Python mmap for JIT code

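I am trying to write a simple JIT by following this tutorial (). I am not sure whether Python's mmap interface supports the following use case. The C code (in case the link goes dead) is shown below.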

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

int main(int argc, char *argv[]) {
  // Machine code for:
  //   mov eax, 0
  //   ret
  unsigned char code[] = {0xb8, 0x00, 0x00, 0x00, 0x00, 0xc3};

  if (argc < 2) {
    fprintf(stderr, "Usage: jit1 <integer>\n");
    return 1;
  }

  // Overwrite immediate value "0" in the instruction
  // with the user's value.  This will make our code:
  //   mov eax, <user's value>
  //   ret
  int num = atoi(argv[1]);
  memcpy(&code[1], &num, 4);

  // Allocate writable/executable memory.
  // Note: real programs should not map memory both writable
  // and executable because it is a security risk.
  void *mem = mmap(NULL, sizeof(code), PROT_WRITE | PROT_EXEC,
                   MAP_ANON | MAP_PRIVATE, -1, 0);
  memcpy(mem, code, sizeof(code));

  // The function will return the user's value.
  int (*func)() = mem;
  return func();
}

My Python attempt:

code = [0xb8, 0x00, 0x00, 0x00, 0x00, 0xc3]

import mmap
import ctypes

size_in_bytes = len(code) * 4
mem = mmap.mmap(-1, size_in_bytes, prot=mmap.PROT_WRITE | mmap.PROT_EXEC, flags= mmap.MAP_ANON | mmap.MAP_PRIVATE)
# mmap.mmap.move(mem, ctypes.addressof(code), size_in_bytes)

mem.write(ctypes.addressof(code), size_in_bytes)
ftype = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_void_p)

f = ftype(mem)
f()

However, running this produces an error:

Traceback (most recent call last):
  File "main.py", line 10, in <module>
    mem.write(ctypes.addressof(code), size_in_bytes)
TypeError: invalid type

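So my question is: how do we get a writable mmap page, and how do we copy data into it so that it can be executed as JIT code? If this is not directly accessible from Python, can I do it through the Python C interface, using the underlying C implementation?

Most of the JIT interfaces I have seen use LLVM or some other underlying JIT, but I do not quite understand how PyPy does it. Any ideas?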

You need to convert the Python list into a ctypes array:

arr = (ctypes.c_int * len(code))(*code)

Then ctypes.addressof works:

>>> ctypes.addressof(arr)
47651024
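Note that with ctypes.c_int each list element occupies sizeof(c_int) bytes, so the array above is wider than the six-byte instruction sequence in the C program. The following is a minimal sketch, assuming the goal is to mirror the C unsigned char code[] byte for byte. The helper name make_code is hypothetical and introduced only for illustration.

```python
import ctypes
import struct

def make_code(value):
    """Return x86 machine code for `mov eax, value; ret` as bytes."""
    # 0xb8 is the opcode for `mov eax, imm32`; 0xc3 is `ret`.
    # "<i" packs the immediate as a little-endian signed 32-bit integer,
    # mirroring memcpy(&code[1], &num, 4) in the C program.
    return b"\xb8" + struct.pack("<i", value) + b"\xc3"

code = make_code(42)

# One ctypes element per byte, so sizeof(arr) equals len(code).
arr = (ctypes.c_ubyte * len(code)).from_buffer_copy(code)

print(ctypes.sizeof(arr))                        # size of the byte array
print(ctypes.sizeof(ctypes.c_int * len(code)))   # size of the c_int variant
```

A bytes object like code can also be passed straight to mem.write, which expects a bytes-like object and not an address plus a length.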

I got the following to "work", in the sense that the function gets called:

from __future__ import print_function
import ctypes

code = bytes(bytearray([0xb8, 0x00, 0x00, 0x00, 0x00, 0xc3]))  # bytes([...]) alone gives the list's repr on Python 2
arr = ctypes.create_string_buffer(code)

ftype = ctypes.CFUNCTYPE(ctypes.c_int)
f = ftype(ctypes.addressof(arr))
print("Ready to call f()!")
f()

Unfortunately, on my OS (FreeBSD) this causes a segmentation fault, because the data segment is not executable.

So I modified the code to incorporate mmap:

from __future__ import print_function
import ctypes
import mmap

code = bytes(bytearray([0xb8, 0x00, 0x00, 0x00, 0x00, 0xc3]))  # bytes([...]) alone gives the list's repr on Python 2
ftype = ctypes.CFUNCTYPE(ctypes.c_int)

mem = mmap.mmap(-1, len(code), prot=mmap.PROT_WRITE | mmap.PROT_EXEC,
                flags=mmap.MAP_ANON | mmap.MAP_PRIVATE)
mem.write(code)
arr = ctypes.create_string_buffer(mem)

f = ftype(ctypes.addressof(arr))
print("Ready to call f()!")
f()

But this produces a TypeError:

Traceback (most recent call last):
  File "jit.py", line 11, in <module>
    arr = ctypes.create_string_buffer(mem)
  File "/usr/local/lib/python2.7/ctypes/__init__.py", line 68, in create_string_buffer
    raise TypeError(init)
TypeError: <mmap.mmap object at 0x8007cb928>

Edit: Looking at the create_string_buffer code in ctypes/__init__.py, it only accepts str, int or long. Emulating what create_string_buffer does might help, but I do not have time to try that right now.

Edit 2: Python's mmap() seems to ignore the PROT_EXEC flag. That would explain the segfaults.
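A further possibility, independent of whether PROT_EXEC is honoured, is that create_string_buffer copies its argument into a new buffer, so ctypes.addressof(arr) would point at that copy and not at the mapped page. The sketch below is one way to take the address of the mapping itself, using ctypes' from_buffer. It assumes Python 3 on x86 Linux, where an anonymous mapping may be both writable and executable. The helper name build_jit_function is hypothetical, and other platforms may refuse such a mapping.

```python
import ctypes
import mmap
import platform
import struct
import sys

def build_jit_function(value):
    """Map `mov eax, value; ret` into executable memory and wrap it."""
    code = b"\xb8" + struct.pack("<i", value) + b"\xc3"
    # Anonymous private mapping, one page, readable/writable/executable.
    mem = mmap.mmap(-1, mmap.PAGESIZE,
                    flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
                    prot=mmap.PROT_READ | mmap.PROT_WRITE | mmap.PROT_EXEC)
    mem.write(code)
    # from_buffer shares the mapping's memory instead of copying it,
    # so addressof() yields an address inside the executable page.
    view = (ctypes.c_char * len(code)).from_buffer(mem)
    func = ctypes.CFUNCTYPE(ctypes.c_int)(ctypes.addressof(view))
    # Return mem and view too, so the caller can keep the mapping alive.
    return func, mem, view

# Only run the generated code where the x86 encoding is meaningful.
if sys.platform.startswith("linux") and platform.machine() in ("x86_64", "i686"):
    func, mem, view = build_jit_function(42)
    print(func())
```

Keeping mem and view alive matters here: func only stores a raw address, so once the mapping is released a call would jump into unmapped memory.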

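Thanks, Rolf. After adding your code I tried this: mmap.mmap.move(mem, ctypes.addressof(arr), size_in_bytes), and got this error:

Traceback (most recent call last):
  File "main.py", line 9, in <module>
    mmap.mmap.move(mem, ctypes.addressof(arr), size_in_bytes)
TypeError: move() takes exactly 3 arguments (2 given)

@ssarangi, I ran into that too. This works: mem.write(arr). But I am not sure whether that is what you wanted...

mem.write(arr) works. My next problem is assigning the mem object to the function pointer (ftype), as the C code does. Also, how does mem.write figure out how many bytes to write?

I tried both approaches and see the same problem on OS X. I have looked at how all three JITs do it. PyPy is one of them; I have not dug deep into the code, but I cannot work out how they actually implement the JIT. Numba essentially uses the LLVM JIT internally (the llvmlite project), which is what I am trying to avoid. Pyston implements everything in C++, and the approach I have seen there is the one I know from my own work with LLVM. What I want is a simple way to write a JIT directly with Python's mmap, without building a C wrapper for it.

@ssarangi I found out why create_string_buffer fails; see the edited answer. Emulating it might help.

I tried that: mem.seek(0); arr = ctypes.create_string_buffer(mem.read(len(code))). So after mem.write(code) we seek back to the start of mem, read len(code) bytes and pass them on. However, I still get the same Bus error 10 on OS X, much like the previous example, where you said the data segment is not executable. So is it illegal to get executable memory back from Python's mmap?

@ssarangi Looking at the mmap documentation for 3.4.3, PROT_EXEC is not mentioned at all. It is also not used in the ... for mmap. I have a sneaking suspicion that it ignores PROT_EXEC. That would certainly explain the errors.

That is my gut feeling too :). For now I have worked around this with the Python C API, simply sending the byte stream to the C API. That seems to work, but it opens a whole new can of worms :). For example, the type system (function signatures). Now it is ...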

mmap

文档时，根本没有提到PROT_EXEC。在for mmap中也没有使用它。我有一种隐秘的感觉，那就是它忽略了PROT_EXEC。这当然可以解释这些错误。这也是我的直觉：）。目前，我使用pythoncapi解决了这个问题，只是将字节流发送到cpapi。这似乎有效，但它打开了一个全新的蠕虫罐：）。例如，类型系统（函数签名）。现在是