Is there a faster way in Python to convert an arbitrarily large integer to a big-endian sequence of bytes?

python optimization

I have the following Python code to do this:

from struct import pack as _pack

def packl(lnum, pad = 1):
    if lnum < 0:
        # Python has no built-in RangeError; ValueError is the closest fit.
        raise ValueError("Cannot use packl to convert a negative integer "
                         "to a string.")
    count = 0
    l = []
    while lnum > 0:
        l.append(lnum & 0xffffffffffffffffL)
        count += 1
        lnum >>= 64
    if count <= 0:
        return '\0' * pad
    elif pad >= 8:
        lens = 8 * count % pad
        pad = ((lens != 0) and (pad - lens)) or 0
        l.append('>' + 'x' * pad + 'Q' * count)
        l.reverse()
        return _pack(*l)
    else:
        l.append('>' + 'Q' * count)
        l.reverse()
        s = _pack(*l).lstrip('\0')
        lens = len(s)
        if (lens % pad) != 0:
            return '\0' * (pad - lens % pad) + s
        else:
            return s

In Python 3.2, the `int` class has `to_bytes` and `from_bytes` methods, which can do this much faster than the method given above.
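As a minimal sketch of the `int.to_bytes` route (Python 3.2 or later), the question's function could be approximated as below. The name `packl_to_bytes` is hypothetical, and the padding rule (left-pad with zero bytes up to a multiple of `pad`) is an assumption based on the question's code and the comments about it.

```python
def packl_to_bytes(lnum, pad=1):
    """Big-endian bytes of a non-negative int, left-padded to a multiple of pad."""
    if lnum < 0:
        raise ValueError("cannot convert a negative integer")
    # Minimal number of bytes; at least one, so that 0 becomes a single zero byte.
    length = max(1, (lnum.bit_length() + 7) // 8)
    # Round up to the next multiple of pad (assumed padding rule).
    length += -length % pad
    return lnum.to_bytes(length, "big")
```

For example, `packl_to_bytes(0x1234, pad=4)` yields four bytes, the first two of which are zero padding.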

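I think you really should just use numpy; I'm sure it has something built in for this. Using the `array` module might also be faster. But I'll take a stab at it anyway.

IMX, creating a generator and using a list comprehension and/or the built-in sum is faster than a loop that appends to a list, because the appending can be done internally. Oh, and `lstrip` on a large string has got to be costly.

Also, some style points: special cases aren't special enough; and you appear not to have gotten the memo about the new `x if y else z` construct. :) Although we don't need it here anyway.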

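from struct import pack as _pack

Q_size = 64
Q_bitmask = (1 << Q_size) - 1

def quads_gen(a_long):
    # Yield 64-bit chunks of a_long, least significant chunk first.
    while a_long:
        yield a_long & Q_bitmask
        a_long >>= Q_size

def pack_long_big_endian(a_long, pad=1):
    if a_long < 0:
        raise ValueError("Cannot use packl to convert a negative integer "
                         "to a string.")
    # A generator cannot be passed to reversed(), so build the list first.
    # The [0] fallback avoids an IndexError when a_long == 0.
    qs = list(quads_gen(a_long))[::-1] or [0]
    # Pack the first one separately so we can lstrip nicely.
    first = _pack('>Q', qs[0]).lstrip(b'\x00')
    rest = _pack('>%dQ' % (len(qs) - 1), *qs[1:])
    count = len(first) + len(rest)
    # A little math trick that depends on Python's behaviour of modulus
    # for negative numbers - but it is well-defined and documented.
    return b'\x00' * (-count % pad) + first + rest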
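The padding trick in the last line works because Python's `%` returns a result with the sign of the divisor, so `-count % pad` is the number of bytes still needed to reach the next multiple of `pad`. A small illustration (the helper name `pad_needed` is hypothetical and only there to make the expression testable):

```python
def pad_needed(count, pad):
    # For pad > 0 the result is always in range(pad): the distance from
    # count up to the next multiple of pad (0 if count is already aligned).
    return -count % pad

for count in (0, 1, 7, 8, 9):
    print(count, pad_needed(count, 8))
```

In languages where the remainder takes the sign of the dividend (C, for instance), the same expression would give a non-positive result, which is why the comment in the answer calls this out.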

Here is a solution that calls the Python/C API through `ctypes`. Currently it uses NumPy, but if NumPy is not an option, it could be done with `ctypes` alone.

import numpy
import ctypes
PyLong_AsByteArray = ctypes.pythonapi._PyLong_AsByteArray
PyLong_AsByteArray.argtypes = [ctypes.py_object,
                               numpy.ctypeslib.ndpointer(numpy.uint8),
                               ctypes.c_size_t,
                               ctypes.c_int,
                               ctypes.c_int]

def packl_ctypes_numpy(lnum):
    a = numpy.zeros(lnum.bit_length()//8 + 1, dtype=numpy.uint8)
    PyLong_AsByteArray(lnum, a, a.size, 0, 1)
    return a

On my machine, this is 15 times faster than your approach.

Edit: Here is the same code using `ctypes` only and returning a string instead of a NumPy array:

import ctypes
PyLong_AsByteArray = ctypes.pythonapi._PyLong_AsByteArray
PyLong_AsByteArray.argtypes = [ctypes.py_object,
                               ctypes.c_char_p,
                               ctypes.c_size_t,
                               ctypes.c_int,
                               ctypes.c_int]

def packl_ctypes(lnum):
    a = ctypes.create_string_buffer(lnum.bit_length()//8 + 1)
    PyLong_AsByteArray(lnum, a, len(a), 0, 1)
    return a.raw

This is another two times faster, for a total speed-up factor of 30 on my machine.

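Just a follow-up to Sven's answer (which works great). The opposite operation, going from an arbitrarily long bytes object to a Python integer object, requires the `unpack_bytes` function listed at the end of this page (since there is no PyLong_FromByteArray() C API function that I can find).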

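For completeness and for future readers of this question:

Starting with Python 3.2, there are the functions `int.from_bytes()` and `int.to_bytes()`, which convert between `bytes` and `int` objects in a choice of byte orders.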
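A short round-trip sketch with these two methods, shown for both byte orders. The value and the 8-byte length are arbitrary choices for illustration.

```python
n = 0x0102030405

big = n.to_bytes(8, byteorder="big")
little = n.to_bytes(8, byteorder="little")

# The two encodings are byte-wise reverses of each other.
assert big == little[::-1]

# from_bytes must be told the same byte order that produced the bytes.
assert int.from_bytes(big, byteorder="big") == n
assert int.from_bytes(little, byteorder="little") == n
```

Note that `to_bytes` raises `OverflowError` if the requested length is too small for the value, so the length has to be computed (for example from `bit_length()`) when the integer is arbitrarily large.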

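What is `pad` for? A docstring would make the usage easier to understand.

@Scott As far as I can tell, the output is zero-padded at the front to the next multiple of `pad` bytes.

Even for a locally used variable, you should avoid names like `l`, which looks too much like `1` in most fonts to stay readable.

@Karl Knechtel - exactly right. I want to use it in cases where I'm dumping the result into 64-bit or 128-bit or similar-length slots.

@Scott Griffiths - True, it does need a docstring.

But won't this use the system's native endianness?

@Karl: No, it won't. The fourth parameter of PyLong_AsByteArray() indicates which endianness to use: 0 means big-endian, anything else means little-endian.

Awesome. Now I wish this were exposed directly... :/

Does this API change much between different versions of Python?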

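`int(binascii.hexlify(stringbytes), 16)` is faster than `ctypes.pythonapi._PyLong_FromByteArray`. Who would have thought?

I shouldn't have upvoted you. Your code has a lot of errors. In fact, there is a `_PyLong_FromByteArray` function (at least in Python 2.7 and Python 3), and I am using it. But your method is probably very fast as well.

In fact, this is faster than calling `_PyLong_FromByteArray` through ctypes. That's weird. Even better, I don't have to check whether the input is a `memoryview`, because hexlify handles those, and in Python 2.7 I don't have to convert to `int` so that a plain `int` is used when the value is small enough not to need a long. Also, using `hex(lnum)` and `binascii.unhexlify` (plus a little extra glue) is faster than the ctypes option too. Weird.

I looked at the Python 3.1.x C API reference and couldn't find PyLong_AsByteArray().

Thanks! I wonder whether the endian flag will slow things down. We'll see.

Even with the endian flag, it still takes 1/3 (or less) of the time of the fastest method I had found so far.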
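The "little extra glue" for the hexadecimal route is not shown in the comments. One possible sketch for Python 3 follows; it uses `'%x'` formatting rather than `hex()` so there is no `0x` prefix to strip, which is a substitution on my part. The function names `packl_hex` and `unpackl_hex` are hypothetical.

```python
import binascii

def packl_hex(lnum):
    """Big-endian bytes of a non-negative int via its hexadecimal text."""
    if lnum < 0:
        raise ValueError("cannot convert a negative integer")
    digits = "%x" % lnum
    # unhexlify needs an even number of hex digits.
    if len(digits) % 2:
        digits = "0" + digits
    return binascii.unhexlify(digits.encode("ascii"))

def unpackl_hex(data):
    """Inverse direction: non-empty big-endian bytes back to an int."""
    return int(binascii.hexlify(data), 16)
```

Whether this beats the ctypes version on a given interpreter is something to measure rather than assume; the comments above only report one commenter's experience.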

import binascii

def unpack_bytes(stringbytes):
    #binascii.hexlify will be obsolete in python3 soon
    #They will add a .tohex() method to bytes class
    #Issue 3532 bugs.python.org
    return int(binascii.hexlify(stringbytes), 16)