Python 2's .decode("hex")


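I am having some trouble understanding Python 2's foo.decode("hex") command. The code at the end of this post is what I have in Python 2.7.12 (where words_alpha.txt is a 4 MB dictionary).

In Python 3, foo.decode("hex") is no longer supported. But replacing

hexes = [x.decode("hex") for x in hexes]

with

hexes = [binascii.unhexlify(x).decode() for x in hexes]

gives

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe8 in position 3: invalid continuation byte

while hexes = [binascii.unhexlify(x).decode("utf-8", "ignore") for x in hexes] (or "replace", "backslashreplace", etc.) works fine. So what does foo.decode("hex") do that binascii.unhexlify(foo).decode() does not do by default?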
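I think the issue lies in .decode("utf-8", "ignore"): by passing the "ignore" argument you are simply ignoring the problem that raises the UnicodeDecodeError exception in the first case.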
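
To make the effect of the error handlers concrete, here is a minimal Python 3 sketch. The byte string is a made-up sample (not taken from 4.txt), chosen so that 0xe8 sits at position 3 and is followed by a byte that is not a valid UTF-8 continuation byte, which mirrors the error in the question.

```python
# Hypothetical sample input: b'abc\xe8d'. 0xe8 starts a 3-byte UTF-8 sequence,
# but the next byte (0x64, 'd') is not a continuation byte.
raw = bytes.fromhex("616263e864")


def try_decode(data, errors):
    """Return the decoded text, or the exception class name if decoding fails."""
    try:
        return data.decode("utf-8", errors)
    except UnicodeDecodeError as exc:
        return type(exc).__name__


results = {
    mode: try_decode(raw, mode)
    for mode in ("strict", "ignore", "replace", "backslashreplace")
}

for mode, value in results.items():
    print(mode, repr(value))
```

With "ignore" the offending byte is silently dropped, so the decoded text no longer corresponds byte for byte to the input. That matters for code that XORs the data afterwards.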
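The difference between binascii.hexlify and codecs.decode:

binascii.hexlify
Hexadecimal representation of binary data.
The return value is a bytes object.
Type: builtin_function_or_method

codecs.decode
decode(obj, encoding='utf-8', errors='strict')
Decode obj using the codec registered for encoding. errors may be given to set the desired error handling scheme. The default error handler is 'strict', meaning that decoding errors raise ValueError (or a more codec-specific subclass, such as UnicodeDecodeError). Refer to Codec Base Classes for more information on codec error handling.
Type: builtin_function_or_method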
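
As a rough illustration of how these functions relate in Python 3, the sketch below converts a made-up hex string (an assumption, not data from the question) to raw bytes in three ways. None of them performs any text decoding, which is also what Python 2's foo.decode("hex") did: it returned a byte string, not Unicode text.

```python
import binascii
import codecs

hex_text = "48656c6c6f"  # hypothetical sample: the hex digits of b"Hello"

# Three routes from hex digits to raw bytes; no text codec is involved.
via_unhexlify = binascii.unhexlify(hex_text)
via_fromhex = bytes.fromhex(hex_text)
via_codec = codecs.decode(hex_text.encode("ascii"), "hex")

# hexlify is the inverse direction: raw bytes back to hex digits (as bytes).
round_trip = binascii.hexlify(via_unhexlify)

# bytes.decode() only accepts text encodings in Python 3, so "hex" is rejected.
try:
    b"48656c6c6f".decode("hex")
    hex_is_text_encoding = True
except LookupError:
    hex_is_text_encoding = False

print(via_unhexlify, via_fromhex, via_codec, round_trip, hex_is_text_encoding)
```

The extra .decode() in the Python 3 attempt is therefore a second, separate step: it asks for the raw bytes to be interpreted as UTF-8 text, which Python 2 never did here.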

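How is that a problem? It is not a problem so much as the basis of the difference. By default, any conversion to Unicode is strict. [binascii.unhexlify(x).decode() for x in hexes] tries to interpret the raw bytes as UTF-8, a multibyte encoding in which not every byte sequence is valid; that is why it raises an error.
[binascii.unhexlify(x).decode("utf-8") for x in hexes] gives the same error...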
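
If text (rather than bytes) is really needed in Python 3, one option is to decode with a single-byte encoding such as Latin-1, which maps every byte value to the code point with the same number. This is a sketch of that idea, not something proposed in the thread; it keeps ord() working the way it did on Python 2 byte strings.

```python
# Every possible byte value, 0x00 through 0xff.
all_bytes = bytes(range(256))

# Latin-1 maps each byte to the code point of the same value, so it never fails.
as_text = all_bytes.decode("latin-1")
code_points = [ord(ch) for ch in as_text]
lossless = as_text.encode("latin-1") == all_bytes

# UTF-8 rejects the same data: 0x80 cannot start a UTF-8 sequence.
try:
    all_bytes.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

print(len(as_text), lossless, utf8_ok)
```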

The Python 2.7.12 code from the question:

words = open("words_alpha.txt").read().split('\n')
def xor(x, y):
    if len(x) == len(y):
        return "".join([chr(ord(x[i]) ^ ord(y[i])) for i in range(len(x))])

def single_char_xors(msg):
    for i in range(128):
        yield [chr(i), xor(msg, chr(i)*len(msg))]


def real_word_count(S): # Assumes there is at least one three-letter word in the string S.
    count = 0
    # Count space-separated alphabetic tokens of length >= 3 that appear in the dictionary.
    for word in filter(lambda s: s.isalpha() and len(s) >= 3, S.split(' ')):
        if word.lower() in words:
            count += 1
    return count

hexes = open("4.txt").read().split('\n')
hexes = [x.decode("hex") for x in hexes]
answer = []
maxwc = 0
for x in hexes:
    for y in single_char_xors(x):
        if real_word_count(y[1]) > maxwc:
            answer = [x] + y
            maxwc = real_word_count(y[1])

print answer[0] + " xor " + answer[1] + " is " + answer[2]
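
For comparison, here is one possible Python 3 port that stays in bytes throughout, so no UTF-8 decoding (and no error handler) is needed. It is a sketch under stated assumptions: the small word set and the hex lines are hypothetical stand-ins for words_alpha.txt and 4.txt, and the helper names xor_with_byte and best_guess are invented for this example.

```python
def xor_with_byte(data, key):
    """XOR every byte of `data` with the single byte value `key`."""
    return bytes(b ^ key for b in data)


def single_char_xors(data):
    """Yield (key, candidate plaintext) for each 7-bit key, as in the original."""
    for key in range(128):
        yield key, xor_with_byte(data, key)


def real_word_count(candidate, words):
    """Count space-separated ASCII-alphabetic tokens (length >= 3) found in `words`."""
    count = 0
    for token in candidate.split(b" "):
        # bytes.isalpha() is ASCII-only, so the decode below cannot fail.
        if token.isalpha() and len(token) >= 3:
            if token.lower().decode("ascii") in words:
                count += 1
    return count


def best_guess(hex_lines, words):
    """Return (score, raw bytes, key, plaintext) for the best-scoring candidate."""
    best = (0, None, None, None)
    for line in hex_lines:
        if not line:
            continue  # skip the empty string left by a trailing newline
        raw = bytes.fromhex(line)
        for key, candidate in single_char_xors(raw):
            score = real_word_count(candidate, words)
            if score > best[0]:
                best = (score, raw, key, candidate)
    return best


# Hypothetical stand-ins for words_alpha.txt and 4.txt.
words = {"the", "party", "now", "that", "jumping"}
secret = xor_with_byte(b"now that the party is jumping", 0x35)
hex_lines = [bytes(range(200, 229)).hex(), secret.hex(), ""]

score, raw, key, plaintext = best_guess(hex_lines, words)
print(raw.hex(), "xor", chr(key), "is", plaintext)
```

Keeping the dictionary in a set rather than a list makes each membership test constant time on average, which matters with a 4 MB word list.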