How does Python treat unicode and non-unicode tuples as equal?


python python-2.7 unicode


I am using Python 2.7.11.

I have two tuples:

>>> t1 = (u'aaa', u'bbb')
>>> t2 = ('aaa', 'bbb')

I tried this:

>>> t1==t2
True

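How does Python treat unicode and non-unicode as equal?

Python 2 considers a bytestring and a unicode string equal when the bytestring can be implicitly decoded to the same text. By the way, this has nothing to do with the containing tuple. Instead, it has to do with implicit type conversion, which I'll explain below.

It is hard to demonstrate with "simple" ascii code points, so to see what is really going on under the hood, we can provoke a failure by using a higher code point: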

>>> bites = u'Ç'.encode('utf-8')
>>> unikode = u'Ç'
>>> print bites
Ç
>>> print unikode
Ç
>>> bites == unikode
/Users/wim/Library/Python/2.7/bin/ipython:1: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  #!/usr/bin/python
False

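Upon seeing the unicode and bytes comparison above, Python implicitly tried to decode the bytestring into a unicode object, on the assumption that the bytes were encoded with sys.getdefaultencoding() (which is 'ascii' on my platform).

In the case I just showed, this failed because the bytes were actually encoded in 'utf-8'. Now, let's make it "work":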

>>> bites = u'Ç'.encode('ISO8859-1')
>>> unikode = u'Ç'
>>> import sys
>>> reload(sys)   # please don't ever actually use this hack, guys
<module 'sys' (built-in)>
>>> sys.setdefaultencoding('ISO8859-1')
>>> bites == unikode
True

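Your comparison "works" in basically the same way, but using the 'ascii' codec. This implicit conversion between bytes and unicode is actually quite harmful and causes a lot of bugs, which is why Python 3 stopped doing it: "explicit is better than implicit".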
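As a rough illustration of the rule described above, here is a minimal sketch that imitates the Python 2 comparison in plain code. The helper py2_style_equal and its default_encoding parameter are hypothetical names introduced only for this example; the real logic lives inside the interpreter, and this sketch is written so that it also runs on Python 3.

```python
def py2_style_equal(bytestring, text, default_encoding="ascii"):
    """Hypothetical helper imitating Python 2's bytes-vs-unicode comparison."""
    try:
        # Python 2 implicitly decodes the bytestring with the default encoding.
        decoded = bytestring.decode(default_encoding)
    except UnicodeDecodeError:
        # Python 2 emits a UnicodeWarning here and treats the operands as unequal.
        return False
    return decoded == text


# u'\xc7' is the character Ç used in the examples above.
ascii_match = py2_style_equal(b"aaa", u"aaa")
utf8_mismatch = py2_style_equal(u"\xc7".encode("utf-8"), u"\xc7")
latin1_match = py2_style_equal(u"\xc7".encode("ISO8859-1"), u"\xc7", "ISO8859-1")
print(ascii_match, utf8_mismatch, latin1_match)  # True False True on Python 3
```

Passing "ISO8859-1" as the third argument stands in for the sys.setdefaultencoding hack, without touching any global interpreter state.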

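A bit off-topic: on Python 3+, both of the literals in your code denote unicode strings, so they are equal anyway. The u prefix is silently ignored. If you want a bytestring literal in Python 3, you need to specify it like b'this'. Then, before comparing, you would want to either 1) explicitly decode the bytes, or 2) explicitly encode the unicode object.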
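The two explicit options can be sketched as follows on Python 3. The variable names are made up for illustration, and 'ascii' is assumed as the codec only because the sample data is plain ASCII; in real code you would use whatever encoding the bytes were actually produced with.

```python
raw = b"aaa"     # a bytes literal
text = u"aaa"    # the u prefix is accepted but redundant on Python 3

# Python 3 performs no implicit conversion, so bytes and str never compare equal.
mixed_equal = (raw == text)

# Option 1: explicitly decode the bytes before comparing.
decoded_equal = (raw.decode("ascii") == text)

# Option 2: explicitly encode the unicode object before comparing.
encoded_equal = (raw == text.encode("ascii"))

# The same idea applied to the tuples from the question.
t1 = (u"aaa", u"bbb")
t2 = (b"aaa", b"bbb")
tuples_equal = (t1 == tuple(item.decode("ascii") for item in t2))

print(mixed_equal, decoded_equal, encoded_equal, tuples_equal)
```

Decoding at the boundary (option 1) is usually the more convenient choice when the rest of the program works with text.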

Thank you very much for the detailed explanation.
