How to correctly test Unicode strings with Django and Python 2

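I need to test whether the string representation of a Django model works with Unicode, because users may enter strings containing characters such as ü or ¼. For this I have the following Django tests.py: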

# -*- coding: utf-8 -*-
from django.conf import settings
from django.contrib.auth.models import User
from django.core.urlresolvers import reverse
from django.test import TestCase
from django.utils import timezone

from .models import *
from .views import *

class CategoryTestCase(TestCase):
    """ Test to check whether category name is printed correctly.
        If there is a parent, it should also be printed, separated by a : """

    def setUp(self):
        self.cat1 = Category.objects.create(name=u'Category 1')
        self.cat2 = Category.objects.create(name=u'Category ü', parent=self.cat1)
        self.cat3 = Category.objects.create(name=u'Category 3', parent=self.cat2)

    def test_category_name(self):
        cat_result1 = u'Category 1'
        cat_result2 = u'Category 1' + settings.PARENT_DELIMITER + u'Category ü'
        cat_result3 = u'Category 1' + settings.PARENT_DELIMITER + u'Category ü' + settings.PARENT_DELIMITER + u'Category 3'
        self.assertEqual(self.cat1.__str__(), cat_result1)
        self.assertEqual(self.cat2.__str__(), cat_result2)
        self.assertEqual(self.cat3.__str__(), cat_result3)

This is meant to test the following small model:

#...
from django.utils.encoding import python_2_unicode_compatible
#....
@python_2_unicode_compatible
class Category(models.Model):
    """ Represents a category a part might belong to.
    E.g. resistor """

    name = models.CharField(
        max_length=50,
        help_text=_("Name of the category.")
    )
    parent = models.ForeignKey(
        "self",
        null=True,
        blank=True,
        help_text=_("If this is a subcategory, the parent.")
    )
    description = models.TextField(
        _("Description"),
        blank=True,
        null=True,
        help_text=_("A chance to summarize usage of category.")
    )

    def __str__(self):
        if self.parent is None:
            return ('{}'.format(self.name))
        else:
            return ('%s%s%s' % (
                self.parent.__str__(),
                settings.PARENT_DELIMITER,
                self.name)
            )

    def get_parents(self):
        """ Returns a list with the ids of this object's parents, including itself"""
        result = []
        next = self
        while True:
            if next.id in result:
                # Interpolate after translation so the message can be looked up
                raise CircleDetectedException(
                    _('There seems to be a circle inside ancestors of %s.') % self.id)
            else:
                result.append(next.id)
                if next.parent is not None:
                    next = next.parent
                else:
                    break
        return result

    def clean(self):
        pass

(slightly stripped down)

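When I run this code with Python 3, both the tests and the application work, and the application also works with Python 2. Only the tests under Python 2 fail, so I assume something is wrong with my approach. Judging by the error message, the Unicode strings do not seem to be encoded and decoded correctly: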

======================================================================
FAIL: test_category_name (partsmanagement.tests.CategoryTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/frlan/quellen/partuniverse/partuniverse/partsmanagement/tests.py", line 31, in test_category_name
    self.assertEqual(self.cat2.__str__(), cat_result2)
AssertionError: 'Category 1->Category \xc3\xbc' != u'Category 1->Category \xfc'

So my question is: how do I properly test Unicode representations with Django?

According to the error message, the test tries to compare a str object with a unicode object. In general, that does not work well:

AssertionError: 'Category 1->Category \xc3\xbc' != u'Category 1->Category \xfc'

Try the following:

Compare unicode objects:

self.assertEqual(self.cat1, cat_result1)

Always use a Unicode locale, even if you sometimes only use Latin-1 symbols.
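The mismatch in the error message can be reproduced without Django. The following is a minimal sketch, not taken from the original answer; the variable names are made up for illustration. It shows that a UTF-8 byte string and a text string with the same content are not equal until one side is converted to the type of the other.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals  # plain literals are text on Python 2 too

text = 'Category \xfc'           # text string containing "ü"
encoded = text.encode('utf-8')   # the same content as a UTF-8 byte string

# A byte string does not compare equal to a non-ASCII text string ...
assert encoded != text

# ... but once decoded, both sides have the same type and compare equal.
assert encoded.decode('utf-8') == text
```

On Python 2 the first comparison additionally emits a UnicodeWarning, which is a hint that two different string types are being mixed.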

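Have you tried using six.text_type?

From the six documentation: Type for representing (Unicode) textual data. This is unicode() in Python 2 and str in Python 3.

edit: You don't have to install six, because the required functionality is available in django.utils.six. Thanks to @hynekcer for pointing this out.

From the six documentation: Six provides simple utilities for wrapping over differences between Python 2 and Python 3. It is intended to support codebases that work on both Python 2 and 3 without modification. six consists of only one Python file, so it is painless to copy into a project.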
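For readers without six at hand, here is a rough sketch of what text_type amounts to. It is an approximation written for this page, not the library's source, and the Label class is a hypothetical stand-in for a model. Note that django.utils.six was removed in Django 3.0, so on current Django versions the standalone six package (or plain str on Python 3) would be needed instead.

```python
# -*- coding: utf-8 -*-
import sys

# Roughly what six.text_type resolves to: the text type of the interpreter.
if sys.version_info[0] >= 3:
    text_type = str
else:
    text_type = unicode  # noqa: F821 (this name only exists on Python 2)


class Label(object):
    """Hypothetical stand-in for a model with a text representation."""

    def __init__(self, name):
        self.name = name

    def __unicode__(self):  # used by unicode() on Python 2
        return self.name

    def __str__(self):      # used by str() on Python 3
        return self.name


label = Label(u'Category \xfc')

# Converting through text_type yields text on both Python versions.
assert text_type(label) == u'Category \xfc'
```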

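tl;dr) Use the same type on both sides of assertEqual.

The best code is universal for Python 3 and Python 2 and does not need additions such as u', unicode, foo.__str__() etc. You should write less code and think more about which type is expected in Python 2/3.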
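As a sketch of that rule, the following self-contained test uses plain unittest instead of Django's TestCase. FakeCategory and SameTypeTest are hypothetical names that do not appear in the question, and the `__unicode__ = __str__` alias is an assumption made here so that the example also returns text on Python 2.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals  # plain literals are text on Python 2 too

import unittest


class FakeCategory(object):
    """Hypothetical stand-in for the Django model."""

    def __init__(self, name):
        self.name = name

    def __str__(self):
        return self.name

    __unicode__ = __str__  # lets '%s' formatting return text on Python 2


class SameTypeTest(unittest.TestCase):
    def test_text_on_both_sides(self):
        cat = FakeCategory('Category ü')
        # '%s' formatting yields text, matching the text literal on the right.
        self.assertEqual('%s' % cat, 'Category ü')


suite = unittest.defaultTestLoader.loadTestsFromTestCase(SameTypeTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

No explicit u' prefix, unicode() call or __str__() call is needed in the test body, because both sides of the assertion are already text.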

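Part A) Fixing the test

If you use the explicit type u'...' on the right side, a short (ugly) solution is to use the text type on the left side as well. For better readability, avoid calling functions with underscores directly. A value can be converted to the text type in several ways. Replace the lines in the test with

self.assertEqual(u'%s' % self.cat1, cat_result1)

or

from django.utils.six import text_type
self.assertEqual(text_type(self.cat1), cat_result1)

A better solution is to unify the types in the module and use from __future__ import unicode_literals at the beginning of the module, because you mostly work with text, not binary data. You can then remove all the u' prefixes, although they remain useful until the code works correctly in both versions.

Part B) Fixing the __str__ method

The code fails if both the parent category name and the current name are non-ASCII. Fix:

from __future__ import unicode_literals
# ... many lines
    def __str__(self):
        if self.parent is None:
            return ('{}'.format(self.name))
        else:
            return ('%s%s%s' % (
                self.parent,
                settings.PARENT_DELIMITER,
                self.name)
            )

I only removed the __str__() call and added the future directive, because models.py is the first module where it is especially useful. Otherwise you would have to add u' to both format strings here.

It is useful to know what the python_2_unicode_compatible decorator does. The result of the __str__ method you write should be text_type (unicode in Python 2), but if you call it directly in Python 2, you get the bytes type. The formatting operator selects the matching method, whereas any explicit method call is wrong in either Python 3 or Python 2. Non-ASCII values of different types cannot be combined.

Comments:

If you are using Python 2, your model should have a __unicode__ method rather than __str__, and that method should return a unicode object. You can use from __future__ import unicode_literals in your Django code to make it cross-compatible between Python 2 and Python 3.

The decorator is supposed to do this on the model side. In fact, it seems to work when the application runs. But when using Python 2 and checking __str__() ...

Why not use unicode objects everywhere?

The documentation I linked to in my previous comment explains this. If you want to write code that works on both Python 2 and Python 3, you should read the entire page.

This may be a dumb question, but trying to apply the documentation forces me to create __str__() and __unicode__() with identical content on Python 2, even though the documentation says to just create __unicode__(). Same error as above.

Using a byte-string-encoded UTF string is also wrong when I want to test real UTF. As long as a UTF-8 encoded string is compared with real unicode, the test will not pass.

You can decode it: '\xc3\xbc'.decode('utf-8') == u'\xfc'

+1, but django.utils.six is a simple subset of the six functions that are commonly useful for Django.

The problem is that type(self.cat1) is equivalent to bytes, not unicode.

Sorry, not sure in this case. I have moved to Python 3. Some Unicode issues are solved there.

You mean from __future__ import unicode_literals rather than absolute_import. Thanks.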
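To see why a direct __str__() call returns bytes on Python 2, here is a simplified sketch of what python_2_unicode_compatible does there. It is an approximation written for this page, not Django's source. emulate_py2_str and FakeCategory are hypothetical names, and the helper applies the Python 2 behaviour unconditionally so that the effect can be observed on any interpreter.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals


def emulate_py2_str(klass):
    """Hypothetical helper: always do what the decorator does on Python 2."""
    # The method the author wrote becomes the text version ...
    klass.__unicode__ = klass.__str__
    # ... and __str__ is replaced by a wrapper returning UTF-8 bytes.
    klass.__str__ = lambda self: self.__unicode__().encode('utf-8')
    return klass


@emulate_py2_str
class FakeCategory(object):
    """Hypothetical stand-in for the Django model."""

    def __init__(self, name):
        self.name = name

    def __str__(self):
        return 'Category %s' % self.name


cat = FakeCategory('ü')
as_bytes = cat.__str__()     # UTF-8 bytes, like the left side of the failing assert
as_text = cat.__unicode__()  # text, like the right side of the failing assert

assert as_bytes != as_text
assert as_bytes.decode('utf-8') == as_text
```

This mirrors the failing assertion in the question: the explicit __str__() call yields the byte string with \xc3\xbc, while the expected value is the text string with \xfc.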