Flask webapp on Google App Engine errors only when accessed from a PC: 'ascii' codec can't decode byte


python google-app-engine encoding utf-8


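I have a Flask webapp on Google App Engine that asks the user to upload a file. It has worked well for several years. The webapp is ad-supported, so I won't link to the hosted version, but the source code is available.

A user recently told me that one of his files was producing a 500 error:

'ascii' codec can't decode byte 0xc2 in position 108: ordinal not in range(128)

He emailed me the file, and I could not reproduce the error on OS X, either locally or against the hosted webapp.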

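Later he sent me a few more files that caused the error, so I tried again, this time from a PC. On the PC I did get the error. Out of curiosity I went back to my Mac, downloaded the same files from my Gmail, and tried again, but got no error.

Why would that be? I would really like to reproduce the error on my Mac so I can debug it at home, but I can only trigger it from my PC at work, where I don't have the code and can't debug.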

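What I have tried so far:

The file is downloaded from Gmail before it is uploaded to the webapp, so I thought local file encoding might be involved. On my Mac I opened the file in TextWrangler and changed the encoding to ASCII. Still no error.
Opened the file in Notepad on the PC and changed the encoding to UTF-8. Still causes the error.
Added from __future__ import unicode_literals to the webapp. Still passes all tests on OS X, still causes a similar error on the PC ('ascii' codec can't decode byte 0xef in position 0: ordinal not in range(128)).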


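Why would the same webapp and the same uploaded file produce an error on a PC but not on my Mac? Does GAE somehow serve a different version of the webapp depending on the client OS it detects?
Thanks very much for any help.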

Chrome 46.0.2490.80 on Windows 7 v6.1
Chrome 46.0.2490.80 on OS X 10.11.1
Python 2.7 hosted on GAE
Flask==0.10.1

Update 20151111:
I was able to find the stack trace on GAE:
Exception on / [POST]
Traceback (most recent call last):
  File "/base/data/home/apps/s~icw-flask/2.386023698597365904/lib/flask/app.py", line 1817, in wsgi_app
    response = self.full_dispatch_request()
  File "/base/data/home/apps/s~icw-flask/2.386023698597365904/lib/flask/app.py", line 1477, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/base/data/home/apps/s~icw-flask/2.386023698597365904/lib/flask/app.py", line 1381, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/base/data/home/apps/s~icw-flask/2.386023698597365904/lib/flask/app.py", line 1475, in full_dispatch_request
    rv = self.dispatch_request()
  File "/base/data/home/apps/s~icw-flask/2.386023698597365904/lib/flask/app.py", line 1461, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/base/data/home/apps/s~icw-flask/2.386023698597365904/icw/views.py", line 38, in index
    links_title=links_title)
  File "/base/data/home/apps/s~icw-flask/2.386023698597365904/lib/flask/templating.py", line 128, in render_template
    context, ctx.app)
  File "/base/data/home/apps/s~icw-flask/2.386023698597365904/lib/flask/templating.py", line 110, in _render
    rv = template.render(context)
  File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/jinja2-2.6/jinja2/environment.py", line 894, in render
    return self.environment.handle_exception(exc_info, True)
  File "/base/data/home/apps/s~icw-flask/2.386023698597365904/icw/templates/index.html", line 1, in top-level template code
    {% extends "base.html" %}
  File "/base/data/home/apps/s~icw-flask/2.386023698597365904/icw/templates/base.html", line 3, in top-level template code
    {% extends "bootstrap/base.html" %}
  File "/base/data/home/apps/s~icw-flask/2.386023698597365904/lib/flask_bootstrap/templates/bootstrap/base.html", line 1, in top-level template code
    {% block doc -%}
  File "/base/data/home/apps/s~icw-flask/2.386023698597365904/lib/flask_bootstrap/templates/bootstrap/base.html", line 4, in block "doc"
    {%- block html %}
  File "/base/data/home/apps/s~icw-flask/2.386023698597365904/lib/flask_bootstrap/templates/bootstrap/base.html", line 20, in block "html"
    {% block body -%}
  File "/base/data/home/apps/s~icw-flask/2.386023698597365904/icw/templates/base.html", line 40, in block "body"
    {{ utils.flashed_messages(messages=messages, container=False) }}
  File "/base/data/home/apps/s~icw-flask/2.386023698597365904/lib/flask_bootstrap/templates/bootstrap/utils.html", line 12, in template
    {% for cat, msg in messages %}      <div class="alert alert-{{cat}}" role="alert">{{msg|safe}}</div>{% endfor -%}
  File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/jinja2-2.6/jinja2/filters.py", line 705, in do_mark_safe
    return Markup(value)
  File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/markupsafe-0.15/markupsafe/__init__.py", line 71, in __new__
    return unicode.__new__(cls, base)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 108: ordinal not in range(128)
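
The last frame is the informative one: Markup(value) passes a byte string to unicode.__new__, and Python 2 then decodes it implicitly with the default ASCII codec. Below is a minimal sketch of the same failure, written with an explicit decode so that it also runs on Python 3. The flashed payload is invented for illustration and is not data from the app.

```python
# Hypothetical flashed message: 108 ASCII bytes followed by a UTF-8
# encoded non-breaking space (0xc2 0xa0). Not data from the real app.
flashed = b"x" * 108 + b"\xc2\xa0"


def decode_as_ascii(raw):
    """Return the error text raised by decoding raw bytes as ASCII, or None."""
    try:
        raw.decode("ascii")
    except UnicodeDecodeError as exc:
        return str(exc)
    return None


message = decode_as_ascii(flashed)
print(message)
```

Any non-ASCII byte that reaches the template as an undecoded byte string would fail the same way, which is why the fix has to happen where the upload is read.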

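The eventual solution was not especially simple. I am still not entirely sure why everything worked on my Mac but not on the PC.
However, I was glad to find the Internet Explorer virtual machine images, which let me test easily on my Macbook. They are a few GB to download, but afterwards I confirmed that I saw the error with IE on the PC image, and not in Firefox, Safari, or Chrome on OS X.
It looked like a unicode/ascii problem, so I figured that converting everything to unicode would be the fix. In the end there were a few specific parts of my code that needed attention.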

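First was reading the file that contains unicode characters, and realizing I needed upfile.read().decode('utf8') or unicode(upfile.read(), 'utf8') to work with unicode instead of str. (Apparently unicode() is faster.)
Next was remembering that the Python 2 csv module has unicode problems and needs a unicode-friendly workaround.
Next was remembering that I now had to make all my strings unicode to handle the data being read in, for example updating print("foo: {}".format(bar)) to print(u"foo: {}".format(unicode_bar)).
There were also a few places where I was printing the contents of a set with map(str, myset), which I changed to map(unicode, myset).
The last thing was tracking down an error involving the byte order mark.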
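
The steps above can be sketched roughly as follows, assuming the upload arrives as UTF-8 bytes. The sketch is written so that it runs on Python 3, where str plays the role of Python 2's unicode. decode_upload and the sample data are illustrative names, not part of the app.

```python
import io


def decode_upload(upfile):
    """Read a binary file-like object and return text, assuming UTF-8."""
    return upfile.read().decode("utf-8")


# Stand-in for the uploaded file: UTF-8 bytes with one non-ASCII character.
upfile = io.BytesIO(u"caf\u00e9,1\n".encode("utf-8"))
text = decode_upload(upfile)

# Format text with text, never with undecoded bytes.
line = u"foo: {}".format(text.strip())

# Join a set as text. On Python 2 this is where map(unicode, myset)
# replaces map(str, myset).
myset = {u"caf\u00e9"}
joined = u", ".join(sorted(myset))

print(line)
print(joined)
```

The idea is to decode once at the boundary where bytes enter the program and to keep everything downstream as text.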

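In more detail, I first read up on the csv module's unicode problems and converted my csv.reader to a unicode-friendly version.
Next, I went ahead and added # -*- coding: utf-8 -*- to the top of my files and, just below that, from __future__ import unicode_literals, to avoid manually changing every 'example string' to u'example string'.
NB: before using unicode_literals, readers should be aware that it carries a risk of hard-to-debug errors; you may be better off changing all your strings manually.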
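However, even after this I was still getting unicode errors. They were slightly different, but consistently at the beginning of the file, specifically u'\ufeff' at position 0.

There are several SO threads on this, but basically this character is a "byte order mark" (BOM) that PCs often add to the beginning of a file (especially one edited with Notepad) to indicate that it is UTF-8 encoded. I assume that is why I only saw the problem on the PC. To fix it, I changed my unicode_csv_reader to use the utf-8-sig encoding.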
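
To make the difference concrete, here is a small sketch comparing the two codecs on a payload that starts with a BOM. The payload is invented; only the codec names and codecs.BOM_UTF8 come from the standard library.

```python
import codecs

# The UTF-8 byte order mark is the three bytes EF BB BF, which matches
# the "byte 0xef in position 0" error reported in the question.
bom_payload = codecs.BOM_UTF8 + b"name,value\n"

as_utf8 = bom_payload.decode("utf-8")          # BOM survives as U+FEFF
as_utf8_sig = bom_payload.decode("utf-8-sig")  # BOM is stripped

print(repr(as_utf8[0]))
print(repr(as_utf8_sig[0]))

# utf-8-sig is also safe for input that has no BOM at all.
no_bom = b"name,value\n".decode("utf-8-sig")
```

Because utf-8-sig accepts input with or without the mark, it is a reasonable default for files that may have passed through Notepad.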
My final code looks like this:
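import csv


def unicode_csv_reader(utf8_file, **kwargs):
    # Python 2: csv.reader works on byte strings, so parse first, decode after
    # splitlines lets us respect universal newlines
    utf8_data = utf8_file.read().splitlines()
    csv_reader = csv.reader(utf8_data, **kwargs)
    for row in csv_reader:
        # utf-8-sig decodes UTF-8 and drops a leading BOM (u'\ufeff') if present
        yield [unicode(cell, 'utf-8-sig') for cell in row]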

...

def convert(upfile):
    reader_builder = unicode_csv_reader(upfile, skipinitialspace=True)

    reader_list = list(reader_builder)

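I could probably go back and check which places really need utf-8-sig rather than plain utf-8, but at least I have a working version that seems to pass all my tests and works as expected on both OS X and the PC/IE virtual machine.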
Hope this helps someone else.
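
One alternative worth considering, sketched here for Python 3 rather than the Python 2.7 runtime used above: decode the whole upload once with utf-8-sig and hand text to csv.reader, so that the BOM is removed once at the start of the stream instead of being checked on every cell. text_csv_rows is a hypothetical name, and this has not been tried against the real app.

```python
import csv
import io


def text_csv_rows(binary_file, **kwargs):
    """Yield CSV rows as lists of text from a binary file-like object.

    Assumes the payload is UTF-8, with or without a leading BOM.
    """
    text = binary_file.read().decode("utf-8-sig")
    # splitlines() handles \n, \r\n and \r line endings.
    for row in csv.reader(text.splitlines(), **kwargs):
        yield row


# Invented sample upload: BOM, Windows line endings, non-ASCII cells.
upload = io.BytesIO(b"\xef\xbb\xbfname, city\r\nZo\xc3\xab, Z\xc3\xbcrich\r\n")
rows = list(text_csv_rows(upload, skipinitialspace=True))
print(rows)
```

As with the reader above, splitting into lines before parsing will break quoted fields that contain embedded newlines, so this only suits files without them.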
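Update 20151115:

It seems the BOM really was the problem, and it was probably inserted when I briefly edited the file in Notepad on the PC. I found that I could use the VM from above, download the file from Gmail, open it in Notepad (on the VM) and save it, then transfer it back to OS X with Dropbox or the like, and reproduce the bug on OS X. So it is not related to the operating system; it is probably just the BOM.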
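
For anyone who wants to reproduce this without a Windows VM, a test file with a leading BOM can be generated directly. This is a minimal sketch: write_bom_fixture, the file name, and the CSV body are all made up for illustration.

```python
import codecs
import os
import tempfile


def write_bom_fixture(path, body):
    """Write body (text) to path as UTF-8 with a leading BOM."""
    with open(path, "wb") as handle:
        handle.write(codecs.BOM_UTF8 + body.encode("utf-8"))


fixture_dir = tempfile.mkdtemp()
fixture_path = os.path.join(fixture_dir, "with_bom.csv")
write_bom_fixture(fixture_path, u"name,value\nfoo,1\n")

# Confirm the file really starts with EF BB BF.
with open(fixture_path, "rb") as handle:
    first_bytes = handle.read(3)
print(repr(first_bytes))
```

Uploading a file like this from any operating system should exercise the same code path as a file saved from Notepad.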
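Comment: You should post the full stack trace and the relevant code from your own project. This may have to do with outputting a utf-8 string in a context where it is interpreted as ascii.
Comment: Thanks for the comment. That is part of the problem: I can't reproduce the issue in my local environment.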