Python reading text: 're --> 鈥檙e

python character-encoding

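I am reading a text file that contains the following sentence:

"So whether you’re talking about a Walmart or an IKEA or a Zara, you are really interested in keeping the cost low, keeping the process very efficient."

My code: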

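import glob
import re

files = "*.txt"
for pathname in glob.glob(files):
    # 'r' opens the file with the platform default encoding
    with open(pathname,'r') as singlefile:
        data = "".join(singlefile.readlines())
        # join a line break that follows a word character or a comma
        data = re.sub(r"(?<=\w)\n", " ", data)
        data = re.sub(r",\n", ", ", data)
        print data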

It works fine if the encoding is correct (checking that is also a good idea; they also describe a list of encodings to guess from). I have tried:
import re

with open("words.txt",'r') as singlefile:
    data = "".join(singlefile.readlines())
    data = re.sub(r"(?<=\w)\n", " ", data)
    data = re.sub(r",\n", ", ", data)
    print data

This is the output:
>>> runfile('E:/programmierung/python/spielwiese/test.py', wdir=r'E:/programmierung/python/spielwiese')
So whether you’re talking about a Walmart or an IKEA or a Zara, you are really interested in keeping the cost low, keeping the process very efficient.
>>> 
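
The clean output above depends on the file being decoded with the codec it was saved in. The following is only a sketch of how the garbled apostrophe can arise: it assumes Python 3, a file saved as UTF-8, and Windows-1252 as the wrongly applied codec, none of which is stated in the thread.

```python
# The right single quotation mark (U+2019) as it would be stored in a UTF-8 file.
raw = "you’re".encode("utf-8")

# Decoding with the codec the text was saved in gives back the original.
good = raw.decode("utf-8")

# Decoding the same bytes as Windows-1252 (assumed wrong default) gives mojibake.
bad = raw.decode("cp1252")

print(repr(raw))  # b'you\xe2\x80\x99re'
print(good)       # you’re
print(bad)        # youâ€™re
```

The three bytes of the apostrophe are each read as a separate character, which is why one character turns into three.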

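Please check the encoding. It does not seem to be recognized.

You need to read the file using the encoding it was saved in.

How do I know its encoding?

You can find out from whoever gave you the file, or you can guess. P.S. It would help if you printed repr(data) so that we can see the exact bytes.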
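
The advice in these comments (guess the encoding, then read the file with it) can be sketched as below. This assumes Python 3, whereas the code in the thread is Python 2. The helper names `read_text` and `join_wrapped_lines` and the candidate list `("utf-8", "cp1252")` are hypothetical choices for illustration, not something the thread specifies.

```python
import re


def read_text(path, candidates=("utf-8", "cp1252")):
    """Return (text, encoding) for the first candidate that decodes the file.

    The candidate list is an assumption; put the strictest codec first,
    because permissive single-byte codecs accept almost any input.
    """
    with open(path, "rb") as handle:
        raw = handle.read()  # raw bytes; repr(raw) shows exactly what is stored
    for encoding in candidates:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("none of %r could decode %s" % (candidates, path))


def join_wrapped_lines(text):
    """Apply the same two substitutions as the question's code."""
    text = re.sub(r"(?<=\w)\n", " ", text)
    return re.sub(r",\n", ", ", text)
```

Calling `text, encoding = read_text("words.txt")` and then printing `join_wrapped_lines(text)` would mirror the loop body from the question. A successful decode only means the bytes are valid in that codec, not that the guess is right, so asking the person who produced the file remains the more reliable route.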