Python 2.7: Removing non-Unicode characters from a file

Tags: python-2.7, ascii, non-ascii-characters, python-unicode, non-unicode


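I know this is a repeated question, but I have honestly tried every solution I could find so far. Can anyone help me remove characters like \xc3\xa2\xc2\x84\xc2\xa2 from a file?

The contents of the file I am currently trying to clean are: b'Roasted Onion Dip',"b""['2 pounds large yellow onions, thinly sliced', '3 large shallots, thinly sliced', '4 sprigs thyme', '1/4 cup olive oil', 'Kosher salt and freshly ground black pepper', '1 cup white wine', '2 tablespoons champagne vinegar', '2 cups sour cream', '1/2 cup chopped fresh chives', '1/4 cup plain Greek yogurt', 'Everything seasoning and thyme to garnish', 'Cape Cod Waves\xc3\xa2\xc2\x84\xc2\xa2 Potato Chips for serving']"""

I tried using re.sub("[^\x00-\x7F]+", "", whatevertext), but nothing seems to work. I suspect the \ here is not being treated as a special character.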

You can do it like this:

>>> f = open("test.txt","r")
>>> whatevertext = f.read()
>>> print whatevertext
b'Roasted Onion Dip',"b""['2 pounds large yellow onions, thinly sliced', '3 large shallots, thinly sliced', '4 sprigs thyme', '1/4 cup olive oil', 'Kosher salt and freshly ground black pepper', '1 cup white wine', '2 tablespoons champagne vinegar', '2 cups sour cream', '1/2 cup chopped fresh chives', '1/4 cup plain Greek yogurt', 'Everything seasoning and thyme to garnish', 'Cape Cod Waves\xc3\xa2\xc2\x84\xc2\xa2 Potato Chips for serving']"""

>>> import re
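>>> result = re.sub(r'\\x[0-9a-fA-F]{2}', '', whatevertext)
>>> print result
b'Roasted Onion Dip',"b""['2 pounds large yellow onions, thinly sliced', '3 large shallots, thinly sliced', '4 sprigs thyme', '1/4 cup olive oil', 'Kosher salt and freshly ground black pepper', '1 cup white wine', '2 tablespoons champagne vinegar', '2 cups sour cream', '1/2 cup chopped fresh chives', '1/4 cup plain Greek yogurt', 'Everything seasoning and thyme to garnish', 'Cape Cod Waves Potato Chips for serving']"""

In the pattern r'\\x[0-9a-fA-F]{2}', the escaped backslash \\ matches one literal backslash in the file, x matches the letter x, and [0-9a-fA-F]{2} matches the two hexadecimal digits that follow. The file does not contain non-ASCII bytes; it contains the literal four-character text \xc3, \xa2 and so on. Every one of those characters is plain ASCII, which is why a pattern such as [^\x00-\x7F]+ finds nothing to remove.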
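The same idea can be wrapped in a small reusable function. This is only a sketch: it assumes the file stores each escape as literal text (a backslash, the letter x, then two hex digits), and the names HEX_ESCAPE and strip_hex_escapes and the shortened sample string are made up for illustration. It should run unchanged on Python 2.7 and Python 3.

```python
import re

# A literal backslash, the letter x, and exactly two hex digits.
HEX_ESCAPE = re.compile(r'\\x[0-9a-fA-F]{2}')


def strip_hex_escapes(text):
    # Hypothetical helper: delete every literal backslash-x-HH sequence.
    return HEX_ESCAPE.sub('', text)


# Raw string, so the backslashes are real characters, as they are in the file.
sample = r"Cape Cod Waves\xc3\xa2\xc2\x84\xc2\xa2 Potato Chips for serving"

# Every character in sample is ASCII, so the non-ASCII pattern removes nothing.
unchanged = re.sub(r'[^\x00-\x7F]+', '', sample)

cleaned = strip_hex_escapes(sample)
print(cleaned)
```

Matching exactly two hex digits with {2}, rather than one or more with +, keeps the pattern from also swallowing hex-looking letters that happen to follow an escape, such as the "be" in \xa2be.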

Super! Thank you very much!