Python: writing Chinese characters to JSON from a Scrapy pipeline
I am trying to scrape some web content that contains Chinese characters. The scraped items look like this:
2018-11-20 12:42:18 [scrapy.core.scraper] DEBUG: Scraped from <200 https://cn.bing.com/dict/search?q=tool&FORM=BDVSP6&mkt=zh-cn>
{'defBing': '工具;方法;受人利用的人',
'defWeb': '工具;方法;受人利用的人',
'pClass': 'n.',
'prUK': 'UK\xa0[tuːl]',
'prUS': 'US\xa0[tul]',
'word': 'tool'}
The pipeline looks like this:
import json
import time

class JsonWriterPipeline(object):
    def open_spider(self, spider):
        # One timestamped output file per crawl
        self.file = open('log/DICT.%s.json' % time.strftime('%Y%m%d-%H%M%S', time.localtime()), 'tw')

    def close_spider(self, spider):
        self.file.close()

    def process_item(self, item, spider):
        try:
            # json.dumps escapes non-ASCII characters by default
            line = json.dumps(dict(item), indent=4) + "\n"
            self.file.write(line)
        except Exception as e:
            print(e)
        return item
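My question is: how do I keep the Chinese characters readable in the *.json file? I really don't want those escaped Unicode sequences :)

It looks like the json library is escaping these characters. Try adding ensure_ascii=False to json.dumps(), like this: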
import json
import time

class JsonWriterPipeline(object):
    def open_spider(self, spider):
        self.file = open('log/DICT.%s.json' % time.strftime('%Y%m%d-%H%M%S', time.localtime()), 'tw')

    def close_spider(self, spider):
        self.file.close()

    def process_item(self, item, spider):
        try:
            # ensure_ascii=False writes non-ASCII characters as-is instead of \uXXXX escapes
            line = json.dumps(dict(item), indent=4, ensure_ascii=False) + "\n"
            self.file.write(line)
        except Exception as e:
            print(e)
        return item
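To see what ensure_ascii does on its own, here is a minimal sketch outside of Scrapy. The sample item is copied from the scraped output in the question. The file name demo.json and the explicit encoding="utf-8" are my own assumptions for illustration and are not part of the original pipeline.

```python
import json
import os
import tempfile

# Sample item copied from the scraped output in the question.
item = {
    "defBing": "工具;方法;受人利用的人",
    "pClass": "n.",
    "prUK": "UK\xa0[tuːl]",
    "word": "tool",
}

# Default behaviour: every non-ASCII character becomes a \uXXXX escape.
escaped = json.dumps(item, indent=4)

# ensure_ascii=False leaves non-ASCII characters as they are.
readable = json.dumps(item, indent=4, ensure_ascii=False)

# Both strings decode to the same data; only the text form differs.
assert json.loads(escaped) == json.loads(readable) == item

print(escaped)
print(readable)

# Assumption: open the file as UTF-8 explicitly so the write does not
# depend on the platform's default encoding. "demo.json" is a
# hypothetical file name used only for this demonstration.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.json")
    with open(path, "w", encoding="utf-8") as f:
        f.write(readable + "\n")
    with open(path, encoding="utf-8") as f:
        restored = json.load(f)

assert restored == item
```

Either form is valid JSON and loads back to the same dictionary, so the choice only affects how the file reads to a human. If the pipeline's open() call relies on a default encoding that cannot represent Chinese characters, writing the unescaped text would likely raise UnicodeEncodeError, which is why passing encoding='utf-8' there may be worth considering.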