How do I use map to read files in Python?
I have the following Python code, which parses the URLs from every file in a directory. I am trying to use the map function to process the files in parallel:
import glob, os
import xmltodict
import mysql.connector
from multiprocessing import Pool

def get_xml_paths(folder):
    return (os.path.join(folder, f)
            for f in os.listdir(folder)
            if 'xml' in f)

def openXML(file):
    global i
    doc = xmltodict.parse(file.read())
    for i in range(0, len(doc['urlset']['url'])):
        if i > to:
            break
        ## Validation
        url = doc['urlset']['url'][i]['loc'];
        if "books" in url:
            c.execute("INSERT INTO apps (url) VALUES (%s)", [url])
            conn.commit()
        i = i + 1

if __name__ == '__main__':
    files = get_xml_paths("unzip/")
    pool = Pool()
    pool.map(openXML, files)
    pool.close()
    pool.join()
    c.close()
When I run this application, I get the following error:
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "C:\Users\O\AppData\Local\Programs\Python\Python35-32\lib\multiprocessing\pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "C:\Users\O\AppData\Local\Programs\Python\Python35-32\lib\multiprocessing\pool.py", line 44, in mapstar
    return list(map(*args))
  File "C:\Users\O\PycharmProjects\Grabber\grabber.py", line 28, in openXML
    doc = xmltodict.parse(file.read())
AttributeError: 'str' object has no attribute 'read'
How can I fix this? I can't see an obvious cause.
In openXML, file is a string (a path), not a file object, so it has no read method. You have to open the file first:
import glob, os
import xmltodict
import mysql.connector
from multiprocessing import Pool

def open_xml(file):
    # `file` is a path string, so open it before reading.
    with open(file) as xml:
        doc = xmltodict.parse(xml.read())
    # Assumes `conn` is an open mysql.connector connection that is
    # available in this worker process; it is not defined in this snippet.
    cursor = conn.cursor()
    for url in doc['urlset']['url']:
        url = url['loc']
        if "books" in url:
            cursor.execute("INSERT INTO apps (url) VALUES (%s)", [url])
    conn.commit()

if __name__ == '__main__':
    files = glob.glob("unzip/*.xml")
    pool = Pool()
    pool.map(open_xml, files)
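To illustrate the point that pool.map hands each worker a path string, here is a minimal, self-contained sketch. It is not the code from the question: it swaps xmltodict for the standard library's xml.etree.ElementTree and leaves out MySQL so that it runs on its own. The names extract_urls and write_sample, the keyword parameter, and the sample URLs are all hypothetical.

```python
import tempfile
import xml.etree.ElementTree as ET
from multiprocessing import Pool
from pathlib import Path


def extract_urls(path, keyword="books"):
    """Open the file at `path` (a str) and return the <loc> URLs containing `keyword`."""
    # The worker receives a path string, so it must open the file itself.
    with open(path, "rb") as xml_file:
        root = ET.parse(xml_file).getroot()
    urls = []
    for element in root.iter():
        # Drop an optional "{namespace}" prefix before comparing the tag name.
        if element.tag.rsplit("}", 1)[-1] == "loc" and element.text:
            url = element.text.strip()
            if keyword in url:
                urls.append(url)
    return urls


def write_sample(folder, name, urls):
    """Hypothetical helper: write a tiny sitemap-like file and return its path."""
    body = "".join(f"<url><loc>{u}</loc></url>" for u in urls)
    path = Path(folder) / name
    path.write_text(f"<urlset>{body}</urlset>", encoding="utf-8")
    return str(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        paths = [
            write_sample(folder, "a.xml", ["http://example.com/books/1",
                                           "http://example.com/music/2"]),
            write_sample(folder, "b.xml", ["http://example.com/books/3"]),
        ]
        with Pool(2) as pool:
            # map passes each path string to extract_urls in a worker process.
            results = pool.map(extract_urls, paths)
        for path, urls in zip(paths, results):
            print(path, urls)
```

Returning the URLs to the parent process, rather than inserting from each worker, is one possible design: the parent could then write them through a single database connection.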
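Have you tried open(file).read(), since get_xml_paths returns file names rather than file objects?

It looks like you aren't passing anything to the openXML function. Shouldn't it be pool.map(openXML(files), files)? I also noticed that openXML has no return statement. I'm not sure whether that causes any problems. You could replace the break statement with return.

@NoahChristopher The syntax is correct... openXML as the first argument to pool.map is fine. It is a callable, and its arguments are taken from the remaining arguments...

Yes, I missed the files argument. Now I get this error:
Traceback (most recent call last):
  File "C:/Users/O/PycharmProjects/Grabber/grabber.py", line 47, in pool.map(openXML(files), files)
  File "C:/Users/O/PycharmProjects/Grabber/grabber.py", line 28, in openXML
    doc = xmltodict.parse(file.read())
AttributeError: 'generator' object has no attribute 'read'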
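The 'generator' error arises because openXML(files) is evaluated before map ever runs, so the function receives the whole generator at once. The small sketch below shows the difference between passing a callable and calling it first. It uses the built-in map, which takes its first argument the same way Pool.map does; describe and paths are hypothetical names used only for illustration.

```python
def describe(arg):
    """Return the type name of whatever the function is handed."""
    return type(arg).__name__


def paths():
    """Hypothetical stand-in for get_xml_paths: a generator of path strings."""
    return (name for name in ["a.xml", "b.xml"])


# Passing the function itself: map calls it once per item, so it sees strings.
per_item = list(map(describe, paths()))

# Calling the function first: it receives the generator object as a whole.
whole = describe(paths())

print(per_item)
print(whole)
```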
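How can I print (writeln) the parsed data as a log? How can I show how long the script took to run? How can I see which file is being processed right now?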
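One possible way to approach these three points, sketched under assumptions rather than taken from the thread: each worker prints the file it is starting on, returns how long that file took, and the parent process logs each result as it arrives and reports the total run time. The function process_file, the sample file names, and the byte-count "parsing" step are hypothetical placeholders for the real XML handling.

```python
import logging
import os
import tempfile
import time
from multiprocessing import Pool, current_process


def process_file(path):
    """Hypothetical worker: report the current file, do the work, return timing."""
    started = time.perf_counter()
    # Shows which file this worker is handling right now.
    print(f"[{current_process().name}] processing {path}", flush=True)
    with open(path, "rb") as handle:
        size = len(handle.read())  # placeholder for the real parsing step
    return path, size, time.perf_counter() - started


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
    total_start = time.perf_counter()
    with tempfile.TemporaryDirectory() as folder:
        paths = []
        for name in ("a.xml", "b.xml", "c.xml"):
            path = os.path.join(folder, name)
            with open(path, "w", encoding="utf-8") as handle:
                handle.write("<urlset></urlset>")
            paths.append(path)
        with Pool(2) as pool:
            # imap_unordered yields each result as soon as its worker finishes.
            for path, size, elapsed in pool.imap_unordered(process_file, paths):
                logging.info("finished %s (%d bytes) in %.3f s", path, size, elapsed)
    logging.info("total run time: %.3f s", time.perf_counter() - total_start)
```

Logging from the parent keeps the output in one place; logging directly inside the workers would need its own configuration in each process.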