Python: How do I put information from text files into tuples for conversion to an SQL table?
I have several text files containing specific information. I need to extract the required information from these files and put it into a MySQL table. Each file contains several lines of information, but I only need these three lines, for example:
Name: Gorge
Registration ID: 6657
Registration Time: 2012-09-10 14:31:13
I wrote the code below, but its result is not what I want. The code also does not yet contain the SQL insert part.
import fnmatch
import os
import pprint

matches = []
b = []
for root, dirnames, filenames in os.walk('d:/Data'):
    for filename in fnmatch.filter(filenames, 'Info_reg.txt'):
        matches.append(os.path.join(root, filename))

all_keys = ['name', 'Registration ID', 'registration time']
for m in matches:
    f = open(m, 'r')
    for line in f:
        for n in all_keys:
            if line.startswith(n):
                a = line.split(':', 1)
                b.append(a)
The result of the code looks like this, which I assume cannot easily be converted into a table:
['registration time', ' 2012-10-08 17:28:47\n'],
['Registration ID', ' 9876'],
['Name', ' Malcom\n'],
['registration time', ' 2012-10-08 17:28:47\n'],
['Registration ID', ' 45'],
['Name', 'mazu\n'],
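Does anyone know how to change the code so that this data becomes a nice table?

You want to call .strip() on the results, and store the whole thing in dictionaries rather than lists.
We can also optimize the line search and record handling; I am assuming here that a new record starts whenever we find a Name entry: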
records = []
all_keys = {'Name', 'Registration ID', 'registration time'}
first_key = 'Name'

for m in matches:
    with open(m, 'r') as f:
        record = dict.fromkeys(all_keys)  # new record dictionary with `None` values
        for line in f:
            if ':' not in line:
                continue  # no key/value separator on this line
            key, value = line.split(':', 1)
            key, value = key.strip(), value.strip()
            if key not in all_keys:
                continue  # not interested in this line
            if key == first_key and any(v for v in record.values()):
                # new record, finalize the previous
                records.append(record)
                record = dict.fromkeys(all_keys)
            record[key] = value
        if any(v for v in record.values()):
            # there is something in the last record still, add that too
            records.append(record)
You now have a list of records of the form:
records = [
    {'registration time': '2012-10-08 17:28:47', 'Registration ID': '9876', 'Name': 'Malcom'},
    {'registration time': '2012-10-08 17:28:47', 'Registration ID': '45', 'Name': 'mazu'},
]
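Since the question asks for tuples, here is a minimal sketch of how such dictionaries could be flattened into tuples with a fixed column order. The helper name `record_to_tuple` and the `COLUMNS` constant are hypothetical and not part of the answer above; the column order is simply chosen for illustration.

```python
# Column order chosen for illustration; the keys match the records shown above.
COLUMNS = ('Name', 'Registration ID', 'registration time')


def record_to_tuple(record, columns=COLUMNS):
    """Return the record's values as a tuple in a fixed column order."""
    # dict.get returns None for a key that was never filled in.
    return tuple(record.get(column) for column in columns)


records = [
    {'registration time': '2012-10-08 17:28:47', 'Registration ID': '9876', 'Name': 'Malcom'},
    {'registration time': '2012-10-08 17:28:47', 'Registration ID': '45', 'Name': 'mazu'},
]

rows = [record_to_tuple(r) for r in records]
print(rows)
```

Tuples in a known order line up with positional placeholders in an INSERT statement, whereas the dictionaries line up with named placeholders.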
You can insert these records into the database in one go with MySQLdb by using .executemany():
cursor = conn.cursor()
cursor.executemany(
    'INSERT INTO sometable (id, name, time) '
    'VALUES (%(Registration ID)s, %(Name)s, %(registration time)s)',
    records)
conn.commit()
This inserts all of the collected records directly into the database.

This might inspire a solution:
data = '''\
Name: Gorge
Registration ID: 6657
Registration Time: 2012-09-10 14:31:13
Somethign else: foo
Spam: Bar
Name: mazu
Registration ID: 45
Registration Time: 2012-10-08 17:28:47
Somethign else: foo
Spam: Bar'''.split('\n')

records = []
titles = ['Name', 'Registration ID', 'Registration Time']

def record_is_complete(rec):
    # A record is complete once all three wanted fields have a value.
    return (rec.get('Name')
            and rec.get('Registration ID')
            and rec.get('Registration Time'))

def make_tuple(rec):
    # Emit the values in the fixed order given by `titles`.
    result = [rec[key] for key in titles]
    return tuple(result)

record = {}
for line in data:
    key, value = line.split(':', 1)
    if key in titles:
        record[key] = value.strip()
    if record_is_complete(record):
        records.append(make_tuple(record))
        record = {}

print(records)
Result:
[('Gorge', '6657', '2012-09-10 14:31:13'),
('mazu', '45', '2012-10-08 17:28:47')]
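A list of tuples like this can be handed to `executemany()` with positional placeholders. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in so that it runs without a MySQL server; the table name `registrations` and its column types are assumptions for illustration. With MySQLdb the placeholders would be `%s` instead of `?`.

```python
import sqlite3

records = [('Gorge', '6657', '2012-09-10 14:31:13'),
           ('mazu', '45', '2012-10-08 17:28:47')]

# In-memory SQLite database, used only as a stand-in for MySQL.
conn = sqlite3.connect(':memory:')
cursor = conn.cursor()

# Hypothetical table layout; adjust names and types to the real schema.
cursor.execute(
    'CREATE TABLE registrations (name TEXT, id INTEGER, time TEXT)')

# One tuple per row, one "?" per column (MySQLdb would use "%s" here).
cursor.executemany(
    'INSERT INTO registrations (name, id, time) VALUES (?, ?, ?)',
    records)
conn.commit()

stored = cursor.execute(
    'SELECT name, id, time FROM registrations ORDER BY id').fetchall()
print(stored)
conn.close()
```

Passing the values as parameters, rather than formatting them into the SQL string, leaves quoting and escaping to the database driver.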
What Python MySQL driver are you using? Do the lines always appear in the same order? Is there anything that distinguishes one record from the next, such as an empty line, or does a record always start with the Registration ID key?

Yes, they always appear in the same order, and they start with a word.

@Martijn Pieters Thanks for your answer, it was really helpful. If I want to add the file name of each file to the list, can you tell me how to do that? So the columns should be filename, name, id, time?

@user2058811: Just add it to the record: record['file'] = m, right after each record = dict.fromkeys(all_keys) line. Don't forget to insert the file value into the database too. :-)

@Martijn Pieters I get this error when running the code: if key == first_key and any(value in record.itervalues()): TypeError: 'bool' object is not iterable

@user2058811: Sorry, the generator expression was incomplete.

@Martijn Pieters When I try to put the data into the SQL table, I get the following error:
Traceback (most recent call last):
File "C:\Users\samounk\workspace\Thesis\Realdata\Datasets 1\MyTable_info.py", line 63, records)
File "C:\Python27\lib\site-packages\MySQLdb\cursors.py", line 245, in executemany self.errorhandler(self, TypeError, msg)
File "C:\Python27\lib\site-packages\MySQLdb\connections.py", line 36, in defaulterrorhandler raise errorclass, errorvalue
TypeError: float argument required, not dict
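The file name suggestion from the comments can be sketched as follows. This is only an illustration under assumptions: the function names `parse_file` and `collect` are hypothetical, and the key spelling `Registration Time` follows the sample file in the question rather than the lowercase key used in the first answer.

```python
import fnmatch
import os

# Keys as spelled in the sample file from the question (an assumption).
ALL_KEYS = ('Name', 'Registration ID', 'Registration Time')
FIRST_KEY = 'Name'


def parse_file(path):
    """Return one dict per record in the file, each tagged with its path."""
    records = []
    record = None
    with open(path, 'r') as f:
        for line in f:
            if ':' not in line:
                continue  # not a "key: value" line
            key, value = line.split(':', 1)
            key, value = key.strip(), value.strip()
            if key not in ALL_KEYS:
                continue  # not interested in this line
            if key == FIRST_KEY or record is None:
                # A Name entry (or the first wanted key) starts a new record.
                record = dict.fromkeys(ALL_KEYS)
                record['file'] = path
                records.append(record)
            record[key] = value
    return records


def collect(top, pattern='Info_reg.txt'):
    """Walk `top` and return (filename, name, id, time) tuples."""
    rows = []
    for root, dirnames, filenames in os.walk(top):
        for filename in fnmatch.filter(filenames, pattern):
            for rec in parse_file(os.path.join(root, filename)):
                rows.append((rec['file'],) + tuple(rec[k] for k in ALL_KEYS))
    return rows
```

Calling `collect('d:/Data')` would then give one tuple per record, with the file path as the first column.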