Python: finding elements of a set in lines read from a file
Tags: python, python-2.7
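I have a text file, file1.txt, delimited with |:

ID|Name|Date
1|A|2017-12-19
2|B|2017-12-20
3|C|2017-12-21

I also have two sets of values, one of IDs and one of dates (id_set and date_set in the answers).

I want to find the records in file1.txt that match an element of the sets and write those records to output.txt.

Expected output: output.txt should contain the following data:

ID|Name|Date
1|A|2017-12-19
2|B|2017-12-20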
If you are open to using a third-party library, you can use pandas:

import pandas as pd
from io import StringIO

mystr = StringIO("""ID|Name|Date
1|A|2017-12-19
2|B|2017-12-20
3|C|2017-12-21""")

# replace mystr with 'file1.txt'
df = pd.read_csv(mystr, sep='|')

# criteria
id_set = {'1', '2'}
date_set = {'2017-12-19', '2017-12-20'}

# apply criteria: keep rows whose ID or Date is in the sets
# (ID is parsed as an integer, so cast it to str before comparing)
df2 = df[df['ID'].astype(str).isin(id_set) | df['Date'].isin(date_set)]

print(df2)
# ID Name Date
# 0 1 A 2017-12-19
# 1 2 B 2017-12-20

# export to csv; index=False keeps the row index out of the file
df2.to_csv('file1_out.txt', sep='|', index=False)
You can try this solution:

id_set = {'1', '2'}
date_set = {'2017-12-19', '2017-12-20'}

# open files for reading and writing
with open('file1.txt') as in_file, open('output.txt', 'w') as out_file:
    # write the header line
    out_file.write(next(in_file))
    # go over the remaining lines in the file
    for line in in_file:
        # extract id and date (row_id avoids shadowing the built-in id)
        row_id, _, date = line.rstrip().split('|')
        # keep lines that have an id or a date in the sets
        if row_id in id_set or date in date_set:
            out_file.write(line)

This writes the header and the rows for IDs 1 and 2 to output.txt, matching the expected output.
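The same filter can also be written with the standard csv module, which takes care of splitting and joining on the delimiter. This is only a sketch, assuming Python 3, that ID is the first column and Date is the last one; the function name filter_records and the in-memory sample are hypothetical and stand in for file1.txt and output.txt.

```python
import csv
import io


def filter_records(in_file, out_file, id_set, date_set):
    """Copy the header, then every row whose ID or Date is in the sets.

    Returns the number of data rows written.
    """
    reader = csv.reader(in_file, delimiter='|')
    writer = csv.writer(out_file, delimiter='|', lineterminator='\n')
    writer.writerow(next(reader))  # header line
    kept = 0
    for row in reader:
        if not row:  # skip blank lines
            continue
        # assumption: ID is the first field, Date is the last field
        row_id, date = row[0], row[-1]
        if row_id in id_set or date in date_set:
            writer.writerow(row)
            kept += 1
    return kept


# in-memory stand-ins for file1.txt and output.txt
sample = io.StringIO(
    "ID|Name|Date\n1|A|2017-12-19\n2|B|2017-12-20\n3|C|2017-12-21\n"
)
result = io.StringIO()
kept = filter_records(sample, result, {'1', '2'}, {'2017-12-19', '2017-12-20'})
print(result.getvalue())
```

With real files, the two StringIO objects would be replaced by open('file1.txt', newline='') and open('output.txt', 'w', newline='').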
If id_set = {1} and date_set = set(), is the output 1|A|2017-12-19 or an empty file?

@Aran-Fey, the output is 1|A|2017-12-19.

From the OP's reply to Aran-Fey, it seems they are looking for 'or', not 'and'. (Also, you are missing at least one #, so this code as posted is a syntax error…)

@abarnert Cheers, I completely missed that for some reason. Fixed the code so that it makes more sense.
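The case raised in the comments can be checked with a small sketch that compares the two ways of combining the conditions. The rows list is a hypothetical in-memory copy of file1.txt, and the ID is written as the string '1' rather than the integer 1 because fields read from a text file are strings.

```python
# hypothetical in-memory copy of the data rows in file1.txt
rows = [
    ("1", "A", "2017-12-19"),
    ("2", "B", "2017-12-20"),
    ("3", "C", "2017-12-21"),
]

id_set = {"1"}
date_set = set()  # empty: no date can match

# 'or': a row is kept if either its ID or its date matches
matched_or = [r for r in rows if r[0] in id_set or r[2] in date_set]

# 'and': a row is kept only if both its ID and its date match
matched_and = [r for r in rows if r[0] in id_set and r[2] in date_set]

print(matched_or)
print(matched_and)
```

With 'or' the row for ID 1 is kept, which is the behaviour the OP confirmed; with 'and' nothing is kept, because an empty date_set can never match.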