
Python: finding elements of a set in lines read from a file


I have a text file, file1.txt, that uses | as its delimiter:

ID|Name|Date
1|A|2017-12-19   
2|B|2017-12-20
3|C|2017-12-21
along with sets of values to match against (the answers below call them id_set and date_set).

I just want to find the records in the file that match an element of the sets, and write those records from file1.txt to output.txt.

Expected output: output.txt should contain the following data:

ID|Name|Date
1|A|2017-12-19   
2|B|2017-12-20

If you are willing to use a third-party library, you can use pandas:

import pandas as pd
from io import StringIO

mystr = StringIO("""ID|Name|Date
1|A|2017-12-19
2|B|2017-12-20
3|C|2017-12-21""")

# replace mystr with 'file1.txt'
df = pd.read_csv(mystr, sep='|')

# criteria
id_set = {'1', '2'}
date_set = {'2017-12-19', '2017-12-20'}

# apply criteria
df2 = df[df['ID'].astype(str).isin(id_set) | df['Date'].isin(date_set)]

print(df2)

#   ID Name        Date
# 0  1    A  2017-12-19
# 1  2    B  2017-12-20

# export to a pipe-delimited file; index=False leaves out the row index column
df2.to_csv('file1_out.txt', sep='|', index=False)

You can try this solution:

id_set = {'1', '2'}
date_set = {'2017-12-19', '2017-12-20'}

# open files for reading and writing
with open('file1.txt') as in_file, open('output.txt', 'w') as out_file:

    # copy the header line across unchanged
    out_file.write(next(in_file))

    # go over the remaining lines in the file
    for line in in_file:

        # extract id and date (record_id avoids shadowing the built-in id)
        record_id, _, date = line.rstrip().split('|')

        # keep lines that have an id or a date in the sets
        if record_id in id_set or date in date_set:
            out_file.write(line)
This produces the following output.txt:

ID|Name|Date
1|A|2017-12-19
2|B|2017-12-20
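The filtering step can also be separated from the file handling, which makes it easy to try on a list of strings before touching real files. This is only a sketch under the assumptions of the question (three |-separated columns, string IDs); the name matching_lines and the choice to skip malformed lines are hypothetical, not part of the answer above.

```python
def matching_lines(lines, id_set, date_set):
    """Yield the header, then every line whose ID or Date is in a set.

    Hypothetical helper: `lines` is any iterable of text lines, such as
    an open file or a list of strings.
    """
    lines = iter(lines)
    header = next(lines, None)
    if header is None:
        return  # empty input: nothing to yield
    yield header
    for line in lines:
        fields = line.rstrip().split('|')
        if len(fields) != 3:
            continue  # assumption: skip blank or malformed lines
        record_id, _, date = fields
        if record_id in id_set or date in date_set:
            yield line


sample = [
    'ID|Name|Date\n',
    '1|A|2017-12-19\n',
    '2|B|2017-12-20\n',
    '3|C|2017-12-21\n',
]
result = list(matching_lines(sample, {'1', '2'}, {'2017-12-19', '2017-12-20'}))
print(''.join(result))
```

With real files, the same generator could be passed straight to out_file.writelines(matching_lines(in_file, id_set, date_set)).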


If id_set={1} and date_set=set(), is the output 1|A|2017-12-19 or an empty file?

@Aran-Fey, the output is 1|A|2017-12-19.

Judging from the OP's reply to Aran-Fey, they seem to want a different condition from the one posted. (Also, you are missing at least one #, so this code as posted is a syntax error...)

@abarnert Cheers, completely missed that for some reason. Fixed the code so it makes more sense.
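The edge case raised in the comments can be checked directly. The following is a minimal sketch, assuming string IDs as in the answer; the helper keep and its require_both flag are hypothetical names used only for illustration.

```python
def keep(record_id, date, id_set, date_set, require_both=False):
    """Decide whether a record matches.

    require_both=False: keep the record if either field matches (or).
    require_both=True: keep it only if both fields match (and).
    """
    id_match = record_id in id_set
    date_match = date in date_set
    if require_both:
        return id_match and date_match
    return id_match or date_match


id_set = {'1'}
date_set = set()  # empty: no date can ever match

either = keep('1', '2017-12-19', id_set, date_set)
both = keep('1', '2017-12-19', id_set, date_set, require_both=True)
print(either, both)  # True False
```

With an empty date_set, only the either-field test keeps the 1|A|2017-12-19 record, which is the output the OP reported expecting.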