Python: filter rows based on a field match from another file
I generated a list with some other Python code, shown below. It holds several rows; each row is a comma-separated string enclosed in single quotes. I am trying to filter the rows by matching the D: column against another file that contains only the leading digits of the numbers.
data = ['A:SET, B:FW.O, C:AS, D:+18700000, E:+12355, F:ROOT', 'A:SET, B:IT, C:AS, D:+22211111, E:+12355, F:ROOT', 'A:SET, B:FW.O, C:AS, D:+177232, E:+12355', 'A:SET, B:IT, C:AS, D:+368399793, E:+12355']
Written out one row per line, each in single quotes, it looks like this:
[
'A:SET, B:FW.O, C:AS, D:+18700000, E:+12355, F:ROOT',
'A:SET, B:IT, C:AS, D:+22211111, E:+12355, F:ROOT',
'A:SET, B:FW.O, C:AS, D:+177232, E:+12355',
'A:SET, B:IT, C:AS, D:+368399793, E:+12355'
]
I have another file that holds the filter numbers to match against the list above:
cat fields.txt
+36
+18
#these are country prefixes
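I need to match the D: column of the list above against the leading numbers in the "fields.txt" file and print only the rows that match. The D: numbers in "data" are different every time, so I have to filter on their country prefix.
Expected output: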
[
'A:SET, B:FW.O, C:AS, D:+18700000, E:+12355, F:ROOT', ###matched as starting num +18 in D: col
'A:SET, B:IT, C:AS, D:+368399793, E:+12355' ###matched as starting num +36 in D: col
]
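I have tried all sorts of examples to write a "for" loop that matches the numbers, but without success.
Please help. I am new to Python programming.

I think this solution fits your needs: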
with open("fields.txt") as f:
    codes = f.read().splitlines()

data = ['A:SET, B:FW.O, C:AS, D:+18700000, E:+12355, F:ROOT',
        'A:SET, B:IT, C:AS, D:+22211111, E:+12355, F:ROOT',
        'A:SET, B:FW.O, C:AS, D:+177232, E:+12355',
        'A:SET, B:IT, C:AS, D:+368399793, E:+12355']

for index, item in enumerate(data):
    # remove spaces and get each individual field
    sub_items = item.replace(" ", "").split(",")
    # you can replace this for loop with sub_items[3] if the position of D: is fixed
    for sub_item in sub_items:
        if sub_item.startswith("D:"):
            value = sub_item.replace("D:", "")  # here you have +xxxx in the data point
            # you can apply the logic here:
            for code in codes:
                if value.startswith(code):
                    print(code, value, index, data[index])
If fields.txt contains the numbers mentioned in your question, it prints the following lines:
+18 +18700000 0 A:SET, B:FW.O, C:AS, D:+18700000, E:+12355, F:ROOT
+36 +368399793 3 A:SET, B:IT, C:AS, D:+368399793, E:+12355
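
As a possible variation on the loop above, each row can be parsed into a dictionary once and the D: value checked against all prefixes in a single call, since str.startswith accepts a tuple. This is only a sketch: the helper names parse_row and filter_by_prefix are made up for illustration, and the prefixes are hard-coded here rather than read from fields.txt.

```python
def parse_row(row):
    """Turn 'A:SET, B:IT, D:+222' into {'A': 'SET', 'B': 'IT', 'D': '+222'}."""
    fields = {}
    for part in row.split(","):
        # split each 'KEY:VALUE' piece at the first colon only
        key, _, value = part.strip().partition(":")
        fields[key] = value
    return fields


def filter_by_prefix(rows, prefixes, column="D"):
    """Keep the rows whose `column` value starts with any of `prefixes`."""
    prefixes = tuple(prefixes)  # str.startswith accepts a tuple of candidates
    return [row for row in rows
            if parse_row(row).get(column, "").startswith(prefixes)]


data = [
    'A:SET, B:FW.O, C:AS, D:+18700000, E:+12355, F:ROOT',
    'A:SET, B:IT, C:AS, D:+22211111, E:+12355, F:ROOT',
    'A:SET, B:FW.O, C:AS, D:+177232, E:+12355',
    'A:SET, B:IT, C:AS, D:+368399793, E:+12355',
]

for row in filter_by_prefix(data, ["+36", "+18"]):
    print(row)
```

Rows without a D: field are skipped rather than raising an error, because the lookup falls back to an empty string.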
I don't think there is any need to split each item in the data list. You can simply do:
data = [
'A:SET, B:FW.O, C:AS, D:+18700000, E:+12355, F:ROOT',
'A:SET, B:IT, C:AS, D:+22211111, E:+12355, F:ROOT',
'A:SET, B:FW.O, C:AS, D:+177232, E:+12355',
'A:SET, B:IT, C:AS, D:+368399793, E:+12355'
]
with open("fields.txt") as f:
    codes = f.read().splitlines()

required = []
for item in data:
    for code in codes:
        if "D:%s" % code in item:
            required.append(item)
            break  # stop at the first matching code so a row is added only once
print(required)
You will end up with:
[
'A:SET, B:FW.O, C:AS, D:+18700000, E:+12355, F:ROOT',
'A:SET, B:IT, C:AS, D:+368399793, E:+12355'
]
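
One thing to watch with a substring test like this: if fields.txt has a blank line, splitlines() yields an empty string, the pattern becomes just "D:", and every row matches. The sketch below is one way to guard against that. It assumes comments in the file start with "#", as in the question's listing, and read_prefixes is a hypothetical helper name.

```python
def read_prefixes(lines):
    """Return the non-empty prefixes, ignoring blank lines and '#' comments."""
    prefixes = []
    for line in lines:
        # drop anything after '#' and trim surrounding whitespace
        line = line.split("#", 1)[0].strip()
        if line:
            prefixes.append(line)
    return prefixes


# An open file iterates over its lines, so the same call works on a file:
#     with open("fields.txt") as f:
#         codes = read_prefixes(f)
sample = ["+36\n", "+18\n", "\n", "#these are country prefixes\n"]
print(read_prefixes(sample))
```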
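You can do this with a list comprehension that includes an if condition. The advantage is that the logic deciding which rows to include or exclude is neatly tucked away in a separate function (matches in the example below).
Having a separate function makes it easy to test, and you can add a docstring to it, which makes the code easier to maintain.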
data = [
"A:SET, B:FW.O, C:AS, D:+18700000, E:+12355, F:ROOT",
"A:SET, B:IT, C:AS, D:+22211111, E:+12355, F:ROOT",
"A:SET, B:FW.O, C:AS, D:+177232, E:+12355",
"A:SET, B:IT, C:AS, D:+368399793, E:+12355",
]

def load_codes():
    with open("fields.txt") as fieldfile:
        codes = fieldfile.read().splitlines()
    return codes


def matches(row, codes):
    for code in codes:
        if "D:%s" % code in row:
            return True
    return False


def main():
    codes = load_codes()
    filtered = [row for row in data if matches(row, codes)]
    for row in filtered:
        print(row)


if __name__ == "__main__":
    main()
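
To illustrate the testability point, here is a small sketch of what a docstring and a few checks for matches might look like. The function body is rewritten with any() but keeps the same substring test; test_matches is a hypothetical name, and plain assert statements are used so it runs without a test framework.

```python
def matches(row, codes):
    """Return True if the row contains 'D:' immediately followed by one of the codes."""
    return any("D:%s" % code in row for code in codes)


def test_matches():
    codes = ["+36", "+18"]
    # rows whose D: value starts with a listed prefix are accepted
    assert matches("A:SET, B:FW.O, C:AS, D:+18700000, E:+12355, F:ROOT", codes)
    assert matches("A:SET, B:IT, C:AS, D:+368399793, E:+12355", codes)
    # rows with other prefixes are rejected
    assert not matches("A:SET, B:IT, C:AS, D:+22211111, E:+12355, F:ROOT", codes)
    assert not matches("A:SET, B:FW.O, C:AS, D:+177232, E:+12355", codes)
    # with no codes at all, nothing matches
    assert not matches("A:SET, B:IT, C:AS, D:+368399793, E:+12355", [])


test_matches()
print("all checks passed")
```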
Thanks. It meets my requirements, but I would prefer to use another approach, although I am not sure which one is best. Thanks again.
Thanks, dennohpeter. This is very helpful and does exactly what I need.