Python: managing multiple lists with multiple tuples
I have about 2000 lines of data in the following format:
.
.
[(0, 1, 'Blank', ''), (0, 3, 'NAME', 'Ayiesha Woods'), (5, 10, 'DOB', 'July 2 , 1979'), (10, 13, 'LOC', 'Long Island'), (13, 16, 'LOC', 'New York')]
[(0, 1, 'Blank', ''), (0, 3, 'NAME', 'Craig Rivera'), (7, 12, 'DOB', 'October 10 , 1954'), (5, 7, 'LOC', 'Manhattan')]
[(0, 1, 'Blank', ''), (0, 4, 'NAME', 'Margery Pitt Durant'), (14, 16, 'LOC', 'Flint'), (6, 11, 'DOB', 'May 24 , 1887'), (16, 18, 'LOC', 'Michigan')]
[(0, 1, 'Blank', ''), (0, 3, 'NAME', 'Austin Watson'), (10, 13, 'LOC', 'Ann Arbor'), (13, 15, 'LOC', 'Michigan'), (4, 9, 'DOB', 'January 13 , 1992')]
[(0, 1, 'Blank', ''), (0, 3, 'NAME', 'Gary Spatz'), (5, 8, 'LOC', 'New York'), (16, 19, 'LOC', 'New York'), (19, 21, 'LOC', 'Miami'), (21, 23, 'LOC', 'Florida'), (8, 13, 'DOB', 'April 1 , 1951')]
.
.
.
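They are basically many lists, each holding a person's details (name, DOB, LOC and so on) in separate tuples.
I want to extract every person's name and the corresponding DOB in the following format: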
('Ayiesha Woods', 'DateOfBirth', 'July 2 , 1979')
('Craig Rivera', 'DateOfBirth', 'October 10 , 1954')
('Vance Trimble', 'DateOFBirth', 'July 6 , 1913')
and so on.
Here is my attempt:
temp = "DateOFBirth"
results = []
for n1 in text:
    for n2 in text:
        if n1 is not n2:
            if text[1][2] == 'NAME' and text[2][2] == 'DOB':
                rel = text[1][3], temp, text[2][3]
                print(rel)
                results.append(rel)
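This only produces output when the NAME tuple is at position 1 in the list and the DOB tuple is at position 2, which is not always the case.
What should I do if I want the output regardless of where the NAME tuple or the DOB tuple sits in the list?
Edit:
I have a list containing tuples, such as: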
text = [(0, 1, 'Blank', ''), (0, 3, 'NAME', 'Vance Trimble'), (5, 7, 'LOC', 'Harrison'), (7, 9, 'LOC', 'Arkansas'), (9, 14, 'DOB', 'July 6 , 1913')]
I want to extract the data in the following format:
('Vance Trimble', 'DateOFBirth', 'July 6 , 1913')
My code:
temp = "DateOFBirth"
if text[1][2] == 'NAME' and text[4][2] == 'DOB':
    rel = text[1][3], temp, text[4][3]
    print(rel)
How can I do this without hard-coding the positions, as in:
text[1][2] == 'NAME' and text[4][2] == 'DOB'
so that it searches the list for 'NAME' and 'DOB' by itself and produces the output?
Break the problem down into simple steps:
results = []
for rec in records:
    result = ["", "DateOfBirth", ""]
    for item in rec:
        if "NAME" in item:
            result[0] = item[3]
        elif "DOB" in item:
            result[2] = item[3]
    results.append(tuple(result))
print(results)
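The loop above relies on a variable records (one list of tuples per person) that the answer does not define. Here is a minimal sketch of the same idea with rows copied from the question. It compares the tag field at index 2 explicitly and skips rows that lack a name or a date. The function name extract_dob, the skipping behavior, and the third sample row are assumptions for illustration, not something the question specifies.

```python
# Sample rows copied from the question; `records` is a hypothetical name.
records = [
    [(0, 1, 'Blank', ''), (0, 3, 'NAME', 'Craig Rivera'),
     (7, 12, 'DOB', 'October 10 , 1954'), (5, 7, 'LOC', 'Manhattan')],
    [(0, 1, 'Blank', ''), (0, 3, 'NAME', 'Austin Watson'),
     (10, 13, 'LOC', 'Ann Arbor'), (13, 15, 'LOC', 'Michigan'),
     (4, 9, 'DOB', 'January 13 , 1992')],
    # Made-up row without NAME or DOB, to show how missing tags are handled.
    [(0, 1, 'Blank', ''), (5, 7, 'LOC', 'Manhattan')],
]

def extract_dob(records, label="DateOfBirth"):
    """Return (name, label, dob) for every row that has both a NAME and a DOB."""
    results = []
    for rec in records:
        name = dob = None
        # Each tuple is (start, end, tag, value); only tag and value matter here.
        for _start, _end, tag, value in rec:
            if tag == 'NAME' and name is None:
                name = value
            elif tag == 'DOB' and dob is None:
                dob = value
        if name is not None and dob is not None:
            results.append((name, label, dob))
    return results

results = extract_dob(records)
print(results)
```

Comparing item[2] == 'NAME' rather than testing 'NAME' in item avoids a false match in the unlikely case that a value field itself is the string 'NAME'.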
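I suggest writing a helper function that retrieves the information from the data. I am also assuming that you are working with lists of tuples.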
test_list = [[(0, 1, 'Blank', ''),
(0, 3, 'NAME', 'Ayiesha Woods'),
(5, 10, 'DOB', 'July 2 , 1979'),
(10, 13, 'LOC', 'Long Island'),
(13, 16, 'LOC', 'New York')],
[(0, 1, 'Blank', ''),
(0, 3, 'NAME', 'Craig Rivera'),
(7, 12, 'DOB', 'October 10 , 1954'),
(5, 7, 'LOC', 'Manhattan')],
[(0, 1, 'Blank', ''),
(0, 4, 'NAME', 'Margery Pitt Durant'),
(14, 16, 'LOC', 'Flint'),
(6, 11, 'DOB', 'May 24 , 1887'),
(16, 18, 'LOC', 'Michigan')],
[(0, 1, 'Blank', ''),
(0, 3, 'NAME', 'Austin Watson'),
(10, 13, 'LOC', 'Ann Arbor'),
(13, 15, 'LOC', 'Michigan'),
(4, 9, 'DOB', 'January 13 , 1992')],
[(0, 1, 'Blank', ''),
(0, 3, 'NAME', 'Gary Spatz'),
(5, 8, 'LOC', 'New York'),
(16, 19, 'LOC', 'New York'),
(19, 21, 'LOC', 'Miami'),
(21, 23, 'LOC', 'Florida'),
(8, 13, 'DOB', 'April 1 , 1951')]]
# Helper function
def get_person_info(lst):
    person_name = list(filter(lambda x: 'NAME' in x, lst))[0][3:]
    person_dob = list(filter(lambda x: 'DOB' in x, lst))[0][2:4]
    return person_name + person_dob

# Use it with map
list(map(get_person_info, test_list))
Output:
[('Ayiesha Woods', 'DOB', 'July 2 , 1979'),
('Craig Rivera', 'DOB', 'October 10 , 1954'),
('Margery Pitt Durant', 'DOB', 'May 24 , 1887'),
('Austin Watson', 'DOB', 'January 13 , 1992'),
('Gary Spatz', 'DOB', 'April 1 , 1951')]
Testing the helper function with text:
text = [(0, 1, 'Blank', ''), (0, 3, 'NAME', 'Vance Trimble'), (5, 7, 'LOC', 'Harrison'), (7, 9, 'LOC', 'Arkansas'), (9, 14, 'DOB', 'July 6 , 1913')]
get_person_info(text)
## ('Vance Trimble', 'DOB', 'July 6 , 1913')
You can easily replace 'DOB' with 'DateOFBirth'.
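The helper above raises an IndexError when a row has no NAME or DOB tuple, because list(filter(...))[0] indexes into an empty list. Here is a hedged variant that uses next() with a default and writes the 'DateOFBirth' label directly. The name get_person_info_safe and the choice of None for missing fields are assumptions made for this sketch.

```python
def get_person_info_safe(lst, label='DateOFBirth'):
    """Return (name, label, dob); a missing NAME or DOB becomes None."""
    # next() stops at the first matching tuple and falls back to None.
    name = next((item[3] for item in lst if item[2] == 'NAME'), None)
    dob = next((item[3] for item in lst if item[2] == 'DOB'), None)
    return (name, label, dob)

text = [(0, 1, 'Blank', ''), (0, 3, 'NAME', 'Vance Trimble'),
        (5, 7, 'LOC', 'Harrison'), (7, 9, 'LOC', 'Arkansas'),
        (9, 14, 'DOB', 'July 6 , 1913')]

info = get_person_info_safe(text)
print(info)
```

It can be used with map in the same way as the original helper, for example list(map(get_person_info_safe, test_list)).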
You can do it like this:
temp = "DateOFBirth"
text = [(0, 1, 'Blank', ''), (0, 3, 'NAME', 'Vance Trimble'), (5, 7, 'LOC', 'Harrison'), (7, 9, 'LOC', 'Arkansas'), (9, 14, 'DOB', 'July 6 , 1913')]
rel = []
for i in text:
    if 'NAME' in i:
        rel.append(i[i.index('NAME')+1])
        rel.append(temp)
    elif 'DOB' in i:
        rel.append(i[i.index('DOB')+1])
print(rel)
# result:
# ['Vance Trimble', 'DateOFBirth', 'July 6 , 1913']
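This way the result does not depend on where the items 'NAME' and 'DOB' sit inside the tuple, but it only works if the actual value always comes directly after the tag, as here: (0, 3, 'NAME', 'Vance Trimble'), where the actual name comes right after 'NAME'.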
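The "value follows the tag" rule can be wrapped in a small function that also guards against a tag sitting in the last slot of a tuple. This is only a sketch: value_after is a hypothetical helper, and the second tuple in row is deliberately rearranged (it does not come from the question's data) to show that the tag's position inside the tuple does not matter.

```python
def value_after(tag, item):
    """Return the element right after `tag` in `item`, or None if there is none."""
    if tag in item:
        pos = item.index(tag)
        if pos + 1 < len(item):
            return item[pos + 1]
    return None

# The second tuple uses a made-up layout with the tag in front.
row = [(0, 3, 'NAME', 'Vance Trimble'), ('DOB', 'July 6 , 1913', 9, 14)]

name = next((value_after('NAME', i) for i in row if 'NAME' in i), None)
dob = next((value_after('DOB', i) for i in row if 'DOB' in i), None)
rel = (name, 'DateOFBirth', dob)
print(rel)
```

Unlike the loop above, this builds the result from named fields, so the order of the output no longer depends on whether the NAME tuple appears before the DOB tuple in the list.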
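Please provide an exact list to try your code with. From the information you have given, it is hard to reproduce your code or even to understand what exactly you are asking... What is your data type? How did you get text? What format is it in? I am not sure what the nested for loops are about. Perhaps there are other requirements you have not mentioned?
I have edited my question, using a simpler version.
Use assignment =, not equality ==.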
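The comments above ask how the data is obtained, and the question does not say. If, and this is only an assumption, each row is stored as one line of text that looks like a Python literal, ast.literal_eval from the standard library can turn it into a real list of tuples. The name raw_lines and its contents are hypothetical stand-ins for lines read from a file.

```python
import ast

# Hypothetical input: one list literal per line, plus filler lines such as ".".
raw_lines = [
    "[(0, 1, 'Blank', ''), (0, 3, 'NAME', 'Craig Rivera'), "
    "(7, 12, 'DOB', 'October 10 , 1954'), (5, 7, 'LOC', 'Manhattan')]",
    ".",
]

records = []
for line in raw_lines:
    line = line.strip()
    if not line.startswith('['):
        continue  # skip filler lines that are not list literals
    # literal_eval only evaluates literals, so it is safer than eval().
    records.append(ast.literal_eval(line))

print(records[0][1])
```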
What are the list indexes at the end of the filter lines doing? I can't seem to figure them out.
filter does not return a list in Python 3, so you have to force the output into a list and then take the first element (the tuple at index 0).
I meant the slice, [3:]. I may be a bit dense.
That is to get a tuple, so that I can use + later to concatenate the name and the DOB. If I only indexed the element at position 3, it would return a string, not a tuple.
Light dawns. Thanks for your patience.