Dict and List operations in Python

python arrays list dictionary

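I have two files: one contains keys, the other contains keys and values. I need to match the keys from the key file and extract the corresponding values from the other file. When all keys and values are in plain column format, I can write the keys and values to a new file without trouble. What I don't understand is how to get the result when the values are of set/array type.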

Input one, in column format:

5216 3911 2 761.00 
2503 1417 13 102866.00
5570 50 2 3718.00 
5391 1534 3 11958.00 
5015 4078 1 817.00 
3430 299 1 5119.00 
4504 3369 2 3218.00  
4069 4020 2 17854.00 
5164 4163 1 107.00 
3589 3026 1 7363.00

Input two, in column format. Each row is a key pair, i.e. col[0] and col[1] together form the key:

3350 2542
4750 2247
5305 3341

The output for the input case above, which is correct for me:

5391 1534 3 11958.00 
5015 4078 1 817.00 
3430 299 1 5119.00 
4504 3369 2 3218.00

Program

from collections import defaultdict

edges = {}
with open('Input_1.txt', 'r') as edge_data:
    for row in edge_data:
        col = row.strip().split()
        edges[col[0], col[1]] = col[2], col[3]

# Then, to do the queries, read through the key file and write out the matches:
with open('Input_2', 'r') as classified_data:
    with open('Output', 'w') as outfile:
        for row in classified_data:
            a, b = row.strip().split()
            c = edges.get((a, b), edges.get((b, a)))

            #print a,b, edges.get((a,b), edges.get((b,a)))
            #print a,b,c
            outfile.write("%s %s %s\n" % (a, b, c))

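The program above works well for the input type given above, but I don't know how to handle the input shown below.

I know I should change this statement in the program, but I don't know what to change it to: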

edges[col[0], col[1]] = col[2], col[3]

New input one

('3350', '2542') [6089.0, 4315.0] 
('2655', '1411') [559.0, 1220.0, 166.0, 256.0, 146.0, 528.0, 1902.0, 880.0, 2317.0, 2868.0] 
('4212', '1613') [150.0, 14184.0, 4249.0, 1250.0, 10138.0, 4281.0, 2846.0, 2205.0, 1651.0, 335.0, 5233.0, 149.0, 6816.0] 
('4750', '2247') [3089.0] 
('5305', '3341') [13122.0]

New input two. Each row is a key pair, i.e. col[0] and col[1] together form the key:

3350 2542
4750 2247
5305 3341

The expected output is

3350 2542 6089.0
3350 2542 4315.0
4750 2247 3089.0
5305 3341 13122.0

Use pattern matching

import re
# Capture the two quoted key numbers and the raw contents of the bracketed list.
rec = re.compile(r"\('(\d+)',\s*'(\d+)'\)\s*\[(.*)\]")
matches = rec.match("('3350', '2542') [6089.0, 4315.0]")
print(matches.groups())
print(int(matches.group(1)))
print(int(matches.group(2)))
print(list(map(float, matches.group(3).split(','))))

The output is

('3350', '2542', '6089.0, 4315.0')
3350
2542
[6089.0, 4315.0]

Store the data

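# The keys are stored as ints here, so keys read from the second file
# must also be converted with int() before they are looked up.
a = int(matches.group(1))
b = int(matches.group(2))
data = list(map(float, matches.group(3).split(',')))
edges[a,b] = data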

Retrieve the data and print the output

# Fall back to an empty list so an unknown pair prints nothing instead of failing.
c = edges.get((a,b), edges.get((b,a), []))
for value in c:
    print("%s %s %s" % (a, b, value))
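
Putting the three steps together, a minimal end-to-end sketch might look like the following (Python 3). The helper names parse_edges and lookup are made up for illustration, the keys are kept as strings so they compare equal to what is read from the second file, and the sample lines are copied from the question rather than read from disk.

```python
import re

# Same pattern as above: a quoted key pair followed by a bracketed list.
REC = re.compile(r"\('(\d+)',\s*'(\d+)'\)\s*\[(.*)\]")

def parse_edges(lines):
    """Build {(a, b): [values]} from lines like "('3350', '2542') [6089.0, 4315.0]"."""
    edges = {}
    for line in lines:
        m = REC.match(line.strip())
        if m is None:
            continue  # skip blank or malformed lines
        a, b, raw = m.groups()
        edges[a, b] = [float(x) for x in raw.split(",") if x.strip()]
    return edges

def lookup(edges, pairs):
    """Yield one "a b value" string per value, trying (a, b) and then (b, a)."""
    for a, b in pairs:
        values = edges.get((a, b))
        if values is None:
            values = edges.get((b, a), [])
        for value in values:
            yield "%s %s %s" % (a, b, value)

# Sample data taken from the question (hypothetical in-memory stand-ins for the files).
input_one = [
    "('3350', '2542') [6089.0, 4315.0] ",
    "('4750', '2247') [3089.0] ",
    "('5305', '3341') [13122.0]",
]
input_two = ["3350 2542", "4750 2247", "5305 3341"]

edges = parse_edges(input_one)
result = list(lookup(edges, (row.split() for row in input_two)))
for line in result:
    print(line)
```

With real files, the two lists would be replaced by the open file objects, since both helpers only iterate over lines.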

I would suggest splitting the string on a different character, say ')'. So you could do this:

with open('Input_1.txt', 'r') as edge_data:    
    for row in edge_data:
        col = row.strip().split(')')

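Then you need to convert the string representations of the tuple and the list into a form you can work with. You can use eval() for this. Note that split(')') removes the closing parenthesis from the tuple part, so it has to be appended again before evaluating.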
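
Because eval() executes whatever expression the file happens to contain, a more cautious variant is ast.literal_eval from the standard library, which only accepts Python literals. The following is a minimal sketch of that swap (Python 3), assuming every line has exactly the "(key tuple) [list]" shape shown in the question; parse_line is a hypothetical helper name, not something from the answer.

```python
import ast

def parse_line(row):
    """Split "('3350', '2542') [6089.0, 4315.0]" into a tuple key and a list value."""
    key_part, value_part = row.strip().split(")", 1)
    # split() consumed the ")", so put it back before evaluating the tuple.
    key = ast.literal_eval(key_part + ")")
    values = ast.literal_eval(value_part.strip())
    return key, values

key, values = parse_line("('4750', '2247') [3089.0] ")
print(key, values)
```

Each parsed pair can then be stored with edges[key] = values inside the reading loop.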

Now you have a dictionary, edges, whose keys match the tuples in file one and whose values hold the associated lists.

When you read in file two, you will need to add another loop to iterate over the entries in the list. For example:

for c in edges[(a,b)]:
    outfile.write("%s %s %s\n" % (a,b,c))

This lets you write one line to the output file for every entry in the list that was read in from the first file.
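
For a self-contained illustration of that nested loop, the sketch below (Python 3) uses io.StringIO objects in place of the real files, an assumption made so that it runs without Input_2 or Output on disk, and a hand-built edges dictionary with values from the question. The pair "9999 1111" is made up to show that keys missing from edges are simply skipped.

```python
import io

# Hand-built stand-in for the dictionary produced while reading file one.
edges = {
    ("3350", "2542"): [6089.0, 4315.0],
    ("4750", "2247"): [3089.0],
}

# In-memory stand-ins for the 'Input_2' and 'Output' files.
classified_data = io.StringIO("3350 2542\n4750 2247\n9999 1111\n")
outfile = io.StringIO()

for row in classified_data:
    a, b = row.split()
    # Try the pair in both orders; fall back to an empty list if neither exists.
    values = edges.get((a, b)) or edges.get((b, a)) or []
    for c in values:
        outfile.write("%s %s %s\n" % (a, b, c))

output_text = outfile.getvalue()
print(output_text, end="")
```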

I think @three_Pinepples's eval approach is a very good one.

Here is an alternative that only manipulates strings:

edges = {}
with open("Input_1.txt", "r") as edge_data:
    for row in edge_data:
        k, v = row.strip().split(")")  # split into key and value parts
        # strip the quotes and parenthesis, then join the pair as "a b"
        k = " ".join(i.strip("'") for i in k.strip("(").split(", "))
        v = v.strip(" []").split(", ")  # list of value strings
        edges[k] = v

with open("Input_2", "r") as classified_data:
    for row in classified_data:
        k = row.strip()  # must be "a b" with a single space to match the keys above
        for v in edges.get(k, []):
            print("%s %s" % (k, v))

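What is the actual output?

@TimCastelijns I have included more details. If you don't understand or don't want to answer, please leave the question to others and don't bother, but at least don't place wrong flags or close flags. It may not matter to you, but a solution to this problem really matters to me.

If anyone finds what you are asking unclear, they are free to flag your question. It then goes into a review queue, and more experienced users in the community decide whether the flag is justified. Whether the content of the question matters to anyone is a separate issue. In short: don't worry.

Thanks for the reply, but '3350', '2542' as a pair is the key, 6089.0 is one value of the pair, and 4315.0 is the second value of the same pair.

Yes, just as your earlier code used col[0], col[1] as a tuple, use group(1), group(2) as a tuple :D (I have updated the answer so that the value is a list.)

Oh, just iterate over the list of mapped values. When you find a tuple in edges, iterate over all of its values and print them.

That's great. This is exactly what I was struggling with. Thanks.

@SitzBlogz What exactly did you find confusing? I deliberately did not give you an answer you could copy and paste, so that you would learn something.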