Python: comparing lists when the order of items within a given column does not matter

Tags: python, python-2.7

I want to compare two lists. The code below does that, but not quite in the way I need.

Currently it outputs:

Lines only found in TEST_1: 
4   6034    L   LAL,LALLAL
5   4231    N   AD

Lines only found in TEST_2: 
4   6034    L   LALLAL,LAL
5   4231    N   PL
6   5231    T   PAL

Lines match in both:    
1   1231    L   LA
1   1234    L   T
2   1434    A   C
3   1634    L   T
What I want is:

Lines only found in TEST_1: 
5   4231    N   AD

Lines only found in TEST_2: 
5   4231    N   PL
6   5231    T   PAL

Lines match in both:    
1   1231    L   LA
1   1234    L   T
2   1434    A   C
3   1634    L   T
4   6034    L   LAL,LALLAL
How can I make the order of the data in the last column irrelevant, so that
4   6034    L   LAL,LALLAL
matches
4   6034    L   LALLAL,LAL

Sample code:

TEST_1 = [['1', '1231', 'L', 'LA'],['1', '1234', 'L', 'T'],
    ['2', '1434', 'A', 'C'],['3', '1634', 'L', 'T'],
    ['4', '6034', 'L', 'LAL,LALLAL'],['5', '4231', 'N', 'AD']]

TEST_2 = [['1', '1231', 'L', 'LA'],['1', '1234', 'L', 'T'],
    ['2', '1434', 'A', 'C'],['3', '1634', 'L', 'T'],
    ['4', '6034', 'L', 'LALLAL,LAL'],['5', '4231', 'N', 'PL'],
    ['6', '5231', 'T', 'PAL']]

MATCH_1 = []
MATCH_2 = []
NO_MATCH_1 = []
NO_MATCH_2 = []

ENTRY = [TEST_1, TEST_2, MATCH_1, MATCH_2, NO_MATCH_1, NO_MATCH_2]

for i in range(0, 2):
    for word in ENTRY[i]:
        if word not in ENTRY[1-i]:
            ENTRY[4+i].append(word)
        else:
            ENTRY[2+i].append(word)

print 'Lines only found in TEST_1:\t'
for word in ENTRY[4]:
    print '\t'.join(word)

print '\nLines only found in TEST_2:\t'
for word in ENTRY[5]:
    print '\t'.join(word)

print '\nLines match in both:\t'
for word in ENTRY[2]:
    print '\t'.join(word)
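Given the TEST_1 and TEST_2 lists from the question, you want them to be sets so that you can use set operations (for example, {1,2,3,4} - {3,4,5,6} == {1,2}), so convert them to sets: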

TEST_2 = [['1', '1231', 'L', 'LA'],['1', '1234', 'L', 'T'],
    ['2', '1434', 'A', 'C'],['3', '1634', 'L', 'T'],
    ['4', '6034', 'L', 'LALLAL,LAL'],['5', '4231', 'N', 'PL'],
    ['6', '5231', 'T', 'PAL']]

TEST_1 = {tuple(frozenset(x.split(",")) for x in t) for t in TEST_1}
TEST_2 = {tuple(frozenset(x.split(",")) for x in t) for t in TEST_2}
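Note that I convert ['4', '6034', 'L', 'LALLAL,LAL'] into ({'4'}, {'6034'}, {'L'}, {'LALLAL', 'LAL'}), because sets have no order ({'LALLAL', 'LAL'} == {'LAL', 'LALLAL'}).

I also used frozensets and tuples because they are immutable and hashable, so they can be placed inside a set. Lists and ordinary sets cannot.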
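To make the hashability point concrete, here is a minimal sketch, assuming Python 3. The helper name `normalize` is made up for illustration and is not part of the answer above; it simply wraps the same conversion expression.

```python
# A list is mutable and therefore unhashable, so it cannot be a set element.
row = ['4', '6034', 'L', 'LALLAL,LAL']
try:
    {row}
    list_is_hashable = True
except TypeError:
    list_is_hashable = False


def normalize(row):
    """Hypothetical helper: turn each cell into a frozenset of its comma-separated items."""
    return tuple(frozenset(cell.split(",")) for cell in row)


# The two rows differ only in the order of items in the last cell.
a = normalize(['4', '6034', 'L', 'LAL,LALLAL'])
b = normalize(['4', '6034', 'L', 'LALLAL,LAL'])

print(list_is_hashable)  # False
print(a == b)            # True
print(len({a, b}))       # 1, both rows collapse to a single set element
```

Since every cell is split on commas, this treats all four columns as unordered, not just the last one. That is harmless for the sample data, where only the last column contains commas.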

Then you can print the results:

print("ONLY IN TEST 1:")
for thing in TEST_1 - TEST_2:
    print('\t'.join(",".join(x) for x in thing))

print()

print("ONLY IN TEST 2:")
for thing in TEST_2 - TEST_1:
    print('\t'.join(",".join(x) for x in thing))

print()

print("IN BOTH:")
for thing in TEST_1 & TEST_2:
    print('\t'.join(",".join(x) for x in thing))


#>>> ONLY IN TEST 1:
#>>> 5  4231    N   AD
#>>> 
#>>> ONLY IN TEST 2:
#>>> 5  4231    N   PL
#>>> 6  5231    T   PAL
#>>> 
#>>> IN BOTH:
#>>> 1  1231    L   LA
#>>> 3  1634    L   T
#>>> 4  6034    L   LAL,LALLAL
#>>> 1  1234    L   T
#>>> 2  1434    A   C
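Because sets are unordered, the order of the lines within each section of that output is not guaranteed, and neither is the order of the items in a joined cell such as LAL,LALLAL. The following is a minimal sketch of one way to get repeatable output, assuming Python 3. The names `normalize` and `format_row` are hypothetical helpers, not part of the answer above.

```python
TEST_1 = [['1', '1231', 'L', 'LA'], ['1', '1234', 'L', 'T'],
          ['2', '1434', 'A', 'C'], ['3', '1634', 'L', 'T'],
          ['4', '6034', 'L', 'LAL,LALLAL'], ['5', '4231', 'N', 'AD']]

TEST_2 = [['1', '1231', 'L', 'LA'], ['1', '1234', 'L', 'T'],
          ['2', '1434', 'A', 'C'], ['3', '1634', 'L', 'T'],
          ['4', '6034', 'L', 'LALLAL,LAL'], ['5', '4231', 'N', 'PL'],
          ['6', '5231', 'T', 'PAL']]


def normalize(row):
    """Same conversion as in the answer: a tuple of frozensets, one per cell."""
    return tuple(frozenset(cell.split(",")) for cell in row)


def format_row(row):
    """Join a normalized row back into text, sorting the items inside each cell."""
    return "\t".join(",".join(sorted(cell)) for cell in row)


set_1 = {normalize(r) for r in TEST_1}
set_2 = {normalize(r) for r in TEST_2}

# Sort the formatted lines so every run prints them in the same order.
only_1 = sorted(format_row(r) for r in set_1 - set_2)
only_2 = sorted(format_row(r) for r in set_2 - set_1)
both = sorted(format_row(r) for r in set_1 & set_2)

for title, lines in [("ONLY IN TEST 1:", only_1),
                     ("ONLY IN TEST 2:", only_2),
                     ("IN BOTH:", both)]:
    print(title)
    print("\n".join(lines))
    print("")
```

The lines are sorted as plain strings, so a row number of 10 would sort before 2. A numeric sort key would be needed if that matters for the real data.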

Your code is not very easy to read. I suggest using separate test, match, and no-match lists, and then using enumerate.
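One possible reading of that suggestion is sketched below, assuming Python 3. The names `row_key` and `compare` are hypothetical. Here only the last column is treated as unordered, the original rows are kept unchanged in their original order, and `enumerate` records each row's position in its list.

```python
TEST_1 = [['1', '1231', 'L', 'LA'], ['1', '1234', 'L', 'T'],
          ['2', '1434', 'A', 'C'], ['3', '1634', 'L', 'T'],
          ['4', '6034', 'L', 'LAL,LALLAL'], ['5', '4231', 'N', 'AD']]

TEST_2 = [['1', '1231', 'L', 'LA'], ['1', '1234', 'L', 'T'],
          ['2', '1434', 'A', 'C'], ['3', '1634', 'L', 'T'],
          ['4', '6034', 'L', 'LALLAL,LAL'], ['5', '4231', 'N', 'PL'],
          ['6', '5231', 'T', 'PAL']]


def row_key(row):
    """Hypothetical helper: hashable key that ignores item order in the last column."""
    return tuple(row[:-1]) + (frozenset(row[-1].split(",")),)


def compare(rows_1, rows_2):
    """Return (match, only_1, only_2) as lists of (position, original_row) pairs."""
    keys_1 = {row_key(r) for r in rows_1}
    keys_2 = {row_key(r) for r in rows_2}
    match, only_1, only_2 = [], [], []
    for position, row in enumerate(rows_1, start=1):
        if row_key(row) in keys_2:
            match.append((position, row))
        else:
            only_1.append((position, row))
    for position, row in enumerate(rows_2, start=1):
        if row_key(row) not in keys_1:
            only_2.append((position, row))
    return match, only_1, only_2


match, only_1, only_2 = compare(TEST_1, TEST_2)

for title, rows in [("Lines only found in TEST_1:", only_1),
                    ("Lines only found in TEST_2:", only_2),
                    ("Lines match in both:", match)]:
    print(title)
    for position, row in rows:
        print("\t".join(row))
    print("")
```

Matching rows are printed as they appear in TEST_1, so row 4 shows up as LAL,LALLAL, which is the output the question asks for.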