Python: comparing data in two files
I'm a beginner, so apologies for any confusion. I have two files. File A contains a list of the sample names I'm interested in. File B contains the data for all samples.
File A (no headers)
sample_A
sample_XA
sample_12754
samples_75t
File B
name description etc .....
sample_JA mm 0.01 0.1 1.2 0.018 etc
sample_A mm 0.001 1.2 0.8 1.4 etc
sample_XA hu 0.4 0.021 0.14 2.34 etc
samples_YYYY RN 0.0001 3.435 1.1 0.01 etc
sample_12754 mm 0.1 0.1 0.87 0.54 etc
sample_2248333 hu 0.43 0.01 0.11 2.32 etc
samples_75t mm 0.3 0.02 0.14 2.34 etc
I want to compare file A with file B and output the data from B, but only for the sample names listed in A.
I tried this:
#!/usr/bin/env python2
import csv
count = 0
import collections
samples = collections.defaultdict(list)
with open('FILEA.txt') as f:
    sites = [l.strip() for l in f if l.strip()]
### This gives me the correct list of samples for file A.
with open('FILEB','r') as inF:
    for line in inF:
        elements = line.split()
        if sites.intersection(elements):
            count += 1
            print (elements)
## Here I get the names of all the samples in file B, and only the names.
## I want the data from file B, but only for the samples in A.
Then I tried using intersection:
#!/usr/bin/env python2
import sys
import csv
import collections
samples = collections.defaultdict(list)
with open('FILEA.txt','r') as f:
    nsamples = [l.strip() for l in f if l.strip()]
print (nsamples)
with open ('FILEB','r') as inF:
    for row in inF:
        elements = row.split()
        if nsamples.intersection(elements):
            print(row[0,:])
It still doesn't work.
What do I have to do to get the output data as follows:
name description etc .....
sample_A mm 0.001 1.2 0.8 1.4 etc
sample_XA hu 0.4 0.021 0.14 2.34 etc
sample_12754 mm 0.1 0.1 0.87 0.54 etc
samples_75t mm 0.3 0.02 0.14 2.34 etc
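Any ideas would be greatly appreciated. Thanks.
Create a set of the lines from filea, then split each line of fileb once and check whether its first element is in that set: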
with open("filea") as f, open("fileb") as f2:
    # make a set of lines, stripping newlines
    # so we can compare properly later, i.e. foo\n != foo
    st = set(map(str.rstrip, f))  # use itertools.imap on Python 2
    for line in f2:
        # skip blank lines; split once and compare only the first element
        if line.strip() and line.split(None, 1)[0] in st:
            print(line.rstrip())
Output:
sample_A mm 0.001 1.2 0.8 1.4 etc
sample_XA hu 0.4 0.021 0.14 2.34 etc
sample_12754 mm 0.1 0.1 0.87 0.54 etc
samples_75t mm 0.3 0.02 0.14 2.34 etc
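As for why the two attempts in the question fail: a list has no .intersection method (only sets do), and row[0,:] is not valid indexing on a string. Note also that the output above drops the header line, because "name" is not in file A, while the desired output in the question keeps it. Below is a minimal sketch of one way to keep it, assuming Python 3 and assuming the first non-blank line of file B is always the header. The function name filter_rows and the keep_header parameter are made up for illustration, and the in-memory io.StringIO objects stand in for the real files.

```python
import io


def filter_rows(names_file, data_file, keep_header=True):
    """Yield lines of data_file whose first field appears in names_file."""
    # Build a set for fast membership tests; ignore blank lines.
    wanted = {line.strip() for line in names_file if line.strip()}
    header_pending = keep_header
    for line in data_file:
        fields = line.split(None, 1)  # split once; only the name is needed
        if not fields:
            continue  # skip blank lines
        # Assumption: the first non-blank line of the data file is the header.
        if header_pending or fields[0] in wanted:
            yield line.rstrip()
        header_pending = False


# Stand-ins for the two files from the question.
file_a = io.StringIO("sample_A\nsample_XA\nsample_12754\nsamples_75t\n")
file_b = io.StringIO(
    "name description etc\n"
    "sample_JA mm 0.01 0.1 1.2 0.018\n"
    "sample_A mm 0.001 1.2 0.8 1.4\n"
    "sample_XA hu 0.4 0.021 0.14 2.34\n"
    "samples_YYYY RN 0.0001 3.435 1.1 0.01\n"
    "sample_12754 mm 0.1 0.1 0.87 0.54\n"
    "sample_2248333 hu 0.43 0.01 0.11 2.32\n"
    "samples_75t mm 0.3 0.02 0.14 2.34\n"
)

result = list(filter_rows(file_a, file_b))
for row in result:
    print(row)
```

With real files, the same function can be fed two handles from with open("filea") as a, open("fileb") as b. Matching is exact and case-sensitive, so a name spelled sample_75t in one file and samples_75t in the other will not match.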
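@user5511186 If you have found a solution that works for you, don't forget to click the grey check mark to the left of the answer to mark it as accepted. Thanks.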