Python: searching file2.txt for strings listed in file1.txt
file1.txt contains user names, e.g.
tony
peter
john
...
file2.txt contains user details, exactly one line per user, e.g.
alice 20160102 1101 abc
john 20120212 1110 zjc9
mary 20140405 0100 few3
peter 20140405 0001 io90
tango 19090114 0011 n4-8
tony 20150405 1001 ewdf
zoe 20000211 0111 jn09
...
I want to use the user names in file1.txt to pull a shortlist of user details out of file2.txt, i.e.
john 20120212 1110 zjc9
peter 20140405 0001 io90
tony 20150405 1001 ewdf
How can I do this in Python?
You can use pandas:
import pandas as pd
# file1.txt: one user name per line
df1 = pd.read_csv('file1.txt', header=None)
df1[0] = df1[0].str.strip()  # remove stray whitespace around each name
# file2.txt: whitespace-separated fields; dtype=str keeps leading zeros such as '0001'
df2 = pd.read_csv('file2.txt', sep=r'\s+', header=None, dtype=str)
df = df1.merge(df2)  # inner join on the shared column 0 (the user name)
Out[26]:
0 1 2 3
0 tony 20150405 1001 ewdf
1 peter 20140405 0001 io90
2 john 20120212 1110 zjc9
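If pandas is not available, the same inner join can be sketched with a plain dictionary. This is only a minimal sketch, not the answer's method: the helper name `join_users` and the in-memory sample lists are hypothetical, and it assumes the user name is the first whitespace-separated field of each detail line. Like the merge, it returns rows in file1.txt order.

```python
def join_users(names, detail_lines):
    """Return the detail line for each name, in the order the names are given."""
    # Map user name -> full detail line (hypothetical helper, not from the answer).
    details = {}
    for line in detail_lines:
        fields = line.split()
        if fields:  # skip blank lines
            details[fields[0]] = line.strip()
    # Names without a matching detail line are dropped, as in an inner join.
    return [details[name.strip()] for name in names if name.strip() in details]


# Sample data copied from the question.
names = ["tony", "peter", "john"]
detail_lines = [
    "alice 20160102 1101 abc",
    "john 20120212 1110 zjc9",
    "mary 20140405 0100 few3",
    "peter 20140405 0001 io90",
    "tango 19090114 0011 n4-8",
    "tony 20150405 1001 ewdf",
    "zoe 20000211 0111 jn09",
]
for row in join_users(names, detail_lines):
    print(row)
```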
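Alternatively, filter file2.txt with isin:
import pandas as pd
# dtype=str keeps every field as text, so leading zeros such as '0001' are preserved
file1 = pd.read_csv('file1.txt', sep=' ', header=None, dtype=str)
file2 = pd.read_csv('file2.txt', sep=' ', header=None, dtype=str)
# keep the rows of file2 whose first column is one of the names in file1
shortlist = file2.loc[file2[0].isin(file1[0])]
It gives you the following result:
0 1 2 3
1 john 20120212 1110 zjc9
3 peter 20140405 0001 io90
5 tony 20150405 1001 ewdf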
The result above is a DataFrame; to convert it back to an array, just use shortlist.values.
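The same filter can also be sketched with the standard library alone, which keeps the rows in file2.txt order and never parses the fields as numbers, so a value such as 0001 stays as written. The function name `filter_details` is hypothetical, and the `io.StringIO` objects are in-memory stand-ins for the two files, used here only so the example is self-contained.

```python
import io


def filter_details(name_file, detail_file):
    """Yield detail lines whose first field is a wanted name, in detail-file order."""
    # A set gives constant-time membership tests for the wanted names.
    wanted = {line.strip() for line in name_file if line.strip()}
    for line in detail_file:
        fields = line.split(maxsplit=1)
        if fields and fields[0] in wanted:
            yield line.rstrip("\n")


# In-memory stand-ins for file1.txt and file2.txt (sample data from the question).
name_file = io.StringIO("tony\npeter\njohn\n")
detail_file = io.StringIO(
    "alice 20160102 1101 abc\n"
    "john 20120212 1110 zjc9\n"
    "mary 20140405 0100 few3\n"
    "peter 20140405 0001 io90\n"
    "tango 19090114 0011 n4-8\n"
    "tony 20150405 1001 ewdf\n"
    "zoe 20000211 0111 jn09\n"
)
shortlist = list(filter_details(name_file, detail_file))
print("\n".join(shortlist))
```

With real files, the two arguments would be the objects returned by open("file1.txt") and open("file2.txt").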
You can use .split(" "), assuming there is always a space between the name and the rest of the information in file2.txt.
Here is an example:
UserList = []
with open("file1.txt", "r") as fuser:
    UserLine = fuser.readline()
    while UserLine != '':
        UserList.append(UserLine.split("\n")[0])  # Separate the user name from the trailing newline in the text file.
        UserLine = fuser.readline()
InfoUserList = []
InfoList = []
with open("file2.txt", "r") as finfo:
    InfoLine = finfo.readline()
    while InfoLine != '':
        InfoList.append(InfoLine)
        line1 = InfoLine.split(' ')
        InfoUserList.append(line1[0])  # Take just the user name to compare it later
        InfoLine = finfo.readline()
for user in UserList:
    for i in range(len(InfoUserList)):
        if user == InfoUserList[i]:
            print(InfoList[i], end='')  # Python 3 print; the line already ends with a newline
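The `split(' ')` call relies on exactly one space, and no leading whitespace, before the first field. If that assumption might not hold, `split()` with no argument is a more forgiving choice, since it splits on any run of whitespace. A small sketch of the difference follows; the helper `first_field` and the irregular sample lines are hypothetical and not taken from the question's files.

```python
def first_field(line):
    """Return the first whitespace-separated field of a line, or '' if it is blank."""
    fields = line.split()  # no argument: split on any run of spaces, tabs, or newlines
    return fields[0] if fields else ""


# Hypothetical lines with irregular spacing.
samples = ["john 20120212 1110 zjc9\n", "  peter\t20140405 0001 io90\n", "\n"]
print([first_field(s) for s in samples])

# With split(' '), leading spaces produce empty strings as the first fields.
print("  peter 1".split(" "))
```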