
Python: VLOOKUP with a %LIKE% match

I'm new to Python and I'm trying to join two delimiter-separated CSV files.

CSV1
Sender;Recipient
Adam;123
Alex;234
John;123
Adam;888

CSV2
Name;Phone
Winnie;123,234,456
Celeste;777,888,999
Expected output:

Sender;Recipient;RecipientName
Adam;123;Winnie
Alex;234;Winnie
John;123;Winnie
Adam;888;Celeste
The phone numbers in CSV2 are comma-separated, so when matching I need to do some kind of search, or %LIKE%.


I know I can use a join to get VLOOKUP-style behaviour, but how do I implement the %LIKE% part?

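Here is some pseudocode and an idea of how to implement it.

I would parse the CSV2 file first. Skip the first line, then parse the name and the phone numbers out of each following line, and maintain a dictionary of the name associated with each phone number.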

numbers_to_names = {}
with open("csv2", "r") as f:
    next(f)  # skip the header row
    for line in f:
        name, phone_numbers = line.strip().split(";")
        for phone_number in phone_numbers.split(","):
            numbers_to_names[phone_number] = name
Then go through CSV1, again skipping the first line, parse out the sender and recipient, and combine them with the dictionary built above.

with open("csv1", "r") as f:
    next(f)  # skip the header row
    for line in f:
        sender, recipient = line.strip().split(";")
        print("%s;%s;%s" % (sender, recipient, numbers_to_names[recipient]))
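The two loops above can also be sketched as one self-contained script using the standard `csv` module. This is only an illustration under assumptions: the inline strings stand in for the real files, the helper names `build_lookup` and `join_rows` are hypothetical, and a recipient with no match gets an empty name instead of raising `KeyError`.

```python
import csv
import io

# Inline stand-ins for the two files (assumption); for real data, pass
# open("csv1", newline="") and open("csv2", newline="") instead.
CSV1 = "Sender;Recipient\nAdam;123\nAlex;234\nJohn;123\nAdam;888\n"
CSV2 = "Name;Phone\nWinnie;123,234,456\nCeleste;777,888,999\n"


def build_lookup(handle):
    """Map every individual phone number to the name that owns it."""
    lookup = {}
    for row in csv.DictReader(handle, delimiter=";"):
        for number in row["Phone"].split(","):
            lookup[number.strip()] = row["Name"]
    return lookup


def join_rows(handle, lookup):
    """Return (sender, recipient, recipient_name) tuples for each CSV1 row."""
    rows = []
    for row in csv.DictReader(handle, delimiter=";"):
        recipient = row["Recipient"]
        # An unknown recipient yields an empty name rather than a KeyError.
        rows.append((row["Sender"], recipient, lookup.get(recipient, "")))
    return rows


lookup = build_lookup(io.StringIO(CSV2))
rows = join_rows(io.StringIO(CSV1), lookup)

print("Sender;Recipient;RecipientName")
for row in rows:
    print(";".join(row))
```

Splitting the phone field into exact tokens, rather than doing a substring search, avoids false hits such as "23" matching "123".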
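1. Use str.split to turn the Phone column into lists.
2. Use str.len to find the length of each list. We will use it to expand the Name column.
3. Concatenate all of those lists into one. Make sure to filter out lists of length zero.
4. Use repeat to expand Name.
5. Build a dictionary whose keys are phone numbers and whose values are names.
6. Create a copy of d1 in which a new column is added using map and the dictionary we built.

A similar solution that builds a Series: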

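import numpy as np
import pandas as pd
from itertools import chain

# split the comma-separated phones into lists; drop rows without phones
df2['Phone'] = df2['Phone'].str.split(',')
df2 = df2.dropna(subset=['Phone'])

# repeat each name once per phone number and index it by that number
s = pd.Series(np.repeat(df2.Name.values, df2.Phone.str.len()),
              index=list(chain.from_iterable(df2.Phone.values)))
s.index = s.index.astype(int)
s.name = 'RecipientName'
print(s)

df1 = df1.join(s, on='Recipient')
print(df1)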
  Sender  Recipient RecipientName
0   Adam        123        Winnie
1   Alex        234        Winnie
2   John        123        Winnie
3   Adam        888       Celeste
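The same lookup can be sketched with `DataFrame.explode` followed by a left merge. This is an alternative to the `np.repeat` construction, not the answer's own method, and it assumes pandas 0.25 or newer (where `explode` exists). The inline strings stand in for the real files, and the names `phones` and `result` are illustrative.

```python
from io import StringIO

import pandas as pd

# Inline stand-ins for the two CSV files (assumption).
df1 = pd.read_csv(StringIO("Sender;Recipient\nAdam;123\nAlex;234\nJohn;123\nAdam;888"), sep=";")
df2 = pd.read_csv(StringIO("Name;Phone\nWinnie;123,234,456\nCeleste;777,888,999"), sep=";")

# One row per (name, phone) pair: split the phone string, then explode the lists.
phones = (
    df2.dropna(subset=["Phone"])
       .assign(Phone=lambda d: d["Phone"].str.split(","))
       .explode("Phone")
       .rename(columns={"Name": "RecipientName", "Phone": "Recipient"})
)
# Match the integer dtype that read_csv inferred for df1["Recipient"].
phones["Recipient"] = phones["Recipient"].astype("int64")

# A left merge keeps every row of df1, even if no phone matches.
result = df1.merge(phones, on="Recipient", how="left")
print(result)
```

With `how="left"`, a recipient that appears in no phone list keeps its row and gets NaN in `RecipientName`.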
Edit:

My sample data:

import pandas as pd
from io import StringIO

temp=u"""
Sender;Recipient
Adam;123
Alex;234
John;123
Adam;888"""
#after testing replace 'StringIO(temp)' with 'filename.csv'
df1 = pd.read_csv(StringIO(temp), sep=";")
print (df1)
  Sender  Recipient
0   Adam        123
1   Alex        234
2   John        123
3   Adam        888

temp=u"""
Name;Phone
Winnie;123,234,456
Celeste;777,888,999"""
#after testing replace 'StringIO(temp)' with 'filename.csv'
df2 = pd.read_csv(StringIO(temp), sep=";")
print (df2)
      Name        Phone
0   Winnie  123,234,456
1  Celeste  777,888,999

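Hi, thank you very much. Would you mind explaining the code a little so that I can learn from it too? I am also stuck on the concatenate step. It says zero-dimensional arrays cannot be concatenated.

@chongzixin See if this helps.

How do you import the csv? What should I do if I want to apply this to my own data set? It fails at np.repeat(df2.Name.values, lens) and tells me it cannot cast array data from dtype('O') to dtype('int64') according to the rule 'safe'. lens is dtype: object, but I don't know how to convert it.

Hmm, does it work if you use lens = lens[lens.astype(bool)]?

No, when I try to print lens it is still dtype: object. Should it be converted to an array?

Are you using df1 = pd.read_csv('CSV1', sep=';') and df2 = pd.read_csv('CSV2', sep=';')? If so and the problem persists, what does df1.info() show?
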
#map values to new column
df1['RecipientName'] = df1['Recipient'].map(s)
print(df1)
  Sender  Recipient RecipientName
0   Adam        123        Winnie
1   Alex        234        Winnie
2   John        123        Winnie
3   Adam        888       Celeste
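
One thing worth checking after `map` is whether any recipient had no entry in the lookup Series, since `map` leaves NaN for those rows. A minimal sketch, using a small made-up lookup and a made-up sender "Zoe" purely for illustration:

```python
import pandas as pd

# Hypothetical lookup Series: phone number -> name.
lookup = pd.Series({123: "Winnie", 888: "Celeste"}, name="RecipientName")

# Hypothetical data: 555 is deliberately absent from the lookup.
df = pd.DataFrame({"Sender": ["Adam", "Zoe"], "Recipient": [123, 555]})

df["RecipientName"] = df["Recipient"].map(lookup)

# Rows whose recipient was not found end up as NaN.
unmatched = df[df["RecipientName"].isna()]

# Optionally replace the missing names with a placeholder.
df["RecipientName"] = df["RecipientName"].fillna("unknown")
print(df)
```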

#write to csv
df1.to_csv('out.csv', sep=';', index=False)

Sender;Recipient;RecipientName
Adam;123;Winnie
Alex;234;Winnie
John;123;Winnie
Adam;888;Celeste