Python: concatenating column values in a DataFrame

python pandas

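Hello, I have a DataFrame of values, dataframe1, and I want to turn it into a new DataFrame, dataframe2, by concatenating the values of a column from the original dataframe1, i.e.: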

dataframe1
ProductName  Value otherValue
Product1      2     5
Product2      3     2
Product1      1     5
Product3      4     7
Product3      5     7
Product1      5     5
Product2      9     2

dataframe2
ProductName  Value     otherValue
Product1      2 1 5       5
Product2      3 9         2
Product3      4 5         7

You can group by ProductName and aggregate with ' '.join on Value and first on otherValue:
# Join each product's Value entries with spaces and keep its first otherValue.
result = df.groupby('ProductName', as_index=False).agg(
    {'Value': lambda x: ' '.join(map(str, x)), 'otherValue': 'first'})

print(result)

Output:
  ProductName  Value  otherValue
0    Product1  2 1 5           5
1    Product2    3 9           2
2    Product3    4 5           7
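The snippet above assumes that df already exists. As a minimal self-contained sketch, the following rebuilds dataframe1 by hand from the table in the question (the literal construction is an assumption of this example, not part of the answer) and applies the same groupby/agg call:

```python
import pandas as pd

# dataframe1 from the question, typed in by hand for this sketch.
df = pd.DataFrame({
    'ProductName': ['Product1', 'Product2', 'Product1', 'Product3',
                    'Product3', 'Product1', 'Product2'],
    'Value': [2, 3, 1, 4, 5, 5, 9],
    'otherValue': [5, 2, 5, 7, 7, 5, 2],
})

# map(str, x) converts the integer values to strings before joining them.
result = df.groupby('ProductName', as_index=False).agg({
    'Value': lambda x: ' '.join(map(str, x)),
    'otherValue': 'first',
})
print(result)
```

Within each group the values are joined in their original row order, which is why Product1 becomes 2 1 5.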

Note that this solution assumes the values in the Value column are not strings; if they already are, you can use ' '.join directly.
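For the string case mentioned in the note, a small sketch might look like this. The three-row frame is made up for illustration and is assumed to hold Value as strings already, so the bound method ' '.join can be passed as the aggregation function without a lambda:

```python
import pandas as pd

# Hypothetical sample in which Value is already of string type.
df = pd.DataFrame({
    'ProductName': ['Product1', 'Product2', 'Product1'],
    'Value': ['2', '3', '1'],
    'otherValue': [5, 2, 5],
})

# ' '.join receives each group's Series of strings and joins them directly.
result = df.groupby('ProductName', as_index=False).agg({
    'Value': ' '.join,
    'otherValue': 'first',
})
print(result)
```

If the column contains anything other than strings, ' '.join raises a TypeError, which is the reason the general version maps str over the values first.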
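You can try this in two lines. The first converts the Value column to strings so that the join can be performed; the second does all the work needed to return the desired output: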
import pandas as pd

df = pd.DataFrame(data={
    'ProductName': ['Product1', 'Product2', 'Product1', 'Product3', 'Product3', 'Product1', 'Product2'],
    'Value': [2, 3, 1, 4, 5, 5, 9], 'otherValue': [5, 2, 5, 7, 7, 5, 2]})
# Line 1: cast Value to str so that ' '.join can be applied to it.
df['Value'] = df['Value'].astype(str)
# Line 2: join Value per product, merge it back, then drop the old column and the duplicate rows.
df = (df.merge(df.groupby('ProductName')['Value'].apply(' '.join).reset_index(), how='left', on='ProductName')
        .drop('Value_x', axis=1).drop_duplicates().rename(columns={'Value_y': 'Value'}))

print(df)
Output:
  ProductName  otherValue  Value
0    Product1           5  2 1 5
1    Product2           2    3 9
3    Product3           7    4 5
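The merge-based approach leaves the joined Value column last and keeps the original row labels of the surviving rows. If the exact layout of dataframe2 matters, one possible follow-up is sketched below. It is a variation rather than the answer's own code: the names joined and tidy are hypothetical, and the original Value column is dropped before merging so that no _x/_y suffixes appear.

```python
import pandas as pd

df = pd.DataFrame({
    'ProductName': ['Product1', 'Product2', 'Product1', 'Product3',
                    'Product3', 'Product1', 'Product2'],
    'Value': [2, 3, 1, 4, 5, 5, 9],
    'otherValue': [5, 2, 5, 7, 7, 5, 2],
})
df['Value'] = df['Value'].astype(str)

# One row per product holding its space-joined values.
joined = df.groupby('ProductName')['Value'].apply(' '.join).reset_index()

# Drop the per-row Value first, merge the joined one back, remove repeated rows.
merged = (
    df.drop(columns='Value')
      .merge(joined, on='ProductName', how='left')
      .drop_duplicates()
)

# Restore the column order of dataframe2 and renumber the rows from 0.
tidy = merged[['ProductName', 'Value', 'otherValue']].reset_index(drop=True)
print(tidy)
```

Unlike the groupby/agg version, this keeps products in order of first appearance instead of sorting them by key; for this data the two orders happen to coincide.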
A nice extension of the .agg function.