Python: How to concatenate multiple text fields in a DataFrame

python pandas

How can I combine the unique values of several text columns in a DataFrame into a single column? For example:

import pandas as pd

data = [[1, "US", "California", "Los Angeles"],
        [1, "US", "California", "San Francisco"],
        [1, "US", "California", "San Diego"],
        [1, "US", "Texas", "Austin"],
        [2, "IND", "Maharashtra", "Mumbai"],
        [2, "IND", "Maharashtra", "Pune"],
        [2, "IND", "Maharashtra", "Nagpur"]]

df = pd.DataFrame(data, columns=['Country_Id', 'Country', 'State', 'Place'])

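From the DataFrame above, how can I produce output whose first field is Country_Id and whose second field is a text field containing the unique values of Country, State, and Place?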

For example:

IND Maharashtra Mumbai Pune Nagpur

Please ignore the meaning of the combined text field.

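Use groupby and apply, with a double join over unique inside a generator expression. The inner join concatenates the unique values of each column, and the outer join concatenates the resulting per-column strings: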

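# Select the text columns explicitly so the integer key column never reaches str.join
(df.groupby('Country_Id')[['Country', 'State', 'Place']]
   .apply(lambda x: ' '.join(' '.join(x[col].unique()) for col in x))
   .to_frame('Country-State-Place'))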


Out[434]:
                                                       Country-State-Place
Country_Id
1           US California Texas Los Angeles San Francisco San Diego Austin
2           IND Maharashtra Mumbai Pune Nagpur
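
To make the double join easier to follow, here is a minimal sketch that unrolls the one-liner into a named function. The helper name join_unique and the text_cols list are introduced here for illustration only and are not part of the original answer. It assumes every selected column holds strings.

```python
import pandas as pd

data = [[1, "US", "California", "Los Angeles"],
        [1, "US", "California", "San Francisco"],
        [1, "US", "California", "San Diego"],
        [1, "US", "Texas", "Austin"],
        [2, "IND", "Maharashtra", "Mumbai"],
        [2, "IND", "Maharashtra", "Pune"],
        [2, "IND", "Maharashtra", "Nagpur"]]
df = pd.DataFrame(data, columns=['Country_Id', 'Country', 'State', 'Place'])

# Columns to combine, named explicitly (an assumption of this sketch).
text_cols = ['Country', 'State', 'Place']


def join_unique(group):
    """Hypothetical helper: space-join each column's unique values, then the columns."""
    # Inner join: one string per column, built from that column's unique
    # values in order of first appearance.
    per_column = [' '.join(group[col].unique()) for col in text_cols]
    # Outer join: glue the per-column strings together in column order.
    return ' '.join(per_column)


# Look at one group on its own to see an intermediate piece.
group_1 = df[df['Country_Id'] == 1]
states_1 = list(group_1['State'].unique())

# Apply the helper to every group; the result is a Series indexed by Country_Id.
combined = df.groupby('Country_Id')[text_cols].apply(join_unique)
result = combined.to_frame('Country-State-Place')
print(result)
```

Selecting text_cols before apply keeps the integer Country_Id column out of the string join, which would otherwise raise a TypeError on pandas versions that pass the grouping column to the applied function.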
