Python: How to concatenate multiple text fields in a dataframe


How can I combine the unique values of several text columns in a data frame into a single column? For example:

import pandas as pd

data = [[1,"US","California","Los Angeles"],
        [1,"US","California","San Francisco"],
        [1,"US","California","San Diego"],
        [1,"US","Texas","Austin"],
        [2,"IND","Maharashtra","Mumbai"],
        [2,"IND","Maharashtra","Pune"],
        [2,"IND","Maharashtra","Nagpur"]]

df = pd.DataFrame(data, columns = ['Country_Id', 'Country','State','Place'])
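From the data frame above, how can I produce an output whose first field is Country_Id and whose second field is a text field containing the unique values of Country, State, and Place?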

For example:

    1, US California Texas Los Angeles San Francisco San Diego Austin
    2, IND Maharashtra Mumbai Pune Nagpur

Please ignore the meaning of the combined text field.

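Use groupby, apply, and unique, with a double join over a generator expression: the inner join concatenates the unique values of each column, and the outer join concatenates the per-column results.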

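# Select the text columns explicitly so the integer Country_Id is never passed to str.join
(df.groupby('Country_Id')[['Country', 'State', 'Place']]
   .apply(lambda x: ' '.join(' '.join(x[col].unique()) for col in x))
   .to_frame('Country-State-Place'))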


Out[434]:
                                                       Country-State-Place
Country_Id
1           US California Texas Los Angeles San Francisco San Diego Austin
2           IND Maharashtra Mumbai Pune Nagpur
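
As a possible alternative, the same result can be built in two steps instead of a nested join. The sketch below is self-contained and assumes the sample data from the question; the names text_cols, per_col, and result are illustrative and not part of the original answer. It relies on Series.unique returning values in order of first appearance.

```python
import pandas as pd

data = [[1, "US", "California", "Los Angeles"],
        [1, "US", "California", "San Francisco"],
        [1, "US", "California", "San Diego"],
        [1, "US", "Texas", "Austin"],
        [2, "IND", "Maharashtra", "Mumbai"],
        [2, "IND", "Maharashtra", "Pune"],
        [2, "IND", "Maharashtra", "Nagpur"]]
df = pd.DataFrame(data, columns=['Country_Id', 'Country', 'State', 'Place'])

# Hypothetical helper name: the text columns whose unique values are combined
text_cols = ['Country', 'State', 'Place']

# Step 1: per group and per column, join the unique values
# (unique() keeps the order of first appearance)
per_col = df.groupby('Country_Id')[text_cols].agg(lambda s: ' '.join(s.unique()))

# Step 2: join the per-column strings across each row into one text field
result = per_col.apply(' '.join, axis=1).to_frame('Country-State-Place')

print(result)
```

Keeping the intermediate per_col frame can make it easier to inspect each column's unique values before they are merged into one string.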