Python: Reshaping a DataFrame with a column that contains lists
Suppose I have a DataFrame that looks like this:
import pandas as pd
data = [{"Name" : "Project A", "Feedback" : ['we should do x', 'went well']},
{"Name" : "Project B", "Feedback" : ['eat pop tarts', 'boo']},
{"Name" : "Project C", "Feedback" : ['bar', 'baz']}
]
df = pd.DataFrame(data)
df = df[['Name','Feedback']]
df
Name Feedback
0 Project A ['we should do x', 'went well']
1 Project B ['eat pop tarts', 'boo']
2 Project C ['bar', 'baz']
What I want to do is reshape the DataFrame so that Name is the key and each element of the list in the Feedback column becomes its own row, like this:
Name Feedback
0 Project A 'we should do x'
1 Project A 'went well'
2 Project B 'eat pop tarts'
3 Project B 'boo'
4 Project C 'bar'
5 Project C 'baz'
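What is an efficient way to do this?
One option is to rebuild the DataFrame by flattening the Feedback column and repeating the Name column.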
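The flatten-and-repeat idea can be sketched roughly as follows. This is one possible reading of that option, not necessarily the answerer's exact code: the variable names `lengths` and `flat` are illustrative, and the list comprehension is simply one way to flatten the nested lists.

```python
import pandas as pd

data = [{"Name": "Project A", "Feedback": ['we should do x', 'went well']},
        {"Name": "Project B", "Feedback": ['eat pop tarts', 'boo']},
        {"Name": "Project C", "Feedback": ['bar', 'baz']}]
df = pd.DataFrame(data)[['Name', 'Feedback']]

# Number of feedback items per row (hypothetical helper name).
lengths = df['Feedback'].map(len)

flat = pd.DataFrame({
    # Repeat each Name once per item in its Feedback list.
    'Name': df['Name'].repeat(lengths).to_numpy(),
    # Flatten the lists into one sequence, preserving row order.
    'Feedback': [item for items in df['Feedback'] for item in items],
})
print(flat)
```

Because the repeat counts come from the actual list lengths, this sketch should also cope with lists of differing lengths.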
Here is another approach:
# Separate out values (NOTE- this assumes you'll always have two strings in list)
df['pos_0'] = df['Feedback'].str[0]
df['pos_1'] = df['Feedback'].str[1]
df
Name Feedback pos_0 pos_1
0 Project A [we should do x, went well] we should do x went well
1 Project B [eat pop tarts, boo] eat pop tarts boo
2 Project C [bar, baz] bar baz
Desired output:
pd.melt(df, 'Name', ['pos_0', 'pos_1'], 'Feedback').drop('Feedback', axis=1)
Name value
0 Project A we should do x
1 Project B eat pop tarts
2 Project C bar
3 Project A went well
4 Project B boo
5 Project C baz
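The melt approach above relies on every list having exactly two strings, as its comment notes. If that assumption does not hold, one alternative is `DataFrame.explode`, which is available in pandas 0.25 and later. The sketch below is not part of the original answers; the `ragged` frame and its extra `'qux'` item are made up purely to illustrate lists of uneven length.

```python
import pandas as pd

# Hypothetical data with lists of different lengths.
ragged = pd.DataFrame({
    "Name": ["Project A", "Project B", "Project C"],
    "Feedback": [['we should do x', 'went well'],
                 ['eat pop tarts'],
                 ['bar', 'baz', 'qux']],
})

# explode() emits one row per list element and repeats the other columns;
# reset_index() replaces the duplicated original index with 0..n-1.
long_df = ragged.explode("Feedback").reset_index(drop=True)
print(long_df)
```

Unlike the melt version, this keeps the rows grouped by Name rather than by list position.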
Ah!! Thanks @Psidom... I hadn't thought of a list comprehension. My initial attempts with stack/unstack went wrong.
Your Feedback column holds lists, which is a bit irregular, so the stack/unstack approach won't help much here.
This may also be a good read: