Python: groupby and join a text column
I have a CSV file whose header is:
text | business_id
I want to group together all of the text that belongs to one business.
I used review_data = review_data.groupby(['business_id'])['text'].apply(" ".join)
review_data looks like this:
text \
0 mr hoagi institut walk doe seem like throwback...
1 excel food superb custom servic miss mario mac...
2 yes place littl date open weekend staff alway ...
business_id
0 5UmKMjUEUNdYWqANhGckJw
1 5UmKMjUEUNdYWqANhGckJw
2 5UmKMjUEUNdYWqANhGckJw
I get this error: TypeError: sequence item 131: expected string, float found
These are lines 130 to 132:
130 use order fair often past 2 year food get progress wors everi time order doesnt help owner alway regist rude everi time final decid im done dont think feel let inconveni order food restaur let alon one food isnt even good also insid dirti heck deliv food bmw cant buy scrub brush found golden dragon collier squar 100 time better|SQ0j7bgSTazkVQlF5AnqyQ
131 popular denni|wqu7ILomIOPSduRwoWp4AQ
132 want smth quick late night would say denni|wqu7ILomIOPSduRwoWp4AQ
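I think you need to filter out the rows whose text is missing, using notnull, before the groupby. A missing value is stored as NaN, which is a float, and str.join only accepts strings: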
print(review_data)
text business_id
0 mr hoagi 5UmKMjUEUNdYWqANhGckJw
1 excel food 5UmKMjUEUNdYWqANhGckJw
2 NaN 5UmKMjUEUNdYWqANhGckJw
3 yes place 5UmKMjUEUNdYWqANhGckJw
review_data = review_data[review_data['text'].notnull()]
print(review_data)
text business_id
0 mr hoagi 5UmKMjUEUNdYWqANhGckJw
1 excel food 5UmKMjUEUNdYWqANhGckJw
3 yes place 5UmKMjUEUNdYWqANhGckJw
review_data = review_data.groupby(['business_id'])['text'].apply(" ".join)
print(review_data)
business_id
5UmKMjUEUNdYWqANhGckJw mr hoagi excel food yes place
Name: text, dtype: object
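For anyone who wants to run the steps above end to end, here is a minimal self-contained sketch in Python 3. The sample frame is made up to mirror the printed output, the names clean and joined are illustrative, and the " " separator is an assumption chosen to match the result shown.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data mirroring the frame printed above.
review_data = pd.DataFrame({
    "text": ["mr hoagi", "excel food", np.nan, "yes place"],
    "business_id": ["5UmKMjUEUNdYWqANhGckJw"] * 4,
})

# Drop rows whose text is missing; NaN is a float, which str.join rejects.
clean = review_data[review_data["text"].notnull()]

# Concatenate all review texts per business, separated by a space.
joined = clean.groupby("business_id")["text"].apply(" ".join)
print(joined)
```

Calling review_data.dropna(subset=["text"]) should give the same filtered frame if you prefer that spelling.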
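Does review_data = review_data.groupby(['business_id'])['text'].apply(" ".join) work? It looks like you are joining on the index numbers; is that what you want?
But I still get an error when reading some rows: TypeError: sequence item 131: expected string, float found
That means you have missing data. You need to post sample data that reproduces this error, along with the code.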
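To check whether missing data is really the cause, one option is to list the rows whose text is not a string. This is only a sketch: the frame below is invented for illustration (the business ids "a" and "b" are placeholders), and not_str and bad_rows are hypothetical names.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one row has a missing text value.
review_data = pd.DataFrame({
    "text": ["popular denni", np.nan, "want smth quick"],
    "business_id": ["a", "a", "b"],
})

# Flag every row whose text is not a Python string (NaN is a float).
not_str = ~review_data["text"].map(lambda v: isinstance(v, str))

# These are the rows that would make str.join raise TypeError.
bad_rows = review_data[not_str]
print(bad_rows)
```

If bad_rows is empty on the real data, the error is coming from somewhere else and the filter above will not help.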