Python 3.x: remove commas and unlist a DataFrame
Background
I have the following sample df:
import pandas as pd
df = pd.DataFrame({'Before' : [['there, are, many, different'],
['i, like, a, lot, of, sports '],
['the, middle, east, has, many']],
'After' : [['in, the, bright, blue, box'],
['because, they, go, really, fast'],
['to, ride, and, have, fun'] ],
'P_ID': [1,2,3],
'Word' : ['crayons', 'cars', 'camels'],
'N_ID' : ['A1', 'A2', 'A3']
})
Output
   After                               Before                           N_ID  P_ID  Word
0  [in, the, bright, blue, box]        [there, are, many, different]    A1    1     crayons
1  [because, they, go, really, fast]   [i, like, a, lot, of, sports ]   A2    2     cars
2  [to, ride, and, have, fun]          [the, middle, east, has, many]   A3    3     camels
Desired output
   After                         Before                     N_ID  P_ID  Word
0  in the bright blue box        there are many different   A1    1     crayons
1  because they go really fast   i like a lot of sports     A2    2     cars
2  to ride and have fun          the middle east has many   A3    3     camels
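Question
How do I get the desired output, in which the cells are 1) unlisted and 2) free of commas?
I tried, but to no avail.

As you confirmed, the solution is simple. For a single column: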
df.After.str[0].str.replace(',', '')
Out[2821]:
0 in the bright blue box
1 because they go really fast
2 to ride and have fun
Name: After, dtype: object
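The single-column step above can be reproduced in isolation. The following is a minimal sketch, assuming every cell holds a one-element list whose only item is a comma-separated string; passing regex=False and the final .str.strip() are additions here and not part of the original answer.

```python
import pandas as pd

# Only the 'After' column from the question, for brevity
df = pd.DataFrame({'After': [['in, the, bright, blue, box'],
                             ['because, they, go, really, fast'],
                             ['to, ride, and, have, fun']]})

# .str[0] indexes into each cell, so it returns the first list element
first = df['After'].str[0]

# regex=False treats ',' as a literal character (assumption: added for clarity);
# .str.strip() removes leading and trailing whitespace (also an addition)
cleaned = first.str.replace(',', '', regex=False).str.strip()

print(cleaned.tolist())
```

Because the commas in the sample data are always followed by a space, deleting them leaves single spaces between the words.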
For all columns that contain lists, you need to use apply and assign the result back as follows:
df.loc[:, ['After', 'Before']] = df[['After', 'Before']].apply(lambda x: x.str[0].str.replace(',', ''))
Out[2824]:
   After                         Before                     N_ID  P_ID  Word
0  in the bright blue box        there are many different   A1    1     crayons
1  because they go really fast   i like a lot of sports     A2    2     cars
2  to ride and have fun          the middle east has many   A3    3     camels
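For reference, the multi-column approach can be run end to end on the sample data from the question. This is a sketch rather than the answerer's exact code: the helper name unlist_and_strip_commas is hypothetical, plain column assignment is used in place of df.loc, and the trailing-whitespace strip is an added assumption (the 'sports ' entry has a trailing space).

```python
import pandas as pd

df = pd.DataFrame({
    'Before': [['there, are, many, different'],
               ['i, like, a, lot, of, sports '],
               ['the, middle, east, has, many']],
    'After': [['in, the, bright, blue, box'],
              ['because, they, go, really, fast'],
              ['to, ride, and, have, fun']],
    'P_ID': [1, 2, 3],
    'Word': ['crayons', 'cars', 'camels'],
    'N_ID': ['A1', 'A2', 'A3'],
})

list_cols = ['After', 'Before']

def unlist_and_strip_commas(col):
    # Hypothetical helper: take the first list element of every cell,
    # delete literal commas, then trim whitespace at both ends
    return col.str[0].str.replace(',', '', regex=False).str.strip()

# DataFrame.apply passes each selected column to the helper as a Series
df[list_cols] = df[list_cols].apply(unlist_and_strip_commas)

print(df)
```

The other columns (P_ID, Word, N_ID) are left untouched because only the list-valued columns are selected before apply is called.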
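Does every cell hold exactly one list that contains a single comma-separated string?
Yes, that is correct, but only for the After and Before columns.
In that case, it is simpler than the one you linked. Check my answer.
Ha! The first 10 rows of the After column are already single strings. There are no lists at all. Without more data that correctly represents what you describe, I cannot make sense of it. Could you edit the post and add more samples from your real data, such as the output of df[['After','Before']]?
I just checked the values.
You already got some good answers on that question. Those answers should work for you!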
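The comments suggest that the real data may mix list cells with cells that are already plain strings. A hedged sketch for that situation follows; the helper flatten_cell and the sample Series are hypothetical, and unlike the answer above it replaces each comma with a space and then collapses repeated whitespace.

```python
import pandas as pd

def flatten_cell(cell):
    # Hypothetical helper: join list or tuple cells into a single string
    if isinstance(cell, (list, tuple)):
        cell = ' '.join(str(part) for part in cell)
    # Replace commas with spaces, then collapse runs of whitespace
    if isinstance(cell, str):
        return ' '.join(cell.replace(',', ' ').split())
    # Any other type is returned unchanged
    return cell

# Assumed sample: a list cell, a plain string cell, and a multi-item list
mixed = pd.Series([['in, the, bright, blue, box'],
                   'because, they, go, really, fast',
                   ['to', 'ride, and', 'have, fun']], dtype=object)

result = mixed.map(flatten_cell)
print(result.tolist())
```

Replacing commas with spaces keeps words apart even when a comma is not followed by a space, at the cost of differing slightly from a plain comma deletion.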