Python: remove the lists from a DataFrame

I have the following DataFrame:

Index   Recipe_ID   order   content
0       1285        1       Heat oil in a large frypan with lid over mediu...
1       1285        2       Meanwhile, add cauliflower to a pot of boiling...
2       1285        3       Remove lid from chicken and let simmer uncover... 
3       1289        1       To make the dressing, whisk oil, vinegar and m...
4       1289        2       Cook potatoes in a large saucepan of boiling w..
Task: I need to get the content of each recipe into a single cell:

df = df.groupby('recipe_variation_part_id', as_index=False).agg(lambda x: x.tolist())
This returns the following:

Index   Recipe_ID   order         content
0       1285        [1, 2, 3]     [Heat oil in a large frypan with lid over medi...
1       1289        [1, 2, 3]     [To make the dressing, whisk oil, vinegar and ...
2       1297        [1, 2, 4, 3]  [Place egg in saucepan of cold water and bring...
3       1301        [1, 2]        [Preheat a non-stick frying pan and pan fry th...
4       1309        [2, 3, 4, 1]  [Meanwhile, cook noodles according to package ...
If you look at the first recipe entry, you get the following:

['Heat oil in a large frypan with lid over medium-high heat. Cook onions, garlic and rosemary for a couple of minutes until soft. Add chicken and brown on both sides for a few minutes, then add in tomatoes and olives. Season with salt and pepper and allow to simmer with lid on for 20-25 minutes. ',
 'Meanwhile, add cauliflower to a pot of boiling water and cook for 10 minutes or until soft. Drain and then mash and gently fold in olive oil, parmesan, salt and pepper. ',
 'Remove lid from chicken and let simmer uncovered for five minutes more. Sprinkle with parsley then serve with cauliflower mash. ']
This is what I want, but I need to get rid of the square brackets.

dtype = list

I tried:

df.applymap(lambda x: x[0] if isinstance(x, list) else x)
This returns only the first entry, not every step.

I tried:

df['content'].str.replace(']', '')
This returns only NaN.

I tried:

df['content'].str.replace(r'(\[\[(?:[^\]|]*\|)?([^\]|]*)\]\])', '')
This returns only NaN.

I tried:

df['content'].str.get(0)
This returns only the first entry.

Any help would be greatly appreciated.


If you need any further information, please let me know.

I created a small example that should solve this for you:

import pandas as pd
df = pd.DataFrame({'order': [1, 1, 2], 'content': ['hello', 'world', 'sof']})
df
Out[4]: 
   order content
0      1   hello
1      1   world
2      2     sof
df.groupby(by=['order']).agg(lambda x: ' '.join(x))
Out[5]: 
           content
order             
1      hello world
2              sof

So, in the groupby call from your question, use ' '.join(x) instead of tolist(). This puts everything into one big string rather than a list of strings, so there are no [].
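
Applied to a frame shaped like the one in the question, the same idea might look like the sketch below. The sample rows and the names merged and as_lists are made up for illustration, and the frame is sorted by order first on the assumption that the steps should be joined in sequence. The last line shows an alternative for a column that already holds lists: Series.str.join joins each list with the given separator.

```python
import pandas as pd

# Hypothetical rows shaped like the question's table (deliberately out of order)
df = pd.DataFrame({
    "Recipe_ID": [1285, 1285, 1285, 1289, 1289],
    "order": [2, 1, 3, 1, 2],
    "content": ["Add cauliflower.", "Heat oil.", "Remove lid.",
                "Whisk dressing.", "Cook potatoes."],
})

# Sort so the steps are joined in sequence, then build one string per recipe
merged = (
    df.sort_values(["Recipe_ID", "order"])
      .groupby("Recipe_ID")["content"]
      .agg(" ".join)
      .reset_index()
)

# If the column already contains lists, join them afterwards instead
as_lists = (
    df.sort_values(["Recipe_ID", "order"])
      .groupby("Recipe_ID")["content"]
      .agg(list)
      .reset_index()
)
as_lists["content"] = as_lists["content"].str.join(" ")
```

Both variants leave a plain string in each content cell, so no brackets appear when the frame is printed or exported.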

This works perfectly! Thank you very much. I will upvote shortly.