Python: Series containing arrays (pandas)

python pandas

I have a pandas DataFrame column that looks a bit like this:

Out[67]:
0      ["cheese", "milk...
1      ["yogurt", "cheese...
2      ["cheese", "cream"...
3      ["milk", "cheese"...

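Now, I ultimately want this to be one simple list, but while trying to flatten it I noticed that pandas treats ["cheese", "milk", "cream"] as a str rather than a list.

How would I flatten this so that I end up with: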

["cheese", "milk", "yogurt", "cheese", "cheese"...]

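[Edit] So the answer given below seems to be:

s = pd.Series(["['cheese', 'milk']", "['yogurt', 'cheese']", "['cheese', 'cream']"])

Which is great: question answered, answer accepted. But it strikes me as a rather inelegant solution.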

To convert the column values from str to list, you can use

df.columnName.tolist()

and to flatten, you can use

df.columnName.values.flatten()

and then flatten the resulting nested list.

Edit:

You can try:

import pandas as pd

s = pd.Series(["['cheese', 'milk']", "['yogurt', 'cheese']", "['cheese', 'cream']"])

# remove the surrounding [ and ]
s = s.str.strip('[]')
print(s)
# 0      'cheese', 'milk'
# 1    'yogurt', 'cheese'
# 2     'cheese', 'cream'
# dtype: object

# split each string on commas into separate columns
df = s.str.split(',', expand=True)
# remove the quote characters and strip surrounding whitespace
# (on pandas >= 2.1, DataFrame.map is the non-deprecated name for applymap)
df = df.applymap(lambda x: x.replace("'", '').strip())
print(df)
#         0       1
# 0  cheese    milk
# 1  yogurt  cheese
# 2  cheese   cream

# flatten the 2-D array of values row by row
l = df.values.flatten()
print(l.tolist())
# ['cheese', 'milk', 'yogurt', 'cheese', 'cheese', 'cream']

You can convert the Series to a DataFrame and then call stack:

s.apply(pd.Series).stack().tolist()
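
For illustration, here is a minimal sketch of the stack approach. It assumes each element of the Series is a real Python list rather than the string form of one; the names s2, wide and flat are only illustrative.

```python
import pandas as pd

# Assumption: every element is an actual list, not a str that looks like one.
s2 = pd.Series([["cheese", "milk"], ["yogurt", "cheese"], ["cheese", "cream"]])

# apply(pd.Series) spreads each list across columns, giving a 3x2 DataFrame.
wide = s2.apply(pd.Series)

# stack() folds the columns back into one long Series, row by row.
flat = wide.stack().tolist()
print(flat)
```

If the elements are strings, as in the question, each one would need to be parsed into a list first.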

Possible duplicate of ...

No, it is not a duplicate, because the type of the column is string, not list.

I think

df.values.a.flatten()

should be

df.a.values.flatten()

This just prints every individual letter for me:

s = pd.Series(["['cheese', 'milk']", "['yogurt', 'cheese']", "['cheese', 'cream']"])

l = s.values.flatten()

print([item for sublist in l for item in sublist])
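
The likely reason can be sketched as follows: each element is a str, so the inner loop of the comprehension walks over characters instead of list items. This is a small check, not part of the original answers, and it uses s.tolist() in place of s.values.flatten() for simplicity.

```python
import pandas as pd

s = pd.Series(["['cheese', 'milk']", "['yogurt', 'cheese']"])

# Each element is a plain Python str that merely looks like a list.
first = s.tolist()[0]
print(type(first))

# Iterating over a str yields its characters, so "flattening" gives letters.
chars = [item for sublist in s.tolist() for item in sublist]
print(chars[:4])
```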

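I can't deny that it works, so thank you for your help. I'm a little surprised that the answer is so clunky, though, given that it returns a list of strings containing ['milk', cheese]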

s = pd.Series(["['cheese', 'milk']", "['yogurt', 'cheese']", "['cheese', 'cream']"])

s.apply(pd.Series).stack().tolist()

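From the original description, I assumed the Series held lists of strings:

s2 = pd.Series([['cheese', 'milk'], ['yogurt', 'cheese'], ['cheese', 'cream']])

in which case

s2.apply(pd.Series).stack().tolist()

should work. If the Series instead holds strings that represent lists of strings, you can add eval:

s.apply(lambda x: pd.Series(eval(x))).stack().tolist()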