Python: retrieving list elements from DataFrame columns
Suppose there is a pandas.DataFrame like:
pd.DataFrame([[np.nan,np.nan],[[1,2],[3,4]],[[11,22],[33,44]]],columns=['A','B'])
What is the simplest way to generate 2 pandas DataFrames, one containing the first element and one containing the second element of every list in the frame (or nan where the cell is nan)?
You can do this with a function that returns the n-th element of the list in each column of a row.
Code:
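import numpy as np
import pandas as pd

def row_element(elem_num):
    # Build a function that picks element `elem_num` from every list in a row
    def func(row):
        ret = []
        for item in row:
            try:
                ret.append(item[elem_num])
            except (TypeError, IndexError):
                # NaN is not subscriptable and a list may be too short: keep the value as-is
                ret.append(item)
        # Return a Series (not a list) so that apply() keeps the column labels
        return pd.Series(ret, index=row.index)
    return func
Test code:
df = pd.DataFrame(
    [[np.nan, np.nan], [[1, 2], [3, 4]], [[11, 22], [33, 44]]],
    columns=['A', 'B'])
print(df)
print(df.apply(row_element(0), axis=1))
print(df.apply(row_element(1), axis=1))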
Results:
          A         B
0       NaN       NaN
1    [1, 2]    [3, 4]
2  [11, 22]  [33, 44]
      A     B
0   NaN   NaN
1   1.0   3.0
2  11.0  33.0
      A     B
0   NaN   NaN
1   2.0   4.0
2  22.0  44.0
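As a variation on the approach above, here is a minimal sketch that works cell by cell instead of row by row. The helper name `nth_element` is made up for illustration. It assumes the cells hold either lists/tuples or NaN and that `n` is a non-negative position. Unlike the function above, it returns NaN (not the original list) when a list is too short.

```python
import numpy as np
import pandas as pd

def nth_element(df, n):
    """Return a frame holding element n of every list cell, NaN elsewhere."""
    def pick(cell):
        # Index only lists/tuples that are long enough to have position n
        if isinstance(cell, (list, tuple)) and len(cell) > n:
            return cell[n]
        return np.nan
    # Go column by column, mapping each cell through pick()
    return df.apply(lambda col: col.map(pick))

df = pd.DataFrame(
    [[np.nan, np.nan], [[1, 2], [3, 4]], [[11, 22], [33, 44]]],
    columns=['A', 'B'])
first = nth_element(df, 0)   # first element of every list
second = nth_element(df, 1)  # second element of every list
```

The explicit type check replaces the try/except, which makes it clearer which cells are passed over.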
You can use:
# replace NaN with [] (a bit of a hack)
df = df.mask(df.isnull(), pd.Series([[]] * len(df.columns), index=df.columns), axis=1)
print(df)
          A         B
0        []        []
1    [1, 2]    [3, 4]
2  [11, 22]  [33, 44]
# create a new DataFrame from each column, then concatenate them
df3 = pd.concat([pd.DataFrame(df[col].values.tolist()) for col in df],
                axis=1,
                keys=df.columns)
print(df3)
      A           B
      0     1     0     1
0   NaN   NaN   NaN   NaN
1   1.0   2.0   3.0   4.0
2  11.0  22.0  33.0  44.0
# select by xs
df1 = df3.xs(0, level=1, axis=1)
print(df1)
      A     B
0   NaN   NaN
1   1.0   3.0
2  11.0  33.0
df2 = df3.xs(1, level=1, axis=1)
print(df2)
      A     B
0   NaN   NaN
1   2.0   4.0
2  22.0  44.0
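The expand-then-select steps above can be wrapped into one helper. The sketch below is one possible way to do it, not part of the original answer. The name `split_by_position` is hypothetical. It swaps the `mask` step for a plain list comprehension that replaces non-list cells with `[]`. It assumes at least one cell in the frame holds a non-empty list.

```python
import numpy as np
import pandas as pd

def split_by_position(df):
    """Return {position: DataFrame} for every list position found in df."""
    # Expand each column of lists into its own block of columns;
    # non-list cells (NaN) become [] and therefore expand to NaN
    expanded = pd.concat(
        [pd.DataFrame([v if isinstance(v, list) else [] for v in df[col]],
                      index=df.index)
         for col in df],
        axis=1,
        keys=df.columns)
    # The second column level holds the list positions (0, 1, ...)
    positions = expanded.columns.get_level_values(1).unique()
    return {pos: expanded.xs(pos, level=1, axis=1) for pos in positions}

df = pd.DataFrame(
    [[np.nan, np.nan], [[1, 2], [3, 4]], [[11, 22], [33, 44]]],
    columns=['A', 'B'])
frames = split_by_position(df)
df1, df2 = frames[0], frames[1]
```

Returning a dict keyed by position avoids hard-coding how many elements the lists contain.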