Python: How to convert a groupby result to a DataFrame
I have the following DataFrame:
import pandas as pd
import numpy as np
df = pd.DataFrame({
'category': ['ctr','ctr','ctr','ctr','ctr','ctr'],
'expected_count': [100,100,112,1.3,14,125],
'sample_id': ['S1','S1','S1','S2','S2','S2'],
'gene_symbol': ['a', 'b', 'c', 'a', 'b', 'c'],
})
This produces:
In [2]: df
Out[2]:
category expected_count gene_symbol sample_id
0 ctr 100.0 a S1
1 ctr 100.0 b S1
2 ctr 112.0 c S1
3 ctr 1.3 a S2
4 ctr 14.0 b S2
5 ctr 125.0 c S2
I can group it by gene symbol:
In [4]: gdf = df.groupby(by = 'gene_symbol')['expected_count'].mean()
...: gdf
...:
Out[4]:
gene_symbol
a 50.65
b 57.00
c 118.50
Name: expected_count, dtype: float64
In [5]: str(gdf)
Out[5]: 'gene_symbol\na 50.65\nb 57.00\nc 118.50\nName: expected_count, dtype: float64'
Note that gdf is a string. How can I convert it to a DataFrame?
You need to pass as_index=False to groupby.
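A minimal sketch of the as_index=False option, using the data from the question. The reset_index() variant is not in the surviving text of the answer; it is included here as an assumption about one common alternative. Both are standard pandas calls.

```python
import pandas as pd

df = pd.DataFrame({
    'category': ['ctr'] * 6,
    'expected_count': [100, 100, 112, 1.3, 14, 125],
    'sample_id': ['S1', 'S1', 'S1', 'S2', 'S2', 'S2'],
    'gene_symbol': ['a', 'b', 'c', 'a', 'b', 'c'],
})

# as_index=False keeps the grouping key as an ordinary column,
# so the aggregation returns a DataFrame instead of a Series.
gdf_flat = df.groupby('gene_symbol', as_index=False)['expected_count'].mean()

# Assumed alternative: aggregate to a Series first, then move the
# gene_symbol index into a regular column with reset_index().
gdf_reset = df.groupby('gene_symbol')['expected_count'].mean().reset_index()

print(type(gdf_flat))
print(gdf_flat.columns.tolist())
```

In both cases gene_symbol ends up as a regular column next to expected_count rather than as the index.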
The output is not a string but a Series:
print (type(df.groupby('gene_symbol')['expected_count'].mean()))
<class 'pandas.core.series.Series'>
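To make the distinction concrete, here is a short sketch. str() builds a new string representation and leaves gdf itself unchanged, which is presumably why the str(gdf) output in the question looked like a string. The trimmed-down df below is only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'expected_count': [100, 100, 112, 1.3, 14, 125],
    'gene_symbol': ['a', 'b', 'c', 'a', 'b', 'c'],
})

gdf = df.groupby('gene_symbol')['expected_count'].mean()

# str() returns a separate string; gdf is not modified by the call.
text = str(gdf)

print(isinstance(text, str))       # True
print(isinstance(gdf, pd.Series))  # True
# The Series carries the group keys as its index and the column name as its name.
print(gdf.index.name, gdf.name)    # gene_symbol expected_count
```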
You can use to_frame():
gdf = df.groupby(by = 'gene_symbol')['expected_count'].mean().to_frame()
gdf
Out[149]:
expected_count
gene_symbol
a 50.65
b 57.00
c 118.50
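As the output shows, to_frame() keeps gene_symbol as the index. If a flat table is preferred, one possible follow-up is sketched here. The column label 'mean_expected_count' is a hypothetical name chosen for illustration, and chaining reset_index() is an assumption about what the asker may want, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'expected_count': [100, 100, 112, 1.3, 14, 125],
    'gene_symbol': ['a', 'b', 'c', 'a', 'b', 'c'],
})

means = df.groupby('gene_symbol')['expected_count'].mean()

# to_frame() keeps gene_symbol as the index; the optional name argument
# renames the single column ('mean_expected_count' is a hypothetical label).
framed = means.to_frame(name='mean_expected_count')

# reset_index() moves gene_symbol out of the index into a regular column.
flat = framed.reset_index()

print(framed.index.name)
print(flat.columns.tolist())
```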