Python: apply a function across categories and save the results to new columns
I am new to Python and am analysing EEG data. I have written the function extract_bands, which computes values for the EEG frequency bands (adapted from an existing example), but I am having trouble applying it across categories and saving the aggregated data to a new dataset.

Here is a simplified dataset, pddf:

import pandas as pd
import numpy as np

simple_df = {'subject': ['s1', 's1', 's1', 's1', 's1', 's1', 's2', 's2', 's2', 's2', 's2', 's2',
                         's3', 's3', 's3', 's3', 's3', 's3', 's4', 's4', 's4', 's4', 's4', 's4'],
             'group': ['a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a',
                       'c', 'c', 'c', 'c', 'c', 'c', 'c', 'c', 'c', 'c', 'c', 'c'],
             'trial': [1, 1, ...],  # the remaining trial values are garbled in the source
             'cond': ['c1', 'c1', 'c1', 'c1', 'c2', 'c2', 'c1', 'c1', 'c2', 'c2', 'c2', 'c2',
                      'c2', 'c2', 'c1', 'c1', 'c1', 'c1', 'c2', 'c2', 'c1', 'c1', 'c2', 'c2'],
             'value': [8.88260935, 82.97797122, 18.26659492, 7.70070742, 12.76417463,
                       68.35936355, 7.59613253, 54.36616722, 9.11860667, 24.20324845,
                       86.1674253, 99.96479613, 40.83798898, 23.72822971, 49.77969641,
                       2.19459866, 30.3883309, 46.75944945, 11.47003917, 26.71771771,
                       88.93251086, 7.29166478, 7.76880683, 40.65701944]
             }
pddf = pd.DataFrame(simple_df, columns=['subject', 'group', 'trial', 'cond', 'value'])

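Here is the function extract_bands:

# Define the sampling frequency
fs = 256
# Define the EEG bands
eeg_bands = {'Delta': (0, 4),
             'Theta': (4, 8),
             'Alpha': (8, 12),
             'Beta': (12, 30),
             'Gamma': (30, 45)}

def extract_bands(data):
    # Magnitude spectrum of the real-valued signal
    fft_vals = np.absolute(np.fft.rfft(data))
    # Frequency in Hz of each FFT bin
    fft_freq = np.fft.rfftfreq(len(data), 1.0 / fs)
    eeg_band_fft = dict()
    for band in eeg_bands:
        freq_ix = np.where((fft_freq >= eeg_bands[band][0]) &
                           (fft_freq <= eeg_bands[band][1]))[0]
        # Mean amplitude within the band (NaN when no bin falls inside it)
        eeg_band_fft[band] = np.mean(fft_vals[freq_ix])
    return eeg_band_fft

The function works on a single trial. With one_trial holding the first two rows of pddf:

one_trial
#>   subject group  trial cond      value
#> 0      s1     a      1   c1   8.882609
#> 1      s1     a      1   c1  82.977971
extract_bands(one_trial.value)
#> {'Delta': 91.86058057, 'Theta': nan, 'Alpha': nan, 'Beta': nan, 'Gamma': nan}
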
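The NaN values for every band except Delta follow from the frequency resolution. With only two samples at fs = 256, the real FFT has bins at 0 Hz and 128 Hz, so no bin falls inside the Theta to Gamma ranges, and the mean of an empty selection is NaN. The sketch below counts the bins per band for two window lengths. The helper name band_bin_counts is hypothetical, and the inclusive upper band edge is an assumption; the band limits and fs are taken from the question.

```python
import numpy as np

fs = 256  # sampling frequency from the question

eeg_bands = {'Delta': (0, 4), 'Theta': (4, 8), 'Alpha': (8, 12),
             'Beta': (12, 30), 'Gamma': (30, 45)}

def band_bin_counts(n_samples):
    """Count how many rfft bins fall inside each band (hypothetical helper)."""
    freqs = np.fft.rfftfreq(n_samples, 1.0 / fs)
    # Assumption: both band edges are inclusive
    return {name: int(np.sum((freqs >= lo) & (freqs <= hi)))
            for name, (lo, hi) in eeg_bands.items()}

two_sample_bins = np.fft.rfftfreq(2, 1.0 / fs)  # bin frequencies for a 2-sample trial
counts_short = band_bin_counts(2)               # 2 samples: bins at 0 Hz and 128 Hz
counts_long = band_bin_counts(fs)               # 1 second of data: 1 Hz resolution
```

With one second of data (256 samples) the bins are 1 Hz apart and every band receives at least one bin, so longer recordings than this toy dataset should not produce NaN.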
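Question

Now, for each subject, how can I apply the function extract_bands across the trials that belong to the same condition (cond)?

Basically, I want to return a dataset with one row per cond per subject and eight columns in total: 'subject', 'group', 'cond', and the values of the five EEG bands from the dictionary eeg_band_fft.
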
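Example

The code below uses groupby to do what I want (here, to compute averages), but I cannot figure out how to do the same with the function extract_bands.

pddf2 = pddf.groupby(["subject", "group", "cond"]).value.mean()  # take the mean
pddf2
#> subject group cond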
s1 a c1 29.456971
c2 40.561769
s2 a c1 30.981150
c2 54.863519
s3 c c1 32.280519
c2 32.283109
s4 c c1 48.112088
c2 21.653396
Name: value, dtype: float64

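Created on 2021-05-26

If you want to perform a custom aggregation on a DataFrame, use the agg function and pass it your custom function. Then convert the resulting column of dicts into a DataFrame, and finally concatenate the two DataFrames.

I would do it like this:
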
dfg = (pddf.groupby(["subject", "group", "cond"])
           .agg({'value': lambda x: extract_bands(x)})
           .reset_index()
       )
df_dict = pd.DataFrame.from_records(dfg['value'])
result = pd.concat([dfg.drop(columns=['value']), df_dict], axis=1)

This code returns the following DataFrame:

  subject group cond       Delta  Theta  Alpha  Beta  Gamma
0      s1     a   c1  117.827883    NaN    NaN   NaN    NaN
1      s1     a   c2   81.123538    NaN    NaN   NaN    NaN
2      s2     a   c1   61.962300    NaN    NaN   NaN    NaN
3      s2     a   c2  219.454077    NaN    NaN   NaN    NaN
4      s3     c   c1  129.122075    NaN    NaN   NaN    NaN
5      s3     c   c2   64.566219    NaN    NaN   NaN    NaN
6      s4     c   c1   96.224176    NaN    NaN   NaN    NaN
7      s4     c   c2   86.613583    NaN    NaN   NaN    NaN
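
The same table can also be built without an intermediate dict-valued column, by looping over the groups and merging each group key with its band dictionary. This is only a sketch under stated assumptions: the dataset omits the trial column (it is not needed for the grouping), the group and cond lists are inferred from the group means shown in the question, and this version of extract_bands treats both band edges as inclusive and returns NaN explicitly for empty bands.

```python
import numpy as np
import pandas as pd

fs = 256  # sampling frequency from the question
eeg_bands = {'Delta': (0, 4), 'Theta': (4, 8), 'Alpha': (8, 12),
             'Beta': (12, 30), 'Gamma': (30, 45)}

def extract_bands(data):
    """Mean FFT amplitude per EEG band; NaN when a band contains no bin."""
    values = np.asarray(data, dtype=float)
    fft_vals = np.absolute(np.fft.rfft(values))
    fft_freq = np.fft.rfftfreq(len(values), 1.0 / fs)
    out = {}
    for band, (lo, hi) in eeg_bands.items():
        in_band = (fft_freq >= lo) & (fft_freq <= hi)  # assumption: inclusive edges
        out[band] = fft_vals[in_band].mean() if in_band.any() else np.nan
    return out

# Assumption: subject/group/cond layout inferred from the question's output
pddf = pd.DataFrame({
    'subject': np.repeat(['s1', 's2', 's3', 's4'], 6),
    'group': np.repeat(['a', 'c'], 12),
    'cond': ['c1', 'c1', 'c1', 'c1', 'c2', 'c2', 'c1', 'c1', 'c2', 'c2', 'c2', 'c2',
             'c2', 'c2', 'c1', 'c1', 'c1', 'c1', 'c2', 'c2', 'c1', 'c1', 'c2', 'c2'],
    'value': [8.88260935, 82.97797122, 18.26659492, 7.70070742, 12.76417463,
              68.35936355, 7.59613253, 54.36616722, 9.11860667, 24.20324845,
              86.1674253, 99.96479613, 40.83798898, 23.72822971, 49.77969641,
              2.19459866, 30.3883309, 46.75944945, 11.47003917, 26.71771771,
              88.93251086, 7.29166478, 7.76880683, 40.65701944],
})

keys = ['subject', 'group', 'cond']
# One record per group: the three key columns plus the five band values
rows = [dict(zip(keys, key), **extract_bands(g['value']))
        for key, g in pddf.groupby(keys)]
result = pd.DataFrame(rows)
```

Building plain records keeps every intermediate step easy to inspect, at the cost of an explicit Python loop over the groups.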