Python: reformatting the table produced by a loop
I have a DataFrame that looks like this:
ID Covid_pos Asymptomatic Fever Cough ...
0 1 0 1 0
1 0 0 0 1
2 1 1 0 1
3 1 0 1 0
4 0 1 1 0
5 1 0 1 0
6 0 1 1 0
7 1 0 0 1
8 0 0 0 0
9 0 0 0 0
I wrote a for loop that produces the following output for each symptom variable against the outcome variable "COVID_POS":
import pandas as pd
import statsmodels.api as sm

exposure=['Cough',"Nasal_Congestion","Wheezing_Asthma","Abdominal_Pain","Diarrhea","Vomiting","Rash","Fever","MED_ALERT_CPR_SHOCK_SEPSIS","Lymph_Node_Neck","Ear","Mouth Sores","Eye","SOB_WOB_Hyp_Desat","PNA","Nausea","Weak_Fatigue","Bodyaches","Dizziness","Fussy","Poor_PO_Dehydration","Tachycardia","COVID Exposure","COVID Test","COVID PUI" ,"COVID MIS","COVID Kawasaki","CP","ST","HA","Loss_Taste_Smell"]
for symptom in exposure:
    # 2x2 table: rows are symptom absent/present, columns are Covid_pos 0/1
    CTab = pd.crosstab(LABS_TAT[symptom], LABS_TAT.Covid_pos)
    Odds = sm.stats.Table2x2(CTab)
    print(Odds.summary())
OUTPUT:
Problem statement: the output is correct, but I would like to reformat it so that it looks like the following table:
Symptom Odds Ratio LCB UCB
Cough 2.607 1.981 3.430
Nasal_Congestion 1.899 1.226 2.941
Wheezing_Asthma 0.739 0.373 1.46
...
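It can be seen that the first row of the summary is built from the oddsratio_confint and oddsratio_pvalue methods and the oddsratio attribute. Build a dictionary and convert it to a DataFrame: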
d = {'Symptom':[],'Odds Ratio':[],'LCB':[],'UCB':[]}
for symptom in exposure:
    CTab = pd.crosstab(LABS_TAT[symptom], LABS_TAT.Covid_pos)
    Odds = sm.stats.Table2x2(CTab)
    d['Symptom'].append(symptom)
    d['Odds Ratio'].append(Odds.oddsratio)
    lcb, ucb = Odds.oddsratio_confint()
    d['LCB'].append(lcb)
    d['UCB'].append(ucb)
results = pd.DataFrame(d)
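Thanks, just one small code tweak: "'UCB':[]}"
@Raven I can't help thinking there is a better way to get the crosstab statistics without a for loop, but I really don't know anything about statsmodels.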
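On the point about avoiding the for loop: one possible sketch skips statsmodels and computes the odds ratio and its normal-approximation (Wald) interval on the log scale for all symptoms at once with matrix products. It assumes every column is coded strictly 0/1 with no missing values and that no cell of any 2x2 table is zero (there is no zero-cell correction here). The function name odds_ratio_table and the hard-coded z value for a 95% interval are my own choices, and the data is again just the ten sample rows from the question.

```python
import numpy as np
import pandas as pd

def odds_ratio_table(df, exposure, outcome, z=1.959964):
    """Odds ratio and Wald interval for each 0/1 column in `exposure` (hypothetical helper)."""
    x = df[exposure].to_numpy(dtype=float)  # shape (n_rows, n_symptoms)
    y = df[outcome].to_numpy(dtype=float)   # shape (n_rows,)
    a = x.T @ y                # symptom present, outcome positive
    b = x.T @ (1 - y)          # symptom present, outcome negative
    c = (1 - x).T @ y          # symptom absent, outcome positive
    d = (1 - x).T @ (1 - y)    # symptom absent, outcome negative
    log_or = np.log(a * d / (b * c))
    se = np.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of the log odds ratio
    return pd.DataFrame({
        "Symptom": exposure,
        "Odds Ratio": np.exp(log_or),
        "LCB": np.exp(log_or - z * se),
        "UCB": np.exp(log_or + z * se),
    })

# Ten sample rows from the question (assumption: stand-in for the real LABS_TAT).
LABS_TAT = pd.DataFrame({
    "Covid_pos": [1, 0, 1, 1, 0, 1, 0, 1, 0, 0],
    "Fever":     [1, 0, 0, 1, 1, 1, 1, 0, 0, 0],
    "Cough":     [0, 1, 1, 0, 0, 0, 0, 1, 0, 0],
})
vectorized = odds_ratio_table(LABS_TAT, ["Fever", "Cough"], "Covid_pos")
print(vectorized.round(3).to_string(index=False))
```

For around thirty symptoms the loop is perfectly fine. This mainly pays off if you want to avoid building one crosstab per column, and it is worth checking against Table2x2 on your own data before relying on it.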