Pandas: How can I get a table with the correct counts from this data?
How can I get a result like the table below from the kind of data shown after my code? What I have tried gives me a wrong answer.
      A  A-  B  B+  B-  C  C+  C-  D  D+  D-  E  X  TOT
AGR   0   0  0   1   0  0   1   1  0   0   1  0  0    4
H/SC  0   0  1   0   0  1   0   0  0   0   2  0  0    4
CRE   0   0  0   1   0  0   0   1  1   1   0  0  0    4
GEO   0   0  1   0   0  0   1   0  0   0   2  0  0    4
CRE   0   0  0   0   0  0   0   0  0   0   2  0  0    2
My code:
columns = ["A",'A-','B+',"B",'B-','C+',"C",'C-','D+',"D",'D-',"E",'X']
sub_lists = list(df1[column_list])
sub_lists = pd.Series(sub_lists)
#print(sub_lists)
agr = (pd.crosstab(sub_lists, df1['AGR'], margins=True, margins_name='TOTAL').iloc[:,:-1].reindex(columns, axis=1, fill_value=0).rename_axis(None))
cre = (pd.crosstab(sub_lists, df1['CRE'], margins=True, margins_name='TOTAL').iloc[:,:-1].reindex(columns, axis=1, fill_value=0).rename_axis(None))
geo = (pd.crosstab(sub_lists, df1['GEO'], margins=True, margins_name='TOTAL').iloc[:,:-1].reindex(columns, axis=1, fill_value=0).rename_axis(None))
hsc = (pd.crosstab(sub_lists, df1['H/SC'], margins=True, margins_name='TOTAL').iloc[:,:-1].reindex(columns, axis=1, fill_value=0).rename_axis(None))
bst = (pd.crosstab(sub_lists, df1['BST'], margins=True, margins_name='TOTAL').iloc[:,:-1].reindex(columns, axis=1, fill_value=0).rename_axis(None))
c = pd.concat([agr, cre,geo,hsc,bst], axis=1, join='inner')
d = c.groupby(lambda x:x, axis=1).sum()
d['TOT'] = d[columns].sum(axis=1)
STREAM ADM NAME KCPE GEO CRE H/SC AGR BST
EAGLE 231 MITCHEL 279 D D+ E D- B
EAGLE 322 BEATRICE 268 E C- E D+ C
HAWK 654 BERYL 344 A C- E A- C
EAGLE 335 SOFI 266 E C- E D C
HAWK 321 LOICE 319 E D+ C- B A-
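HAWK 234 BETTY 284 E D-
Your solution is changed to use DataFrame.melt to build a two-column DataFrame, which is then passed to pd.crosstab. The iloc is changed to .iloc[:-1] to remove the last row rather than the last column, reindex adds the missing categories, and finally rename_axis removes the index and column names (variable, value):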
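Could the input data be provided as text?
Edited with text.
You make it look easy. Are you one of the people who created pandas?
Can I remove variable and value from the table?
@Ptar - No, the pandas developers are obviously busy and have no time to answer.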
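import pandas as pd

# df is the input DataFrame shown in the question (STREAM, ADM, NAME, KCPE and the subject columns)
columns = ["A",'A-','B+',"B",'B-','C+',"C",'C-','D+',"D",'D-',"E",'X']
cols = ['AGR','CRE','GEO','H/SC','BST']
# melt stacks the subject columns into two columns: 'variable' (subject) and 'value' (grade)
df1 = df[cols].melt()
# count grades per subject, drop the TOTAL row, add missing grade columns, remove axis names
df2 = (pd.crosstab(df1['variable'], df1['value'], margins=True, margins_name='TOTAL')
         .iloc[:-1]
         .reindex(columns + ['TOTAL'], fill_value=0, axis=1)
         .rename_axis(index=None, columns=None))
print(df2)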
      A  A-  B+  B  B-  C+  C  C-  D+  D  D-  E  X  TOTAL
AGR   0   1   0  1   0   0  0   0   1  2   1  0  0      6
BST   0   2   0  1   0   0  3   0   0  0   0  0  0      6
CRE   0   0   0  0   0   0  0   3   2  0   1  0  0      6
GEO   1   0   0  0   0   0  0   0   0  1   0  4  0      6
H/SC  0   0   0  0   0   0  0   1   0  0   0  5  0      6
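For anyone who wants to try the approach without the original file, here is a minimal self-contained sketch of the same melt + crosstab steps. The three-row DataFrame and the names `grades`, `subjects`, `long_df`, `counts` and `table` are made up for illustration and are not the asker's data; only the pandas calls mirror the answer above.

```python
import pandas as pd

# Hypothetical sample data (NOT the asker's table): three students,
# two subject columns holding letter grades.
df = pd.DataFrame({
    "STREAM": ["EAGLE", "HAWK", "EAGLE"],
    "GEO": ["D", "E", "A"],
    "CRE": ["D+", "C-", "C-"],
})

grades = ["A", "A-", "B+", "B", "B-", "C+", "C", "C-", "D+", "D", "D-", "E", "X"]
subjects = ["GEO", "CRE"]

# melt() stacks the subject columns into two columns:
# 'variable' (subject name) and 'value' (grade).
long_df = df[subjects].melt()

# crosstab counts each grade per subject; margins=True appends
# both a 'TOTAL' row and a 'TOTAL' column.
counts = pd.crosstab(long_df["variable"], long_df["value"],
                     margins=True, margins_name="TOTAL")

# .iloc[:-1] drops the last ROW (the TOTAL row) and keeps the TOTAL column;
# .iloc[:, :-1] would instead drop the last COLUMN.
table = (counts.iloc[:-1]
         .reindex(grades + ["TOTAL"], axis=1, fill_value=0)
         .rename_axis(index=None, columns=None))

print(table)
```

Each row's TOTAL should equal the number of students, since every student contributes exactly one grade per subject. That makes a quick sanity check on real data.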