Python: how to get the count of specific strings across rows?


My dataframe is shown below. I want the counts of D, T and N in each row added as new columns of the dataframe, named Dcount, Tcount and Ncount.

import pandas as pd

data = {'CHROM':['chr1', 'chr2', 'chr1', 'chr3', 'chr1','chr1', 'chr2', 'chr1'],
        'POS':[939570,3411794,1043223,22511093,24454031,3411794,22511093,1043223],
        'MI':['T', 'T', 'D', 'D', 'T', 'N', 'D', 'N'],
        'CSK':['D', 'D', 'N', 'T', 'N', 'D', 'T', 'T'],
        'DD':['N', 'D', 'D', 'D', 'T', 'N', 'D', 'N'],
        'RR':['D', 'T', 'N', 'T', 'D', 'D', 'T', 'N'],
        'RCB':['D', 'D', 'D', 'D', 'D', 'D', 'D', 'D'],
        'DC':['D', 'D', 'T', 'D', 'D', 'D', 'N', 'D']
       }
df1 = pd.DataFrame(data)
df1

I want to get the counts of T, D and N in the new dataframe.

Expected output:

    CHROM   POS      MI CSK DD  RR  RCB DC  Dcount  Tcount  Ncount
0   chr1    939570   T  D   N   D   D   D   4       1       1
1   chr2    3411794  T  D   D   T   D   D   4       2       0
2   chr1    1043223  D  N   D   N   D   T   3       1       2
3   chr3    22511093 D  T   D   T   D   D   4       2       0
4   chr1    24454031 T  N   T   D   D   D   3       2       1
5   chr1    3411794  N  D   N   D   D   D   4       0       2
6   chr2    22511093 D  T   D   T   D   N   3       2       1
7   chr1    1043223  N  T   N   N   D   D   2       1       3

Use DataFrame.iloc to select all columns from position 2 to the end of the DataFrame, count the values in each row with value_counts, replace missing values with 0 using fillna, then add a suffix with DataFrame.add_suffix and append the result to the original with DataFrame.join:

# Count D/T/N per row over the value columns (position 2 onward)
# and join the counts back as new columns.
# pd.Series.value_counts is used here because the top-level
# pd.value_counts is deprecated in recent pandas versions.
df1 = (df1.join(df1.iloc[:, 2:]
                   .apply(pd.Series.value_counts, axis=1)
                   .fillna(0)
                   .astype(int)
                   .add_suffix('count')))
print(df1)
  CHROM       POS MI CSK DD RR RCB DC  Dcount  Ncount  Tcount
0  chr1    939570  T   D  N  D   D  D       4       1       1
1  chr2   3411794  T   D  D  T   D  D       4       0       2
2  chr1   1043223  D   N  D  N   D  T       3       2       1
3  chr3  22511093  D   T  D  T   D  D       4       0       2
4  chr1  24454031  T   N  T  D   D  D       3       1       2
5  chr1   3411794  N   D  N  D   D  D       4       2       0
6  chr2  22511093  D   T  D  T   D  N       3       1       2
7  chr1   1043223  N   T  N  N   D  D       2       3       1

Or use DataFrame.stack together with groupby, value_counts and unstack:

# Alternative: run this on the original df1, before any count columns
# have been added, otherwise iloc[:, 2:] would also pick up the counts.
df1 = df1.join(df1.iloc[:, 2:]
                  .stack()
                  .groupby(level=0)
                  .value_counts()
                  .unstack(fill_value=0)
                  .add_suffix('count'))
print(df1)
  CHROM       POS MI CSK DD RR RCB DC  Dcount  Ncount  Tcount
0  chr1    939570  T   D  N  D   D  D       4       1       1
1  chr2   3411794  T   D  D  T   D  D       4       0       2
2  chr1   1043223  D   N  D  N   D  T       3       2       1
3  chr3  22511093  D   T  D  T   D  D       4       0       2
4  chr1  24454031  T   N  T  D   D  D       3       1       2
5  chr1   3411794  N   D  N  D   D  D       4       2       0
6  chr2  22511093  D   T  D  T   D  N       3       1       2
7  chr1   1043223  N   T  N  N   D  D       2       3       1

Unfortunately, this is a dupe. Please check the linked question. It has both of your answers.

@MayankPorwal - It is only partly a duplicate. Compared with the linked Q/A, this question additionally needs the first two columns filtered out with iloc and the result appended with join.

This question already exists.
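
Both outputs above list the new columns alphabetically (Dcount, Ncount, Tcount), while the question asks for Dcount, Tcount, Ncount. One possible way to control the order, offered only as a sketch and not as part of the original answer, is to compare the value columns against each letter with DataFrame.eq and sum the matches per row. The names values, counts and result are illustrative, and the data is repeated so the snippet runs on its own.

```python
import pandas as pd

data = {'CHROM': ['chr1', 'chr2', 'chr1', 'chr3', 'chr1', 'chr1', 'chr2', 'chr1'],
        'POS': [939570, 3411794, 1043223, 22511093, 24454031, 3411794, 22511093, 1043223],
        'MI': ['T', 'T', 'D', 'D', 'T', 'N', 'D', 'N'],
        'CSK': ['D', 'D', 'N', 'T', 'N', 'D', 'T', 'T'],
        'DD': ['N', 'D', 'D', 'D', 'T', 'N', 'D', 'N'],
        'RR': ['D', 'T', 'N', 'T', 'D', 'D', 'T', 'N'],
        'RCB': ['D', 'D', 'D', 'D', 'D', 'D', 'D', 'D'],
        'DC': ['D', 'D', 'T', 'D', 'D', 'D', 'N', 'D']}
df1 = pd.DataFrame(data)

# Value columns: everything after CHROM and POS.
values = df1.iloc[:, 2:]

# For each letter, mark matching cells as True and sum them across each row.
# The order of the list fixes the order of the resulting columns.
counts = pd.DataFrame({f"{letter}count": values.eq(letter).sum(axis=1)
                       for letter in ["D", "T", "N"]})

# Append the counts to a copy of the original frame (df1 itself is unchanged).
result = df1.join(counts)
print(result)
```

Because each letter is counted explicitly, a letter that never occurs in a row gives 0 directly, so no fillna or astype step is needed. The trade-off is that the set of letters has to be known in advance.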