Pandas: iterate over each row and compare DataFrame column values

pandas dataframe

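I have the following DataFrame. I want to iterate over each row and check whether the value in the score column is >= each value in a list of cutoffs.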

                         seq  score    status
7    TTGTTCTCTGTGTATTTCAGGCT  10.42  positive
56                 CAGGTGAGA   9.22  positive
64   AATTCCTGTGGACTTTCAAGTAT   1.23  positive
116                AAGGTATAT   7.84  positive
145                AAGGTAATA   8.49  positive
172                TGGGTAGGT   6.86  positive
204                CAGGTAGAG   7.10  positive
214  GCGTTTCTTGAATCCAGCAGGGA   3.58  positive
269                GAGGTAATG   8.73  positive
274  CACCCATTCCTGTACCTTAGGTA   8.96  positive
325                GCCGTAAGG   5.46  positive
356                GAGGTGAGG   8.41  positive

The code I have tried so far is:

cutoff_list_pos = []
number_list_pos = []

cut_off = range(0, int(new_df['score'].max())+1)

for co in cut_off:
    for df in df_elements:
        val = (df['score']>=co).value_counts()
        cutoff_list_pos.append(co)
        number_list_pos.append(val)

The desired output is:

   cutoff  true  false
0       0  12.0      0
1       1  12.0      0
and so on..

If the score is >= the cutoff, the row should be counted as true, otherwise as false.

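You can pass the values of cutoff_list_pos to the keys parameter when concatenating the per-cutoff counts, then convert the index to a column.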
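The code for this answer did not survive on the page, so the following is only a sketch of what the description suggests, not the answerer's original code. It assumes the function in question is `pd.concat` (the only detail given is a `keys` parameter), rebuilds the question's scores inline, and uses `out` and `counts` as illustrative names.

```python
import pandas as pd

# Scores copied from the DataFrame in the question.
df = pd.DataFrame({'score': [10.42, 9.22, 1.23, 7.84, 8.49, 6.86,
                             7.10, 3.58, 8.73, 8.96, 5.46, 8.41]})

cutoff_list_pos = list(range(0, int(df['score'].max()) + 1))

# One value_counts() Series per cutoff, indexed by True/False.
counts = [(df['score'] >= co).value_counts() for co in cutoff_list_pos]

# Assumption: keys= belongs to pd.concat. It labels each Series with its
# cutoff; axis=1 places the Series side by side, one column per cutoff.
out = (pd.concat(counts, axis=1, keys=cutoff_list_pos)
         .T                                              # cutoffs become rows
         .rename(columns={True: 'true', False: 'false'})
         .reindex(columns=['true', 'false'])             # both columns, fixed order
         .fillna(0)                                      # a missing count means 0
         .astype(int)
         .rename_axis(index='cutoff', columns=None)
         .reset_index())                                 # cutoff index -> column

print(out)
```

The `reindex` and `fillna` steps matter because `value_counts()` omits a label that never occurs: at cutoff 0 every comparison is True, so there is no False entry to count.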

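Another implementation:

import pandas as pd

# For each integer cutoff, count the rows whose score is >= that cutoff.
rows = []
for i in range(1, int(df['score'].max() + 1)):
    rows.append({'cutoff': i, 'true': (df['score'] >= i).sum()})
# Build the frame once; concatenating inside the loop re-copies it every pass.
res_df = pd.DataFrame(rows, index=[r['cutoff'] for r in rows])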

res_df
    cutoff true
1       1   12
2       2   11
3       3   11
4       4   10
5       5   10
6       6    9
7       7    8
8       8    6
9       9    2
10     10    1
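The loop above counts only the true rows and starts at cutoff 1, while the question asks for both true and false counts starting at cutoff 0. A possible vectorized variant, offered as a sketch rather than as part of either answer, compares every score against every cutoff with NumPy broadcasting. The names `scores`, `meets` and `summary` are illustrative, and the scores are copied from the question.

```python
import numpy as np
import pandas as pd

# Scores copied from the DataFrame in the question.
scores = pd.Series([10.42, 9.22, 1.23, 7.84, 8.49, 6.86,
                    7.10, 3.58, 8.73, 8.96, 5.46, 8.41], name='score')

# Integer cutoffs 0..floor(max score), as in the question's range().
cutoffs = np.arange(0, int(scores.max()) + 1)

# Broadcasting: a (1, n_scores) row against an (n_cutoffs, 1) column gives
# a boolean matrix of shape (n_cutoffs, n_scores).
meets = scores.to_numpy()[None, :] >= cutoffs[:, None]

true_counts = meets.sum(axis=1)  # scores meeting each cutoff
summary = pd.DataFrame({
    'cutoff': cutoffs,
    'true': true_counts,
    'false': len(scores) - true_counts,
})

print(summary)
```

This avoids a Python-level loop, at the cost of materializing an n_cutoffs × n_scores boolean array, which is negligible at this size but worth keeping in mind for very long score columns.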

I solved it with a different solution! It would be great to get the other solution working as well, but this one also works.
