Python: replace values greater than 1 in a large DataFrame

python pandas dataframe

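I'm trying to replace every number greater than 1 with 1 across the whole DataFrame, while leaving the existing 1s and 0s unchanged, with minimal effort. Thanks for your help.

My DataFrame looks like this, but with many more columns and rows: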

Report No   Apple   Orange   Lemon   Grape   Pear
One           5       0        2       1      1
Two           1       1        0       3      2
Three         0       0        2       1      3
Four          1       1        3       0      0
Five          4       0        0       1      1
Six           1       3        1       2      0

Expected output:

Report No   Apple   Orange   Lemon   Grape   Pear
One           1       0        1       1      1
Two           1       1        0       1      1
Three         0       0        1       1      1
Four          1       1        1       0      0
Five          1       0        0       1      1
Six           1       1        1       1      0

Use:

Edit: exclude the first column by name (this edits the DataFrame in place):
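A minimal sketch of what excluding the first column by name could look like. It assumes `DataFrame.clip(upper=1)` as the capping step, which is an assumption made for illustration and not necessarily the call this answer used. The sample frame is a cut-down version of the question's data.

```python
import pandas as pd

# Sample data: a few rows and columns from the question's table.
df = pd.DataFrame({
    "Report No": ["One", "Two", "Three"],
    "Apple": [5, 1, 0],
    "Orange": [0, 1, 0],
    "Lemon": [2, 0, 2],
})

# Select every column except 'Report No' by name.
value_cols = df.columns.drop("Report No")

# Cap the selected columns at 1; the assignment modifies df itself.
df[value_cols] = df[value_cols].clip(upper=1)
print(df)
```

Because the label column is never passed to `clip`, its string values are left untouched.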

You can try this:

df.set_index('Report No',inplace=True)
df[df>1]=1
df.reset_index()

Report No   Apple   Orange   Lemon   Grape   Pear
One           1       0        1       1      1
Two           1       1        0       1      1
Three         0       0        1       1      1
Four          1       1        1       0      0
Five          1       0        0       1      1
Six           1       1        1       1      0

If you have some non-numeric columns, there is also an option that needs neither `set_index` nor `reset_index`.
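One way to handle non-numeric columns without `set_index` and `reset_index` is sketched below. It assumes `select_dtypes(include="number")` to pick out the numeric columns; that choice is mine for illustration and may differ from the snippet originally posted here.

```python
import pandas as pd

# Cut-down sample based on the question's table.
df = pd.DataFrame({
    "Report No": ["One", "Two", "Three"],
    "Apple": [5, 1, 0],
    "Grape": [1, 3, 1],
})

# Names of the numeric columns; 'Report No' (strings) is left out.
numeric_cols = df.select_dtypes(include="number").columns

# Replace values greater than 1 with 1 in those columns only.
df[numeric_cols] = df[numeric_cols].mask(df[numeric_cols] > 1, 1)
print(df)
```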

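Alternatively, when dealing with multiple conditions, `np.select` can be helpful, for example if you want to convert values less than 0 to 0 and values greater than 1 to 1: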

import numpy as np

df.set_index('Report No', inplace=True)
condlist = [df >= 1, df <= 0]  # you can add more conditions, with matching choices
choice = [1, 0]  # len(condlist) must equal len(choice)
df.loc[:] = np.select(condlist, choice)
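To make the multi-condition case concrete, here is a small self-contained sketch of the same `np.select` pattern. The sample frame with a negative value is made up for illustration, and the conditions are built on the underlying NumPy array rather than on the DataFrame directly.

```python
import numpy as np
import pandas as pd

# Made-up sample with one negative value and some values above 1.
df = pd.DataFrame(
    {"Apple": [5, -2, 0], "Orange": [0, 1, 3]},
    index=pd.Index(["One", "Two", "Three"], name="Report No"),
)

values = df.to_numpy()

# Conditions are checked in order; each maps to the choice at the same position.
condlist = [values >= 1, values <= 0]
choicelist = [1, 0]

# np.select returns a plain array, so wrap it back into a DataFrame.
result = pd.DataFrame(
    np.select(condlist, choicelist), index=df.index, columns=df.columns
)
print(result)
```

Any element matching none of the conditions would get `np.select`'s `default`, which is 0 unless specified; with integers and these two conditions every element is covered.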

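Note: casting with `astype('bool')` only converts falsy values to `False` and truthy values to `True`, i.e. `0` becomes `False` and anything other than `0` becomes `True`, even `-1`: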

s = pd.Series([1,-1,0])
s.astype('bool')
0     True
1     True
2    False
dtype: bool

s.astype('bool').astype('int')
0    1
1    1
2    0
dtype: int32
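The practical consequence shows up when negative values are present. This sketch compares the bool round trip with `clip(0, 1)` on a made-up frame; `clip` is brought in here only for comparison.

```python
import pandas as pd

# Made-up sample containing negative values.
df = pd.DataFrame({"Apple": [5, -1, 0], "Pear": [1, 2, -3]})

# Bool round trip: every nonzero value, including negatives, becomes 1.
via_bool = df.astype(bool).astype(int)

# Clipping to [0, 1]: negatives become 0, values above 1 become 1.
via_clip = df.clip(0, 1)

print(via_bool)
print(via_clip)
```

For data that only contains non-negative integers, as in the question, both give the same result.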

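A quick and simple way is to iterate over all the keys of the DataFrame and change each column with the `where` function from numpy (a library you have to import). You pass the function the condition, the value to use where the condition holds, and the value to use where it does not. For your example it would look like this: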

import numpy as np

for x in df.keys()[1:]:  # skip the first key ('Report No')
    df[x] = np.where(df[x] > 1, 1, df[x])

Note that in the loop I skipped the first key, because its values are not integers.
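If the loop is not needed, the same `np.where` call can be applied to all value columns at once. A sketch, assuming (as in the loop) that the first column holds the labels and every other column is numeric:

```python
import numpy as np
import pandas as pd

# Cut-down sample based on the question's table.
df = pd.DataFrame({
    "Report No": ["One", "Two", "Three"],
    "Apple": [5, 1, 0],
    "Pear": [1, 2, 3],
})

# All columns after the first one, mirroring df.keys()[1:] in the loop.
value_cols = df.columns[1:]

# np.where on the whole 2-D block: 1 where the value exceeds 1,
# otherwise the original value.
df[value_cols] = np.where(df[value_cols] > 1, 1, df[value_cols])
print(df)
```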

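new_df = df.clip(0, 1) doesn't work well because the "Report No" column is a string. Is there any way to isolate "Report No" and still apply pandas.DataFrame.clip?

See the other answer.

Why isn't the last method recommended?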
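On isolating "Report No" before clipping: one possible approach is to move that column into the index, clip, and move it back. A sketch under that assumption, using the question's column names:

```python
import pandas as pd

# Cut-down sample based on the question's table.
df = pd.DataFrame({
    "Report No": ["One", "Two", "Three"],
    "Apple": [5, 1, 0],
    "Grape": [1, 3, 1],
})

# With 'Report No' as the index, clip only sees the numeric columns.
new_df = df.set_index("Report No").clip(0, 1).reset_index()
print(new_df)
```

This version returns a new DataFrame and leaves `df` as it was.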

# Variant with DataFrame.mask: replace values where the condition is True
df.set_index('Report No', inplace=True)
df.mask(df > 1, 1).reset_index()
Report No   Apple   Orange   Lemon   Grape   Pear
One           1       0        1       1      1
Two           1       1        0       1      1
Three         0       0        1       1      1
Four          1       1        1       0      0
Five          1       0        0       1      1
Six           1       1        1       1      0

# where() keeps values where the condition is True, so test for <= 1
df[df.columns[1:]] = df.iloc[:, 1:].where(df.iloc[:, 1:] <= 1, 1)

# Variant with a bool round trip: every nonzero value becomes 1
df.set_index('Report No', inplace=True)
df.astype('bool').astype('int')
