Python: conditionally adding to certain row values in pandas


python pandas math


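I have a dataset containing a column of values, but some rows in that column contain outliers (such as -9999999 or 9999999) caused by a system error, and I would like to try to correct them in pandas.

The original dataset looks like this: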

Value Column
-2092.925951
910.9736
-910.9736
-2024.96475
-2024.96475
999947.438 - (outlier)
67.4672
-999993.313 - (outlier)
9.8603
49.5318
17.5591

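I just want to add 1000000 to rows whose value is between -999999 and -800000, and subtract 1000000 from rows whose value is between 800000 and 999999.

An example of the desired dataset: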

Value Column
-2092.925951
910.9736
-910.9736
-2024.96475
-2024.96475
-52.562 - (fixed outlier with 999,947.438 - 1,000,000)
67.4672
6.687 - (fixed outlier with -999,993.313 + 1,000,000)
9.8603
49.5318
17.5591

Any help or ideas would be much appreciated.

Note: part of the code below refers to the "Value Column" column as "VC".

(
    df.assign(l=df['Value Column'].between(800000,999999)*-1000000)
    .assign(s=df['Value Column'].between(-999999,-800000)*1000000)
    .apply('sum', axis=1)
)

0    -2092.925951
1      910.973600
2     -910.973600
3    -2024.964750
4    -2024.964750
5      -52.562000
6       67.467200
7        6.687000
8        9.860300
9       49.531800
10      17.559100
dtype: float64
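The snippet above sums across every column in each row, so it appears to assume the frame holds only the one numeric column. A minimal sketch that touches just the target column, using boolean masks with `.loc`, might look like the following. The extra `label` column and the shortened sample data are assumptions added for illustration, not part of the original question.

```python
import pandas as pd

# Shortened sample data; "label" is a hypothetical extra column
df = pd.DataFrame({
    "Value Column": [-2092.925951, 910.9736, 999947.438, 67.4672, -999993.313],
    "label": ["a", "b", "c", "d", "e"],
})

col = "Value Column"

# Series.between is inclusive on both ends by default
high = df[col].between(800000, 999999)
low = df[col].between(-999999, -800000)

# Shift only the flagged rows; all other rows and columns are left alone
df.loc[high, col] -= 1000000
df.loc[low, col] += 1000000

fixed = df[col].round(3).tolist()
print(fixed)
```

Both masks are computed before either update, so a corrected value cannot be picked up by the second condition.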

In [7]: import pandas as pd

In [8]: l = [-2092.925951,910.9736,-910.9736,-2024.96475,-2024.96475,
   ...: 999947.438,67.4672,-999993.313,9.8603,49.5318,17.5591,]

In [9]: df = pd.DataFrame.from_dict({'VC':l})

In [10]: def check(value):
    ...:     # High outliers (range from the question): shift down by 1,000,000
    ...:     if 800000 <= value <= 999999:
    ...:         return value - 1000000
    ...:     # Low outliers (range from the question): shift up by 1,000,000
    ...:     elif -999999 <= value <= -800000:
    ...:         return value + 1000000
    ...:     return value
    ...: 
    ...: df['VC'] = df['VC'].apply(check)
    ...: 

In [11]: df
Out[11]: 
             VC
0  -2092.925951
1    910.973600
2   -910.973600
3  -2024.964750
4  -2024.964750
5    -52.562000
6     67.467200
7      6.687000
8      9.860300
9     49.531800
10    17.559100
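Applying a Python function element by element can be slow on large frames. As a rough sketch of the same rule in vectorized form, `numpy.select` can pick between the shifted and unshifted values; the ranges used here are the ones stated in the question, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

values = [-2092.925951, 910.9736, -910.9736, -2024.96475, -2024.96475,
          999947.438, 67.4672, -999993.313, 9.8603, 49.5318, 17.5591]
df = pd.DataFrame({"VC": values})

# One condition per correction; rows matching neither keep the default
conditions = [
    df["VC"].between(800000, 999999),    # high outliers
    df["VC"].between(-999999, -800000),  # low outliers
]
choices = [
    df["VC"] - 1000000,
    df["VC"] + 1000000,
]

df["VC"] = np.select(conditions, choices, default=df["VC"])
print(df)
```

Adding a third correction later would only need one more entry in each list.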


This looks like a good option as well, I'll give it a try, thanks!