Python: outlier detection and replacing outliers in the complete DataFrame

Tags: python, pandas, function, spyder, detection

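I built a function that detects outliers in a feature using the z-score method, and I tried to handle them by replacing each outlier with the upper bound. I cannot get the expected output from this function, so can you help me with it? The z-score detection thresholds for the upper and lower bounds are 3 and -3.

Comments:
Can you provide an example of your data and the expected result?
This does not address the output you provided.
What does it not address? It should help with the SettingWithCopyWarning.
You need to post the code, the output, and the expected output. In your case, what are the values of out_ind and low_ind? Do you even have any outliers? Please read up on how to ask questions on Stack Overflow so that you get the help you need.
The output has been pasted.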
# Function from the question
def outliers(column, creditCardData):

    creditCardData[column].describe()

    # z-score of every value in the column
    zscore = (creditCardData[column] - creditCardData[column].mean()) / creditCardData[column].std()
    no_of_out = sum(zscore > 3)
    print('No of outliers: ', no_of_out)

    # Bounds at three standard deviations either side of the mean
    upper_f = creditCardData[column].mean() + 3*creditCardData[column].std()
    lower_f = creditCardData[column].mean() - 3*creditCardData[column].std()

    no_of_out_up = sum(creditCardData[column] > upper_f)
    no_of_out_lo = sum(creditCardData[column] < lower_f)

    print('Removing outliers____________')

    # Chained indexing: the likely source of the SettingWithCopyWarning in the output
    creditCardData[column][creditCardData[column] > upper_f] = upper_f
    creditCardData[column][creditCardData[column] < lower_f] = lower_f

    # Recount against the same bounds after capping
    no_of_out_up = sum(creditCardData[column] > upper_f)
    no_of_out_lo = sum(creditCardData[column] < lower_f)

    print('Null values: ', creditCardData[column].isnull().sum())


outliers('PURCHASES', creditCardData)
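
The count `sum(zscore > 3)` in the function above only looks at the upper tail, although the question states thresholds of both 3 and -3. Here is a minimal sketch of counting each tail separately. The sample frame is made up for illustration and is not the asker's data.

```python
import pandas as pd

# Hypothetical sample: twenty ordinary values and one extreme value.
df = pd.DataFrame({"PURCHASES": [10.0] * 10 + [12.0] * 10 + [500.0]})

col = df["PURCHASES"]
zscore = (col - col.mean()) / col.std()

# Boolean masks; .sum() counts the True entries.
high = zscore > 3
low = zscore < -3

print("above +3:", int(high.sum()))
print("below -3:", int(low.sum()))
print("either side:", int((zscore.abs() > 3).sum()))
```

With this sample only the value 500.0 is flagged, on the upper side. Using `zscore.abs() > 3` is one way to count both tails in a single pass.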




outliers('ONEOFF_PURCHASES',creditCardData)
No of outliers:  422
Removing outliers____________
Null values:  0
<ipython-input-137-83ef36d41cf4>:15: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame
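
This warning is what pandas emits for chained indexing. An expression of the form `frame[column][mask] = value` first selects the column and then assigns into that intermediate object, so pandas cannot guarantee that the write reaches the original frame. Below is a minimal sketch of the single-step `.loc` form. The frame and the cap value are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical data; only the column name mirrors the question.
df = pd.DataFrame({"ONEOFF_PURCHASES": [5.0, 7.0, 6.0, 900.0]})
cap = 100.0  # illustrative cap, not a 3-sigma bound

# One indexing operation: rows chosen by the mask, column chosen by label.
mask = df["ONEOFF_PURCHASES"] > cap
df.loc[mask, "ONEOFF_PURCHASES"] = cap

print(df["ONEOFF_PURCHASES"].tolist())  # [5.0, 7.0, 6.0, 100.0]
```

If the frame being modified was itself produced by slicing another DataFrame, the warning can still appear. Taking an explicit `.copy()` when creating that frame is the usual remedy.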
# Function from the answer
def Outliers(col_name, creditCardData):

    # Bounds at three standard deviations either side of the mean
    mean = creditCardData[col_name].mean()
    std = creditCardData[col_name].std()

    upper = mean + 3 * std
    lower = mean - 3 * std

    print('Upper bound: ', upper)
    print('Lower bound: ', lower, '\n')

    no_of_out_up = (creditCardData[col_name] > upper).sum()
    no_of_out_lo = (creditCardData[col_name] < lower).sum()

    print('No of outliers above upperbound: ', no_of_out_up)
    print('No of outliers below lowerbound: ', no_of_out_lo, '\n')

    print('Removing outliers____________\n')
    # A single .loc call writes to creditCardData itself, which avoids the
    # chained indexing that raises SettingWithCopyWarning
    creditCardData.loc[creditCardData[col_name] > upper, col_name] = upper
    creditCardData.loc[creditCardData[col_name] < lower, col_name] = lower

    # Recount against the same bounds; both counts should now be zero
    no_of_out_up = (creditCardData[col_name] > upper).sum()
    no_of_out_lo = (creditCardData[col_name] < lower).sum()

    print('No of outliers above upperbound: ', no_of_out_up)
    print('No of outliers below lowerbound: ', no_of_out_lo, '\n')
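
As an alternative to two masked assignments, the capping step can be written with `Series.clip`. The sketch below is not the answerer's method. The helper name `cap_outliers`, its `n_std` parameter, and the sample frame are all assumptions introduced for illustration.

```python
import pandas as pd

def cap_outliers(frame, col_name, n_std=3.0):
    """Return a copy of frame with col_name clipped to mean +/- n_std * std."""
    mean = frame[col_name].mean()
    std = frame[col_name].std()
    lower, upper = mean - n_std * std, mean + n_std * std

    # Work on a copy so the caller's frame is left untouched.
    capped = frame.copy()
    capped[col_name] = frame[col_name].clip(lower=lower, upper=upper)
    return capped, lower, upper

# Hypothetical sample: twenty ordinary values and one extreme value.
df = pd.DataFrame({"PURCHASES": [10.0] * 10 + [12.0] * 10 + [500.0]})
capped, lower, upper = cap_outliers(df, "PURCHASES")

print("above bound after capping:", int((capped["PURCHASES"] > upper).sum()))
print("original maximum:", df["PURCHASES"].max())
```

Returning a new frame instead of mutating the argument is a design choice. It keeps the original data available for comparison. Note that capping changes the column's mean and standard deviation, so recomputing z-scores on the capped data can flag values that the first pass did not.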