Python MinMaxScaler not scaling correctly


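I am using sklearn MinMaxScaler code downloaded from Lynda.com to scale my data sets for a prediction script. The feature_range should be (0, 1), but I noticed that some columns in my scaled trial data contain values greater than 1. I believe this is what makes my predictions incorrect. Can anyone help? Here is the code I am using: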

import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Load training data set from CSV file
training_data_df = pd.read_csv("10596_data_training.csv")

# Load testing data set from CSV file
test_data_df = pd.read_csv("10596_data_test.csv")

# Load the trial data set from CSV file
trial_data_df = pd.read_csv("day05.csv")

# Data needs to be scaled to a small range like 0 to 1 for the neural
# network to work well.
scaler = MinMaxScaler(feature_range=(0, 1))

# Scale both the training inputs and outputs
scaled_training = scaler.fit_transform(training_data_df)
scaled_testing = scaler.transform(test_data_df)
scaled_trial = scaler.transform(trial_data_df)

# Print out the adjustment that the scaler applied to column 40 (total_hours) of the data
print("Note: total_hours values were scaled by multiplying by {:.10f} and adding {:.6f}".format(scaler.scale_[40], scaler.min_[40]))

# Create new pandas DataFrame objects from the scaled data
scaled_training_df = pd.DataFrame(scaled_training, columns=training_data_df.columns.values)
scaled_testing_df = pd.DataFrame(scaled_testing, columns=test_data_df.columns.values)
scaled_trial_df = pd.DataFrame(scaled_trial, columns=trial_data_df.columns.values)

# Save scaled data dataframes to new CSV files
scaled_training_df.to_csv("10596_data_training_scaled.csv", index=False)
scaled_testing_df.to_csv("10596_data_test_scaled.csv", index=False)
scaled_trial_df.to_csv("day05_scaled.csv", index=False)

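You are "training" (fitting) the MinMaxScaler on one subset of the data and then transforming a different subset. For each column, MinMaxScaler simply subtracts the training set's minimum and divides by the training set's range (maximum minus minimum). If the trial set contains values larger than the training maximum or smaller than the training minimum, the scaled values will fall outside the [0, 1] range. This is expected and acceptable.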
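The behavior described above can be reproduced with a minimal sketch. The tiny one-column arrays `train` and `trial` are made-up stand-ins for the CSV files in the question, not the asker's data:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical one-column data sets standing in for the CSV files.
train = np.array([[10.0], [20.0], [30.0]])   # training min = 10, max = 30
trial = np.array([[5.0], [25.0], [40.0]])    # 5 and 40 lie outside [10, 30]

scaler = MinMaxScaler(feature_range=(0, 1))
scaled_train = scaler.fit_transform(train)   # learns min/max from train only
scaled_trial = scaler.transform(trial)       # reuses the training min/max

# The same transformation written out by hand, using training statistics.
train_min = train.min(axis=0)
train_range = train.max(axis=0) - train_min
manual_trial = (trial - train_min) / train_range

print(scaled_train.ravel())  # stays within [0, 1]
print(scaled_trial.ravel())  # contains a value below 0 and a value above 1
```

If out-of-range values are a problem for a downstream step, recent scikit-learn releases also accept `MinMaxScaler(clip=True)`, which clamps transformed values to the feature range. Whether clamping is appropriate depends on the model, so treat it as an option rather than a fix.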

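Comment: To rescale the data back from the 0-to-1 range, is an offset of 0 acceptable? For example: prediction = prediction + 0.00000, then prediction = prediction / 0.0045372051.

Reply: Sure. The only reason you scale in the first place is that many ML algorithms do not cope well with features on different scales. In general, there is nothing special about a zero-valued data point.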
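Undoing the scaling on a prediction might look like the sketch below. It assumes a hypothetical single-column training set and a hypothetical model output `scaled_prediction`; the fitted `scale_` and `min_` attributes play the role of the hard-coded constants in the comment above, and `inverse_transform` performs the same arithmetic.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical training column; the real values come from the training CSV.
train = np.array([[100.0], [300.0], [500.0]])
scaler = MinMaxScaler(feature_range=(0, 1)).fit(train)

scale = scaler.scale_[0]   # multiplier applied when scaling
offset = scaler.min_[0]    # amount added when scaling (may be 0 or negative)

scaled_prediction = 0.0    # hypothetical model output in the scaled space

# Forward transform is: scaled = original * scale + offset,
# so the inverse subtracts the offset and divides by the scale.
manual = (scaled_prediction - offset) / scale

# Same result using the scaler's own inverse.
builtin = scaler.inverse_transform([[scaled_prediction]])[0, 0]

print(manual, builtin)
```

Reading the constants from the fitted scaler avoids copying rounded values out of a printed message.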