Machine learning: testing a prediction model with user input

machine-learning scikit-learn

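I am a beginner in ML, but I am working on a university project. I have successfully trained a model, but I am not sure how to test it with user input. My project checks whether the data entered for a person indicates diabetes.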

Data CSV:

Pregnancies Glucose BloodPressure SkinThickness Insulin BMI DiabetesPedigreeFunction Age Outcome
6   148 72  35  0   33.6    0.627   50  1
1   85  66  29  0   26.6    0.351   31  0
8   183 64  0   0   23.3    0.672   32  1
1   89  66  23  94  28.1    0.167   21  0
0   137 40  35  168 43.1    2.288   33  1
5   116 74  0   0   25.6    0.201   30  0
3   78  50  32  88  31  0.248   26  1
10  115 0   0   0   35.3    0.134   29  0
2   197 70  45  543 30.5    0.158   53  1

Code:

from sklearn import metrics
from sklearn.ensemble import RandomForestClassifier

# Train a random forest on the training split.
random_forest_model = RandomForestClassifier(random_state=10)
random_forest_model.fit(X_train, y_train.ravel())

# Measure accuracy on the held-out test split.
predict_test_data = random_forest_model.predict(X_test)
print("Accuracy = {0:.3f}".format(metrics.accuracy_score(y_test, predict_test_data)))

User input code:

print("Enter your own data to test the model:")
pregnancy = int(input("Enter Pregnancy:"))
glucose = int(input("Enter Glucose:"))
bloodpressure = int(input("Enter Blood Pressure:"))
skinthickness = int(input("Enter Skin Thickness:"))
insulin = int(input("Enter Insulin:"))
bmi = float(input("Enter BMI:"))
DiabetesPedigreeFunction = float(input("Enter DiabetesPedigreeFunction:"))
age = int(input("Enter Age:"))
# Keep the values in the same order as the training feature columns.
userInput = [pregnancy, glucose, bloodpressure, skinthickness, insulin, bmi,
             DiabetesPedigreeFunction, age]

I want it to return 1 if the person is diabetic, or 0 if the person is not diabetic.

Edit: adding X_train and y_train:

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

feature_columns = ['Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness', 'Insulin', 'BMI', 'DiabetesPedigreeFunction', 'Age']
predicted_class = ['Outcome']

# data is the DataFrame loaded from the CSV shown above.
X = data[feature_columns].values
y = data[predicted_class].values

# Hold out 30% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=10)

random_forest_model = RandomForestClassifier(random_state=10)
random_forest_model.fit(X_train, y_train.ravel())
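
The question's code relies on a `data` DataFrame that is never shown. A minimal sketch of how it might be built, assuming pandas is used and taking only the nine sample rows posted above in place of the full CSV file (the `csv_text` name and the inline string are illustrative assumptions, not part of the original code):

```python
import io

import pandas as pd
from sklearn.model_selection import train_test_split

# Assumption: the nine sample rows from the question stand in for the real CSV file.
csv_text = """\
Pregnancies Glucose BloodPressure SkinThickness Insulin BMI DiabetesPedigreeFunction Age Outcome
6 148 72 35 0 33.6 0.627 50 1
1 85 66 29 0 26.6 0.351 31 0
8 183 64 0 0 23.3 0.672 32 1
1 89 66 23 94 28.1 0.167 21 0
0 137 40 35 168 43.1 2.288 33 1
5 116 74 0 0 25.6 0.201 30 0
3 78 50 32 88 31 0.248 26 1
10 115 0 0 0 35.3 0.134 29 0
2 197 70 45 543 30.5 0.158 53 1
"""

# The sample is whitespace-separated, so split columns on runs of whitespace.
data = pd.read_csv(io.StringIO(csv_text), sep=r"\s+")

feature_columns = ['Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness',
                   'Insulin', 'BMI', 'DiabetesPedigreeFunction', 'Age']

X = data[feature_columns].values  # 2D array: one row per person, one column per feature
y = data['Outcome'].values        # 1D array of labels

# Same split settings as in the question.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=10)
print(X_train.shape, X_test.shape)
```

With a real file on disk, the `io.StringIO(...)` argument would presumably be replaced by the path to the CSV, and `sep` adjusted to the file's actual delimiter.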

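Try random_forest_model.predict([userInput]), that is, wrap userInput in a list so that it is passed as a single row.

This works because the model expects multiple inputs (a 2D array, one row per observation) and returns a prediction for each row (a list with one entry per observation).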
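
A minimal, self-contained sketch of that idea. It assumes the nine sample rows from the question as a tiny stand-in training set, and the `userInput` values are hypothetical placeholders for what `input()` would collect:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Stand-in training data: the nine sample rows from the question (features only).
X_train = np.array([
    [6, 148, 72, 35, 0, 33.6, 0.627, 50],
    [1, 85, 66, 29, 0, 26.6, 0.351, 31],
    [8, 183, 64, 0, 0, 23.3, 0.672, 32],
    [1, 89, 66, 23, 94, 28.1, 0.167, 21],
    [0, 137, 40, 35, 168, 43.1, 2.288, 33],
    [5, 116, 74, 0, 0, 25.6, 0.201, 30],
    [3, 78, 50, 32, 88, 31, 0.248, 26],
    [10, 115, 0, 0, 0, 35.3, 0.134, 29],
    [2, 197, 70, 45, 543, 30.5, 0.158, 53],
])
y_train = np.array([1, 0, 1, 0, 1, 0, 1, 0, 1])  # Outcome column of the same rows

random_forest_model = RandomForestClassifier(random_state=10)
random_forest_model.fit(X_train, y_train)

# Hypothetical user input: one person, eight values in the training column order.
userInput = [2, 120, 70, 30, 80, 28.0, 0.4, 35]

# predict() expects shape (n_samples, n_features), so wrap the single row in a list.
prediction = random_forest_model.predict([userInput])

# The result holds one entry per input row; take the first and only one.
result = int(prediction[0])
print(result)  # 1 for diabetic, 0 for non-diabetic
```

An equivalent alternative is `np.array(userInput).reshape(1, -1)`, which also produces a single-row 2D array.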
