Python: IndexError "only integers, slices ... are valid indices" on df_valid
I am using Python 3.6.8. I use a loop to convert the values in certain columns to int:
for i in cols:
    df_valid[[i]] = df_valid[[i]].astype(int)
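This raises the following error:
Error:
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
As the full code below shows, I used the same approach on df_train, where it did not produce any error. I think the cause must be this line:
df_valid = imputer.transform(df_valid)
However, I have not been able to fix it.
Could you help and point me toward a way to resolve this error?
My full code is shown below: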
import argparse
import os

import joblib
import pandas as pd
from sklearn.impute import KNNImputer
from sklearn import metrics

import config
import model_dispatcher


def run(fold, model):
    df = pd.read_csv(config.TRAINING_FILE)

    df["Gender"] = df["Gender"].map({"Male": 1, "Female": 0})
    df["Married"] = df["Married"].map({"No": 0, "Yes": 1})
    df["Self_Employed"] = df["Self_Employed"].map({"No": 0, "Yes": 1})
    df["Dependents"] = df["Dependents"].map({"0": 0, "1": 1, "2": 2, "3+": 3})
    df["Education"] = df["Education"].map({"Graduate": 1, "Not Graduate": 0})
    df["Loan_Status"] = df["Loan_Status"].map({"N": 0, "Y": 1})

    cols = ["Gender",
            "Married",
            "Dependents",
            "Education",
            "Self_Employed",
            "Credit_History",
            "Loan_Status"]

    dummy = pd.get_dummies(df["Property_Area"])
    df = pd.concat([df, dummy], axis=1)
    df = df.drop(["Loan_ID", "Property_Area"], axis=1)

    df_train = df[df.kfold != fold].reset_index(drop=True)
    df_valid = df[df.kfold == fold].reset_index(drop=True)

    imputer = KNNImputer(n_neighbors=18)
    df_train = pd.DataFrame(imputer.fit_transform(df_train),
                            columns=df_train.columns)
    for i in cols:
        df_train[[i]] = df_train[[i]].astype(int)

    df_valid = imputer.transform(df_valid)
    for i in cols:
        df_valid[[i]] = df_valid[[i]].astype(int)

    df_train['GxM'] = df_train.apply(lambda row:
                                     (row['Gender']*row['Married']),
                                     axis=1)
    df_train['Income_sum'] = (
        df_train.apply(lambda row:
                       (row['ApplicantIncome'] +
                        row['CoapplicantIncome']),
                       axis=1))
    df_train['DxE'] = df_train.apply(lambda row: (row['Education'] *
                                                  row['Dependents']),
                                     axis=1)
    df_train['DxExG'] = (
        df_train.apply(lambda row:
                       (row['Education'] *
                        row['Dependents'] *
                        row['Gender']),
                       axis=1))

    df_valid['GxM'] = df_valid.apply(lambda row:
                                     (row['Gender']*row['Married']),
                                     axis=1)
    df_valid['Income_sum'] = (
        df_valid.apply(lambda row:
                       (row['ApplicantIncome'] +
                        row['CoapplicantIncome']),
                       axis=1))
    df_valid['DxE'] = df_valid.apply(lambda row: (row['Education'] *
                                                  row['Dependents']),
                                     axis=1)
    df_valid['DxExG'] = (
        df_valid.apply(lambda row:
                       (row['Education'] *
                        row['Dependents'] *
                        row['Gender']),
                       axis=1))

    X_train = df_train.drop("Loan_Status", axis=1).values
    y_train = df_train.Loan_Status.values
    X_valid = df_valid.drop("Loan_Status", axis=1).values
    y_valid = df_valid.Loan_Status.values

    clf = model_dispatcher.models[model]
    clf.fit(X_train, y_train)
    preds = clf.predict(X_valid)

    rascore = metrics.roc_auc_score(y_valid, preds)
    print(f"Fold = {fold}, ROC-AUC = {rascore}")

    joblib.dump(
        clf,
        os.path.join(config.MODEL_OUTPUT, f"dt_{fold}.bin")
    )


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--fold", type=int)
    parser.add_argument("--model", type=str)
    args = parser.parse_args()
    run(fold=args.fold, model=args.model)
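To convert all columns to a numeric dtype, you can simply use:
df_valid.apply(pd.to_numeric).dtypes
The trailing .dtypes only displays the resulting column types. For more details on pd.to_numeric, see its documentation.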
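For context on the error in the question: by default, KNNImputer.transform returns a NumPy array rather than a DataFrame, so the column labels are gone and indexing by name fails with the IndexError quoted above. df_train did not fail because its imputed output was wrapped in pd.DataFrame first. The following is a minimal sketch of doing the same for the validation fold. The toy frames and the names train, valid, raw and valid_imputed are illustrative assumptions, not part of the original code.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer

# Toy data standing in for the training and validation folds (hypothetical values).
train = pd.DataFrame({"Gender": [1.0, 0.0, 1.0, np.nan],
                      "Married": [1.0, np.nan, 0.0, 1.0]})
valid = pd.DataFrame({"Gender": [np.nan, 0.0],
                      "Married": [1.0, 1.0]})

imputer = KNNImputer(n_neighbors=2)
imputer.fit(train)

# transform() returns a NumPy array by default, so column labels are lost;
# indexing it with a list of column names is what triggers the IndexError.
raw = imputer.transform(valid)
is_array = isinstance(raw, np.ndarray)

# Rebuild a DataFrame with the original column labels before indexing by name.
valid_imputed = pd.DataFrame(raw, columns=valid.columns)
for col in ["Gender", "Married"]:
    valid_imputed[col] = valid_imputed[col].astype(int)
```

Applied to the question's code, this would amount to wrapping imputer.transform(df_valid) in pd.DataFrame(..., columns=df_valid.columns), mirroring what is already done for df_train.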
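You might also want to read more here about converting data to different data types.

I have just tidied up some of the code. Please check that it looks OK. P.S. I did not downvote. I think there is a lot of code here. You may want to share only the relevant piece of code, so that it is clear what we need to look at. Too much noise (too much code) distracts from the main problem.

I have clarified the question. See if it looks OK now.

Did you mean to write df_valid[i] instead of df_valid[[i]]? Also, if you want to convert all columns to integers, you do not have to loop over them like this. You can use df_valid.apply(pd.to_numeric).dtypes to convert all columns to the integer datatype.

@JoeFerndz Thanks for your help. pd.to_numeric works! :)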
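As a side note on the loop-free conversion discussed in the comments: pd.to_numeric parses values into a numeric dtype but leaves float columns as float64, so if integer columns are specifically wanted, DataFrame.astype with a column-to-dtype mapping is one option. This is a small sketch under the assumption that df_valid is already a DataFrame with no missing values in the selected columns. The sample values and the LoanAmount column are made up for illustration.

```python
import pandas as pd

# Hypothetical frame with float-coded categorical columns, as left by imputation.
df_valid = pd.DataFrame({"Gender": [1.0, 0.0],
                         "Married": [0.0, 1.0],
                         "LoanAmount": [128.5, 66.0]})
cols = ["Gender", "Married"]

# pd.to_numeric keeps float data as float64; it does not force integers.
numeric_dtype = pd.to_numeric(df_valid["Gender"]).dtype

# Cast only the selected columns to int in one call; other columns are untouched.
df_valid = df_valid.astype({c: int for c in cols})
```

The dictionary form avoids the explicit for loop and leaves continuous columns such as the illustrative LoanAmount as floats.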