Python: ValueError when using np.load
Hi, I have a main dataset from which I extract training and test datasets based on some conditions. When I try to import the training dataset, I get this error:

ValueError                                Traceback (most recent call last)
in <module>()
     11
     12 train = pd.read_csv("/content/path(1).csv")
---> 13 X_train = np.load(train + "/X_220_1020.npy", allow_pickle=True, encoding='latin1')
     14 Y_train = np.load(train + "/Y.npy", allow_pickle=True, encoding='latin1')
     15 print("X_train size:", X_train.shape)

10 frames
/usr/local/lib/python3.7/dist-packages/numexpr/necompiler.py in getType(a)
    702         return bytes
    703     if kind == 'U':
--> 704         raise ValueError('NumExpr 2 does not support Unicode as a dtype.')
    705     raise ValueError("unknown type %s" % a.dtype.name)
    706

ValueError: NumExpr 2 does not support Unicode as a dtype.

Here is my code:
import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.model_selection import KFold, StratifiedKFold, train_test_split, GridSearchCV
from sklearn.metrics import precision_score, recall_score, accuracy_score, balanced_accuracy_score, f1_score, matthews_corrcoef, classification_report, make_scorer
from sklearn.linear_model import LogisticRegression
import matplotlib.pyplot as plt
from xlwt import Workbook
from sklearn.metrics import confusion_matrix
from sklearn.utils.multiclass import unique_labels
df = pd.read_csv("/content/CPM.csv", index_col=0)  # read the main CPM dataset (67k × 719)
v = df.transpose()  # transposed: (719 × 67k)
# the sample lists of the datasets that we want to extract from the main dataset
training = pd.read_excel("/content/train.test.xlsx", sheet_name="train")
testing = pd.read_excel("/content/train.test.xlsx", sheet_name="test")
feature_space = "/content/FEATURE_SPACES.xlsx"
# the list of feature space names
name = ["FEATURE_SPACE1", "FEATURE_SPACE2", "FEATURE_SPACE3(LIMMA)", "FEATURE_SPACE4(LIMMA)", "FEATURE_SPACE5", "FEATURE_SPACE6", "FEATURE_SPACE7", "FEATURE_SPACE9"]
# if we want to choose from the feature spaces Excel file
def extract2(path, sheet_name, dataset):
    df1 = pd.read_excel(path, sheet_name=sheet_name)  # read the sub-dataset sheet
    list_of = df1['isoform'].values.tolist()  # extract the isoform names as a list
    data = v[np.intersect1d(v.columns, list_of)]  # keep the isoforms shared by the main dataset and the sub-dataset
    data = data.reset_index()
    data = data.rename(columns={data.columns[0]: "sample_id"})
    x = dataset['sample_id'].values.tolist()
    data1 = data.loc[data['sample_id'].isin(x)]  # keep only the samples listed in the given split
    data1.to_csv("path.csv", index=False)  # save as a CSV file
    return data1
c = extract2(feature_space, name[1], training)  # for example, extract one feature space (note: name[1] is "FEATURE_SPACE2")
train = pd.read_csv("/content/path.csv")
X_train = np.load(train + "/X_220_1020.npy", allow_pickle=True)  # the traceback points at this line
Y_train = np.load(train + "/Y.npy", allow_pickle=True)
print("X_train size:", X_train.shape)
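One thing that may help when reading the traceback: in the code above, train is the DataFrame returned by pd.read_csv, not a directory path, so train + "/X_220_1020.npy" is evaluated by pandas as elementwise arithmetic before np.load ever sees it. The sketch below uses a tiny made-up frame (the sample_id values are hypothetical) to show what that expression produces. On a large frame with numeric columns, pandas may hand the same + to numexpr when it is installed, which would be consistent with the NumExpr frames in the traceback; that part is an inference, not something confirmed in the thread.

```python
import pandas as pd

# Hypothetical stand-in for the table read from path.csv
train = pd.DataFrame({"sample_id": ["s1", "s2"]})

# Adding a string to a DataFrame is applied to every cell;
# the result is another DataFrame, not a file path.
expr = train + "/X_220_1020.npy"

print(type(expr).__name__)
print(expr.iloc[0, 0])
```

np.load expects a file name, a path object, or an open file, so a DataFrame is not a valid argument even in the cases where the addition itself succeeds.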
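Please paste the entire error trace, including the specific line where the error occurs. The ValueError looks like it is raised from numexpr, but the code posted here does not involve numexpr. It seems to be a problem with how the data in X_220_1020.npy and Y.npy was produced.
@jhso I updated the question and added the entire error trace.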
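If the intent is to load X_220_1020.npy and Y.npy from a folder, a minimal sketch is to build the path from a plain string rather than from the DataFrame. The helper name load_xy and the data_dir argument are hypothetical, and the demo writes throwaway arrays to a temporary directory because the real location of the .npy files is not given in the question.

```python
import os
import tempfile

import numpy as np


def load_xy(data_dir, x_name="X_220_1020.npy", y_name="Y.npy"):
    """Load the feature and label arrays from a directory given as a path."""
    # Guard against passing a DataFrame (or anything else that is not a path).
    if not isinstance(data_dir, (str, os.PathLike)):
        raise TypeError(f"data_dir must be a path, got {type(data_dir).__name__}")
    X = np.load(os.path.join(data_dir, x_name), allow_pickle=True)
    Y = np.load(os.path.join(data_dir, y_name), allow_pickle=True)
    return X, Y


# Demo with throwaway arrays; replace tmp with the real folder.
with tempfile.TemporaryDirectory() as tmp:
    np.save(os.path.join(tmp, "X_220_1020.npy"), np.zeros((4, 3)))
    np.save(os.path.join(tmp, "Y.npy"), np.arange(4))
    X_train, Y_train = load_xy(tmp)

print("X_train size:", X_train.shape)
```

If the features are instead meant to come from the CSV written by extract2, the .npy files are not needed at all and the DataFrame can be converted with its to_numpy method.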