Python: getting the error "Expected 2D array, got 1D array instead" when splitting a dataset (CSV) to build a linear model

Tags: python, jupyter-notebook

I split my dataset like this:

X = []
y = []
# first, compute the number of samples in the training set:
n_train = int(len(df) * 0.7)

# The training set is the first n_train samples in the dataset
X_train = df[: n_train]
Y_train = df[: n_train] # INSERT YOUR CODE HERE

# The test set is the remaining samples in the dataset
X_test = df[n_train:] 
Y_test = df[n_train:]

# Print the number of samples in the training set
print('The number of samples in the training set:')
# INSERT YOUR CODE HERE
print(len(Y_train))

# Print the number of samples in the test set
print('The number of samples in the test set:')
# INSERT YOUR CODE HERE
print(len(Y_test))
Next, I created a linear model like this:

lr = linear_model.LinearRegression()
But when I try to fit my training data to it:

lr.fit(X_train, Y_train)
I get this error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-65-9d85ca185925> in <module>
      2 
      3 # INSERT YOUR CODE HERE
----> 4 lr.fit(X_train, Y_train)

~\Anaconda3\ana01\lib\site-packages\sklearn\linear_model\base.py in fit(self, X, y, sample_weight)
    456         n_jobs_ = self.n_jobs
    457         X, y = check_X_y(X, y, accept_sparse=['csr', 'csc', 'coo'],
--> 458                          y_numeric=True, multi_output=True)
    459 
    460         if sample_weight is not None and np.atleast_1d(sample_weight).ndim > 1:

~\Anaconda3\ana01\lib\site-packages\sklearn\utils\validation.py in check_X_y(X, y, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, multi_output, ensure_min_samples, ensure_min_features, y_numeric, warn_on_dtype, estimator)
    754                     ensure_min_features=ensure_min_features,
    755                     warn_on_dtype=warn_on_dtype,
--> 756                     estimator=estimator)
    757     if multi_output:
    758         y = check_array(y, 'csr', force_all_finite=True, ensure_2d=False,

~\Anaconda3\ana01\lib\site-packages\sklearn\utils\validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
    550                     "Reshape your data either using array.reshape(-1, 1) if "
    551                     "your data has a single feature or array.reshape(1, -1) "
--> 552                     "if it contains a single sample.".format(array))
    553 
    554         # in the future np.flexible dtypes will be handled like object dtypes

ValueError: Expected 2D array, got 1D array instead:
array=[].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

Dataset:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2938 entries, 0 to 2937
Data columns (total 22 columns):
Country                            2938 non-null object
Year                               2938 non-null int64
Status                             2938 non-null object
Life                               2938 non-null float64
Adult Mortality                    2938 non-null float64
infant deaths                      2938 non-null int64
Alcohol                            2938 non-null float64
percentage expenditure             2938 non-null float64
Hepatitis B                        2938 non-null float64
Measles                            2938 non-null int64
BMI                                2938 non-null float64
under-five deaths                  2938 non-null int64
Polio                              2938 non-null float64
Total expenditure                  2938 non-null float64
Diphtheria                         2938 non-null float64
HIV/AIDS                           2938 non-null float64
GDP                                2938 non-null float64
Population                         2938 non-null float64
thinness  1-19 years               2938 non-null float64
thinness 5-9 years                 2938 non-null float64
Income composition of resources    2938 non-null float64
Schooling                          2938 non-null float64
dtypes: float64(16), int64(4), object(2)
memory usage: 505.0+ KB
None

Sample dataset


Follow the procedure given below:

import pandas as pd
from sklearn.linear_model import LinearRegression

df = pd.read_csv("example.csv")

# Separate the features from the target before slicing.
# 'Target_variable' is a placeholder for the name of your target column.
# Note: non-numeric columns (e.g. Country, Status) must be dropped or
# encoded before fitting.
X = df.drop('Target_variable', axis=1)
Y = df['Target_variable']

# Compute the number of samples in the training set
n_train = int(len(df) * 0.7)

# The training set is the first n_train samples in the dataset
X_train = X[:n_train]
Y_train = Y[:n_train]

# The test set is the remaining samples in the dataset
X_test = X[n_train:]
Y_test = Y[n_train:]

# Print the number of samples in the training set
print('The number of samples in the training set:')
print(len(Y_train))

# Print the number of samples in the test set
print('The number of samples in the test set:')
print(len(Y_test))

# X_train is a 2D DataFrame and Y_train a 1D Series, as fit() expects
lr = LinearRegression()
lr.fit(X_train, Y_train)
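
To make the procedure above concrete, here is a minimal sketch that runs without the original CSV. It rests on several assumptions: the small synthetic DataFrame is a hypothetical stand-in for the real file, only a few column names are borrowed from the df.info() listing in the question, and `Life` is assumed to be the target column (the question never says which column is the target). Non-numeric columns are filtered out with `select_dtypes` because `LinearRegression` cannot fit on raw strings.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical stand-in for the real CSV: a few columns named as in df.info().
rng = np.random.default_rng(0)
n = 20
df = pd.DataFrame({
    "Country": ["A"] * n,                  # object column, not usable as-is
    "Year": np.arange(2000, 2000 + n),
    "Schooling": rng.uniform(5, 15, n),
    "Alcohol": rng.uniform(0, 10, n),
})
# Assumed target, built as an exact linear function so the fit is easy to check.
df["Life"] = 40 + 2.0 * df["Schooling"] - 0.5 * df["Alcohol"]

# Separate features and target BEFORE slicing; keep numeric feature columns only.
X = df.drop(columns=["Life"]).select_dtypes(include="number")
y = df["Life"]

# First 70% of the rows for training, the rest for testing.
n_train = int(len(df) * 0.7)
X_train, X_test = X.iloc[:n_train], X.iloc[n_train:]
y_train, y_test = y.iloc[:n_train], y.iloc[n_train:]

# X_train is 2D (rows, features); y_train is 1D (rows,).
print(X_train.shape, y_train.shape)

lr = LinearRegression()
lr.fit(X_train, y_train)
r2 = lr.score(X_test, y_test)
print("R^2 on the test rows:", r2)
```

The key point is that `X` and `y` are derived from `df` first and only then sliced, so the feature matrix stays two-dimensional and never ends up as an empty list.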


Comments:

Please print the shapes of the X_train and y_train arrays.

When I run "np.ma.shape(X_train)" I get "(0,)", and for y_train I get "(2056,)".

Do you have only one feature? Could you add a sample dataset (Excel or CSV) to your question?

I have now added my dataset to my post.

I have added a sample dataset.
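
The shape check discussed in these comments can be illustrated with a short sketch. It assumes, as an illustration only, that the empty `X = []` list from the question is what reached `fit`, which would be consistent with the reported shape and with `array=[]` in the error message; the three-value Series is made up. It also shows the two usual ways to turn a single 1D feature column into the 2D shape scikit-learn expects.

```python
import numpy as np
import pandas as pd

# An empty Python list converts to a 1D array of length 0.
empty = np.asarray([])
print(empty.shape, empty.ndim)        # (0,) 1

# A single feature column selected from a DataFrame is 1D as well.
col = pd.Series([1.0, 2.0, 3.0], name="feature")   # made-up values
print(col.shape)                      # (3,)

# Option 1: reshape the underlying array to (n_samples, 1).
col_2d = col.to_numpy().reshape(-1, 1)

# Option 2: keep the pandas structure but make it a one-column DataFrame.
frame_2d = col.to_frame()

print(col_2d.shape, frame_2d.shape)   # (3, 1) (3, 1)
```

A shape of `(0,)` therefore points to an empty feature container rather than to a single-feature dataset, so reshaping alone would not help in that case; the features have to be selected from the DataFrame first.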