Python: Creating a NumPy array using Pandas

python arrays numpy pandas scikit-learn

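I am trying to process some data from a spreadsheet (.xlsx) with scikit. To do that, I read the spreadsheet with Pandas and then convert it to NumPy for use with scikit.

The problem is that when I convert the DataFrame to NumPy, I lose almost all of the data! I think this is because the sheet has no column names, only raw data. Example: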

28.7967 16.0021 2.6449 0.3918 0.1982

31.6036 11.7235 2.5185 0.5303 0.3773

162.052 136.031 4.0612 0.0374 0.0187

My code so far:

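import numpy as np
import pandas as pd

def split_data():
    # sheet_name / skipfooter are the current spellings of sheetname / skip_footer
    test_data = pd.read_excel('magic04.xlsx', sheet_name=0, skipfooter=16020)
    # the code below prints the data correctly
    print(test_data.iloc[:, 0:10])

    # none of the code below works as expected
    test1 = np.array(test_data.iloc[:, 0:10])
    test2 = test_data.to_numpy()  # as_matrix() in older pandas (removed in 1.0)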

I am really lost here. Any help is very welcome.

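I suggest using header=None in read_excel. Without it, the first row of the sheet is taken as the column names instead of as data. See below: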

df = pd.read_excel('stuff.xlsx')
>> df
    28.7967   16.0021  2.6449  0.3918  0.1982
0   31.6036   11.7235  2.5185  0.5303  0.3773
1  162.0520  136.0310  4.0612  0.0374  0.0187

>> df.ix[:, 1:2]   # .ix is the old label-based indexer (removed in pandas 1.0)

0
1

Versus:

df = pd.read_excel('stuff.xlsx', header=None)
>> df
          0         1       2       3       4
0   28.7967   16.0021  2.6449  0.3918  0.1982
1   31.6036   11.7235  2.5185  0.5303  0.3773
2  162.0520  136.0310  4.0612  0.0374  0.0187

>> df.ix[:, 1:2]   # .ix was removed in pandas 1.0; df.loc[:, 1:2] is the equivalent here
          1       2
0   16.0021  2.6449
1   11.7235  2.5185
2  136.0310  4.0612
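
The effect of header=None can be reproduced without a spreadsheet. The sketch below is only an illustration under one assumption: it uses pd.read_csv on an in-memory buffer as a stand-in for read_excel, since both take a header argument and by default treat the first row as column names. The names RAW and load are made up for this example.

```python
import io

import numpy as np
import pandas as pd

# The three sample rows from the question, as whitespace-separated text.
RAW = """28.7967 16.0021 2.6449 0.3918 0.1982
31.6036 11.7235 2.5185 0.5303 0.3773
162.052 136.031 4.0612 0.0374 0.0187
"""


def load(header):
    # Assumption: read_csv on a text buffer stands in for read_excel here.
    return pd.read_csv(io.StringIO(RAW), sep=r"\s+", header=header)


with_default = load("infer")   # first data row is consumed as column names
without_header = load(None)    # every row is kept as data

lost = np.asarray(with_default)
full = np.asarray(without_header)

print(lost.shape)  # (2, 5): the first row is missing from the array
print(full.shape)  # (3, 5): all rows survive the conversion
```

With the default header handling, the first row ends up in the column labels, so it never reaches the NumPy array.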

It worked!!! It works both ways: with the attribute ".iloc[:, 0:X]" and with the method "as_matrix()"! Thank you so much!
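
For readers on a current pandas, where as_matrix() no longer exists (it was removed in pandas 1.0 in favour of to_numpy()), the two routes mentioned above might look roughly like this. This is a sketch, not the asker's actual code: the DataFrame is built by hand to mimic the result of read_excel(..., header=None), and n_features is a hypothetical value for X.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the frame returned by read_excel(..., header=None).
df = pd.DataFrame([
    [28.7967, 16.0021, 2.6449, 0.3918, 0.1982],
    [31.6036, 11.7235, 2.5185, 0.5303, 0.3773],
    [162.052, 136.031, 4.0612, 0.0374, 0.0187],
])

n_features = 3  # hypothetical value for X in .iloc[:, 0:X]

# Route 1: slice by position in pandas, then convert to an array.
via_iloc = np.array(df.iloc[:, 0:n_features])

# Route 2: convert the whole frame, then slice the array.
via_to_numpy = df.to_numpy()[:, 0:n_features]

print(np.array_equal(via_iloc, via_to_numpy))
```

Either array can then be handed to scikit as the feature matrix.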