Python: error when calling head() on a sparse DataFrame

Tags: python, python-2.7, pandas, numpy, dummy-variable

I am trying to look at the head of a sparse DataFrame created with get_dummies:

import numpy as np
import pandas as pd

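# f_loc is the path to the space-separated input file (not shown in the post)
df = pd.read_csv(f_loc, sep = " ")
print df.head()
# one-hot encode into a sparse frame; the error is raised by the head() call below
data = pd.get_dummies(df,sparse=True)
print data.head()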
This gives me an error:

File "/usr/local/lib/python2.7/dist-packages/pandas/core/internals.py", line 1703, in __init__
    raise TypeError("values must be {0}".format(self._holder.__name__))
In the latest version of pandas (0.21.0) this works fine, but some alternatives to head() should also work:

np.random.seed(1997)

df = pd.DataFrame(np.random.randn(100, 4))
df.iloc[:-97] = np.nan
data = df.to_sparse()

print (type(data))
<class 'pandas.core.sparse.frame.SparseDataFrame'>

print (data.head())
          0        1         2         3
0       NaN      NaN       NaN       NaN
1       NaN      NaN       NaN       NaN
2       NaN      NaN       NaN       NaN
3  1.938422  1.78731 -0.619745 -2.560187
4 -0.986231 -1.94293  2.677379 -1.813071

print (data.iloc[:5])
          0        1         2         3
0       NaN      NaN       NaN       NaN
1       NaN      NaN       NaN       NaN
2       NaN      NaN       NaN       NaN
3  1.938422  1.78731 -0.619745 -2.560187
4 -0.986231 -1.94293  2.677379 -1.813071

print (data[:5])
          0        1         2         3
0       NaN      NaN       NaN       NaN
1       NaN      NaN       NaN       NaN
2       NaN      NaN       NaN       NaN
3  1.938422  1.78731 -0.619745 -2.560187
4 -0.986231 -1.94293  2.677379 -1.813071
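
The example above relies on to_sparse() and SparseDataFrame, which were deprecated in pandas 0.25 and removed in pandas 1.0. For readers on a newer release, the same check can be sketched as follows. This is a minimal sketch assuming pandas 1.0 or later; casting to pd.SparseDtype stands in for to_sparse() here and is not part of the original answer.

```python
import numpy as np
import pandas as pd

np.random.seed(1997)

df = pd.DataFrame(np.random.randn(100, 4))
df.iloc[:-97] = np.nan  # same setup as above: blank out the first 3 rows

# Assumption: pandas >= 1.0, where to_sparse() no longer exists.
# Casting every column to a SparseDtype with NaN as the fill value
# gives an ordinary DataFrame whose columns are sparse-backed.
data = df.astype(pd.SparseDtype("float", np.nan))

print(data.dtypes)

# The three ways of taking the first five rows used in the answer.
print(data.head())
print(data.iloc[:5])
print(data[:5])
```

The result is a regular DataFrame rather than a SparseDataFrame, so type(data) will differ from the output shown above.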

I cannot reproduce this. What does print type(data) give?
It is, @jezrael
Hmm, and data[:5]?
@jezrael [5 rows x 11067 columns]
Maybe the problem is insufficient memory, which would explain the weird error :(
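
The last comment only guesses that memory is involved. One way to examine that guess is to compare the footprint of dense and sparse dummy frames. The sketch below is not a diagnosis of the asker's error: it assumes pandas 1.0 or later, and the column name "item" and the randomly generated data are hypothetical stand-ins for the asker's file.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one categorical column with many distinct values,
# so that get_dummies produces a wide, mostly-zero frame.
rng = np.random.RandomState(0)
df = pd.DataFrame({"item": rng.randint(0, 500, size=2000).astype(str)})

dense = pd.get_dummies(df, sparse=False)
sparse = pd.get_dummies(df, sparse=True)

# Total bytes used by each frame, including the index.
dense_bytes = dense.memory_usage(deep=True).sum()
sparse_bytes = sparse.memory_usage(deep=True).sum()

print(sparse.shape)
print("dense bytes: ", dense_bytes)
print("sparse bytes:", sparse_bytes)

# Fraction of cells that are actually stored (not the fill value).
print("density:", sparse.sparse.density)

# Peek at a corner of the wide frame instead of printing every column.
print(sparse.iloc[:5, :8])
```

If the sparse frame is already small, memory is less likely to be the cause and the pandas version becomes the more plausible suspect, as the answer suggests.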