Pandas Python SparseArray dtype to float

pandas: 1.1.2

How do I convert a column with a SparseArray dtype to the float64 dtype?

df
         id  N_ERVisits  N_admission  N_diagnoses  N_hospDays  N_procedures
0      1         0.0          0.0     0.000090         0.0      0.000000
1      1         0.0          0.0     0.000000         0.0      0.000000
2      1         0.0          0.0     0.000000         0.0      0.000000
3      1         0.0          0.0     0.000800         0.0      0.000000
4      1         0.0          0.0     0.000000         0.0      0.000000

df.dtypes
id                         int64
N_ERVisits      Sparse[float64, 0]
N_admission     Sparse[float64, 0]
N_diagnoses     Sparse[float64, 0]
N_hospDays      Sparse[float64, 0]
N_procedures    Sparse[float64, 0]
dtype: object
I thought I could do a standard conversion, but the column stays sparse:

df['N_ERVisits'] = df['N_ERVisits'].astype('float64')
df.dtypes
id                             int64
N_ERVisits      Sparse[float64, 0.0]
N_admission       Sparse[float64, 0]
N_diagnoses       Sparse[float64, 0]
N_hospDays        Sparse[float64, 0]
N_procedures      Sparse[float64, 0]
dtype: object

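If sparsity is no longer needed, call .to_dense() on the column's underlying SparseArray (that is, df[col].values.to_dense()) to convert the series to a dense NumPy array. After that, .astype() works as expected.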

import pandas as pd
import numpy as np

# data
arr = np.zeros((100,))
arr[1] = 1
arr[10] = 10

df = pd.DataFrame(data={
    'id': np.array(range(1, 101)),
    'col1': pd.arrays.SparseArray(arr, fill_value=0)
})
# df["col1"].values.dtype == Sparse[float64, 0]

# sparsity retained (note the difference in fill_value)
df["col2"] = df["col1"].astype(pd.SparseDtype(np.float64))
df["col3"] = df["col1"].astype(np.float64)

# no sparsity
df["col4"] = df["col1"].values.to_dense().astype(np.float64)
print(df.dtypes)
Output:

id                     int64
col1      Sparse[float64, 0]
col2    Sparse[float64, nan]
col3    Sparse[float64, 0.0]
col4                 float64
dtype: object
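
The same idea can be applied to every sparse column at once. The following is a minimal sketch, not part of the original answer: the helper name densify is hypothetical, and it relies on the Series.sparse accessor rather than on .astype(). On newer pandas releases a plain astype(np.float64) may already return a dense column, so the sketch deliberately does not depend on that behavior.

```python
import numpy as np
import pandas as pd


def densify(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of `frame` with every sparse column converted to dense.

    `densify` is a hypothetical helper name used for illustration only.
    """
    out = frame.copy()
    for col, dtype in frame.dtypes.items():
        # Only touch columns whose dtype is a SparseDtype; leave the rest alone.
        if isinstance(dtype, pd.SparseDtype):
            out[col] = frame[col].sparse.to_dense()
    return out


# Same shape of data as in the answer: mostly zeros, two non-zero entries.
arr = np.zeros(100)
arr[1] = 1
arr[10] = 10

df = pd.DataFrame({
    "id": np.arange(1, 101),
    "col1": pd.arrays.SparseArray(arr, fill_value=0),
})

dense = densify(df)
print(dense.dtypes)
```

The dense column keeps the subtype of the sparse one (float64 here), so no further cast is needed unless a different dtype is wanted.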
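This seemingly tricky behavior makes sense once you look at the type of the object underlying the column. You have to call .values explicitly to work with the underlying SparseArray itself.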

type(df["col1"])
Out[5]: pandas.core.series.Series

type(df["col1"].values)
Out[6]: pandas.core.arrays.sparse.array.SparseArray
Note: my pandas version is 1.0.3, but the behavior should be the same.
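
To see what the sparse dtype actually carries, it can help to inspect the column before converting it. The snippet below is a small sketch with made-up values: it reads the fill value and subtype from the SparseDtype and then densifies the underlying array.

```python
import numpy as np
import pandas as pd

# A tiny sparse column: zeros are the fill value and are not stored.
s = pd.Series(pd.arrays.SparseArray([0.0, 1.0, 0.0, 10.0], fill_value=0))

sparse_values = s.values            # the underlying SparseArray, not an ndarray
fill_value = s.dtype.fill_value     # the value that is left implicit
subtype = s.dtype.subtype           # dtype of the values that are stored
density = s.sparse.density          # fraction of entries stored explicitly

dense_values = sparse_values.to_dense()   # plain NumPy array

print(type(sparse_values).__name__, fill_value, subtype, density)
print(type(dense_values).__name__, dense_values.dtype)
```

The dtype string Sparse[float64, 0] is just these two pieces together: the subtype (float64) and the fill value (0).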