Python pandas: creating a column of dtype object, or a factor


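In pandas, how do I convert a column of a DataFrame to dtype object? Or better yet, into a factor? (For those who speak R: in Python, how do I as.factor()?)

Also, what is the difference between pandas.Factor and pandas.Categorical?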

You can use the astype method to cast a single Series (one column) or the whole DataFrame:

df = df.astype(object)

Update: to convert a Series/column to the categorical dtype:

df['col_name'] = df['col_name'].astype('category')
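As a minimal sketch of the two casts above, the snippet below applies astype(object) and astype('category') to single columns and inspects the result. The frame and the column names a_obj and b_cat are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

# Illustrative data; the new column names are hypothetical.
df = pd.DataFrame({"a": [1, 2, 3, 4, 5],
                   "b": ["yes", "no", "yes", "no", "absent"]})

# Cast one integer column to the generic object dtype.
df["a_obj"] = df["a"].astype(object)

# Cast one string column to the categorical dtype (pandas >= 0.15).
df["b_cat"] = df["b"].astype("category")

a_dtype = df["a_obj"].dtype                     # object
b_dtype_name = str(df["b_cat"].dtype)           # "category"
categories = list(df["b_cat"].cat.categories)   # distinct values, sorted

print(df.dtypes)
print(categories)
```

The .cat accessor is only available on categorical Series; it exposes the distinct categories, which is roughly what levels() gives you for a factor in R.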

Note: pd.Factor has been deprecated and removed in favor of pd.Categorical.

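Factor and Categorical are the same, as far as I know. I think it was initially called Factor and was later renamed to Categorical. To convert to a categorical, you can use pandas.Categorical.from_array (available in older pandas versions), as follows: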

In [27]: df = pd.DataFrame({'a' : [1, 2, 3, 4, 5], 'b' : ['yes', 'no', 'yes', 'no', 'absent']})

In [28]: df
Out[28]: 
   a       b
0  1     yes
1  2      no
2  3     yes
3  4      no
4  5  absent

In [29]: df['c'] = pd.Categorical.from_array(df.b).labels

In [30]: df
Out[30]: 
   a       b  c
0  1     yes  2
1  2      no  1
2  3     yes  2
3  4      no  1
4  5  absent  0

You can also use the pd.factorize function:

# use the df data from @herrfz

In [150]: pd.factorize(df.b)
Out[150]: (array([0, 1, 0, 1, 2]), array(['yes', 'no', 'absent'], dtype=object))
In [152]: df['c'] = pd.factorize(df.b)[0]

In [153]: df
Out[153]: 
   a       b  c
0  1     yes  0
1  2      no  1
2  3     yes  0
3  4      no  1
4  5  absent  2
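To make the return value of pd.factorize more concrete, here is a small sketch that unpacks the codes and the unique values, rebuilds the original column from them, and uses the sort=True option to get codes that follow sorted order instead of order of first appearance. The variable names are illustrative.

```python
import pandas as pd

b = pd.Series(["yes", "no", "yes", "no", "absent"])

# factorize returns (integer codes, unique values in order of appearance).
codes, uniques = pd.factorize(b)

# Indexing the uniques with the codes recovers the original values.
restored = list(uniques[codes])

# With sort=True the uniques are sorted first, so the codes change.
codes_sorted, uniques_sorted = pd.factorize(b, sort=True)

print(codes.tolist(), list(uniques))
print(codes_sorted.tolist(), list(uniques_sorted))
```

With sort=True the codes agree with the ones a Categorical assigns by default, since both number the distinct values in sorted order.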

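Thanks a lot, this had become a real headache.

When I try this I get "TypeError: data type not understood". I tried both data['engagement'] = data['engagement'].astype(data) and data = data.astype(data). My column is non-null float64.

You need to use object: data['engagement'].astype(object) ... but if the values are already floats, why would you want to change them to object?

Note: when the original answer was written, creating a Categorical and then assigning it to a column converted that column to object (or another dtype), because until pandas 0.15 you could not have categorical columns/Series.

Note that the Categorical.from_array(...).labels usage above is deprecated and should be written as follows:

pd.Categorical(df.b).codes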
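For completeness, a short sketch of the non-deprecated form applied to the same example frame used earlier in the thread. It should produce the same integer column c as the older from_array(...).labels session.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3, 4, 5],
                   "b": ["yes", "no", "yes", "no", "absent"]})

# Build a Categorical from the column; categories are sorted by default.
cat = pd.Categorical(df["b"])

# .codes holds one integer per row, indexing into cat.categories.
df["c"] = cat.codes

print(list(cat.categories))
print(df)
```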
