TypeError when mapping a DataFrame column with a Python dict

python pandas dictionary dataframe

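I am trying to convert one column of a Pandas DataFrame to int values using a dict as the mapping (given a DataFrame `my_dataframe` and a column name `target_column`).

I am wondering why, with Pandas on Python 3.6,

(A) indexing the dict with the whole column, `map_to_int[my_dataframe[target_column]]`,

raises

`TypeError: 'Series' objects are mutable, thus they cannot be hashed`

whereas

(B) passing the same dict to `replace` on that column

works fine.

I would like to understand why this happens. Is there some magic in `replace` that keeps the TypeError from being raised, or am I missing something? I already know that dict keys are not allowed to change. But I still struggle to really understand this, because: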

    words = my_dataframe[target_column].unique()
    # words = ['car' 'bike' 'plain']

    foo = 'car'
    map_to_int[foo] = 0
    foo = 'bike'
    map_to_int["bike"] = 1

Any attempt to help me understand why (B) works without the trouble of (A) would be greatly appreciated.

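Apparently, `my_dataframe[target_column]` is a pandas `Series`, which is mutable and therefore not hashable in Python (3.6). Using an unhashable object as a dict key raises the TypeError mentioned above. So indexing a dictionary such as `map_to_int` with it raises the error.

In version (B), the dictionary `map_to_int` is still used, but it is never indexed with the Series itself. Its keys are the immutable values found in `target_column`. When `replace()` uses the dictionary, it works with those immutable keys. There is therefore no reason for a TypeError, which matches what was observed.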
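The difference between (A) and (B) can be reproduced in a few lines. This is a minimal sketch: the DataFrame contents, the column name `vehicle` and the dict values are made-up stand-ins for the question's objects, and the exact wording of the error message differs between pandas versions, so the sketch only checks that a TypeError occurs.

```python
import pandas as pd

# Hypothetical stand-ins for the objects in the question.
my_dataframe = pd.DataFrame({"vehicle": ["car", "bike", "plain", "car"]}, dtype=object)
target_column = "vehicle"
map_to_int = {"car": 0, "bike": 1, "plain": 2}

column = my_dataframe[target_column]

# (A) A dict lookup hashes the key first, and a Series refuses to be hashed.
try:
    map_to_int[column]
    error_a = None
except TypeError as exc:
    error_a = str(exc)

# (B) replace() receives the dict as an argument; the Series is never used
# as a key, only its individual string values are matched against the keys.
result_b = column.replace(map_to_int).tolist()

print("A raised:", error_a)
print("B gave:", result_b)
```

In (A) the Series is the key; in (B) the Series is the object the method is called on, and the dict is only read.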

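Your solution (A) does not work because with `map_to_int[my_dataframe[target_column]]` you are trying to use a `pd.Series` object as a dictionary key.

In addition, I suggest using `replace` only in specific cases; for a dictionary mapping you should usually use `map`, i.e. `my_dataframe[target_column].map(map_to_int)`.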
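One practical difference between the two methods is how they treat values that have no entry in the dict. The sketch below uses an invented Series and an intentionally incomplete mapping to show it; the names are illustrative only.

```python
import pandas as pd

# Invented data: "boat" deliberately has no entry in the mapping.
column = pd.Series(["car", "bike", "boat"], dtype=object)
map_to_int = {"car": 0, "bike": 1}

mapped = column.map(map_to_int)        # values without an entry become NaN
replaced = column.replace(map_to_int)  # values without an entry are left as they are

print(mapped.tolist())
print(replaced.tolist())
```

With `map`, an incomplete mapping shows up as missing values, which is often what you want when every value is supposed to be translated.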

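However, this functionality is already implemented in Pandas as categorical data. I recommend using categorical data as an efficient and syntactically clean way to map the items in a series to integers.

Here is an example: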

    import pandas as pd

    df = pd.DataFrame({'col1': ['a', 'b', 'c', 'a', 'b', 'a', 'd']})

    # Convert to a categorical and keep only the integer code of each value
    df['col1'] = df['col1'].astype('category').cat.codes

    print(df)

       col1
    0     0
    1     1
    2     2
    3     0
    4     1
    5     0
    6     3
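If the label-to-integer dictionary itself is still needed, it can be read back from the categorical. This is a small sketch on the same example data; the names `label_to_code` and `code_to_label` are introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['a', 'b', 'c', 'a', 'b', 'a', 'd']})

cat = df['col1'].astype('category')

# The position of each label in `categories` is the integer code it receives.
code_to_label = dict(enumerate(cat.cat.categories))
label_to_code = {label: code for code, label in code_to_label.items()}

codes = cat.cat.codes

print(label_to_code)
print(codes.tolist())
```

Note that missing values are not part of the categories and receive the code -1.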

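I found an explanation for the part about strings that confused me: the example mapping with `foo` obviously works because the strings 'car' and 'bike' behind the label `foo` are immutable, even though the label `foo` itself can point to different immutable targets.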
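The point about names and immutable strings can be checked in plain Python. This is a minimal sketch; the list key at the end is an added contrast that is not part of the original question.

```python
map_to_int = {}

foo = 'car'
map_to_int[foo] = 0   # the key stored in the dict is the string 'car', not the name foo

foo = 'bike'          # rebinding the name does not touch the key already stored
map_to_int[foo] = 1

# Added contrast: a mutable container such as a list cannot be used as a key.
try:
    map_to_int[['car', 'bike']] = 2
    list_error = None
except TypeError as exc:
    list_error = str(exc)

print(map_to_int)
print(list_error)
```

The dict ends up with the two string keys, and the attempt with the list fails for the same underlying reason as the Series in (A): the key cannot be hashed.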

For reference, the statement used in version (B) of the question:

    my_dataframe['Integer-Column'] = my_dataframe[target_column].replace(map_to_int)
