Python pd.merge gives error: 'DataFrame' objects are mutable, thus they cannot be hashed


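I have a DataFrame, dfCM, which was created from another DataFrame, dfdict[dfCM], and then processed as follows:

  • Removed unwanted rows
  • Removed unwanted columns
  • Added new columns

I now need to add the deleted columns from dfdict[dfCM] back to dfCM. Note that dfdict[dfCM] is stored in a dictionary of DataFrames.

I have run similar merge commands many times before in my code, but now I get an error: 'DataFrame' objects are mutable, thus they cannot be hashed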

    #add back deleted dfCM columns 
    dfCM = pd.merge(dfCM, dfdict[dfCM], on=['ClaimID'], how = 'left', suffixes = ('', '_cm')) 
    #remove duplicate columns
    dfCM.filter(like='_cm',axis=1)
    
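This is what dfCM looks like (there are more rows and columns):

A screenshot of dfdict is below:

This is what dfdict[dfCM] looks like (there are more rows and columns):

I can get the merge to work by renaming all of the columns in dfdict[dfCM], as shown below. But this is not ideal, because I can no longer tell the duplicate columns added to dfCM apart from the unique ones, so I cannot remove the duplicates.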

        #add back deleted dfCM columns
        dfdict['dfCM'] = dfdict['dfCM'].add_suffix('_cm') #identified columns from dfCL
        dfCM = pd.merge(dfCM, dfdict['dfCM'], left_on='ClaimID', right_on='ClaimID_cm', how = 'left', suffixes = ('', '_cm'))
    

Is there a better way to solve this? Thanks.

You need to explain how you created dfdict, because you are trying to use a DataFrame as a dictionary key, which you can't do:

    import pandas as pd
    df1 = pd.DataFrame()
    df2 = pd.DataFrame()
    dfdict = {df1: 1, df2: 2}
    Traceback (most recent call last):
      File "/Users/dgolding/PycharmProjects/team-general-wikis/venv/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 3326, in run_code
        exec(code_obj, self.user_global_ns, self.user_ns)
      File "<ipython-input-5-3207e8fd0e73>", line 1, in <module>
        {df1: 1, df2: 2}
      File "/Users/dgolding/PycharmProjects/team-general-wikis/venv/lib/python3.6/site-packages/pandas/core/generic.py", line 1887, in __hash__
        " hashed".format(self.__class__.__name__)
    TypeError: 'DataFrame' objects are mutable, thus they cannot be hashed
    
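Maybe your dictionary keys are actually strings holding the DataFrame variable names? In that case, you would get this error when you try to use the DataFrame itself as the key to fetch a value: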

    dfdict = {"df1": df1, "df2": df2}
    dfdict[df1]
    Traceback (most recent call last):
      File "/Users/dgolding/PycharmProjects/team-general-wikis/venv/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 3326, in run_code
        exec(code_obj, self.user_global_ns, self.user_ns)
      File "<ipython-input-7-825e4ae2577b>", line 1, in <module>
        dfdict[df1]
      File "/Users/dgolding/PycharmProjects/team-general-wikis/venv/lib/python3.6/site-packages/pandas/core/generic.py", line 1887, in __hash__
        " hashed".format(self.__class__.__name__)
    TypeError: 'DataFrame' objects are mutable, thus they cannot be hashed
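
A quick way to check which case applies is to look at the key types in the dictionary. The sketch below reuses the empty `df1` and `df2` frames from above; `key_types` and `error_message` are names made up for illustration. The exact wording of the `TypeError` may differ between pandas versions, so it is printed rather than assumed.

```python
import pandas as pd

df1 = pd.DataFrame()
df2 = pd.DataFrame()
dfdict = {"df1": df1, "df2": df2}

# Inspect the keys: here they are plain strings, not DataFrames.
key_types = {key: type(key).__name__ for key in dfdict}
print(key_types)

# Indexing with the DataFrame object fails, because the lookup
# has to hash the key and DataFrames cannot be hashed.
error_message = None
try:
    dfdict[df1]
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# Indexing with the string key returns the stored DataFrame.
assert dfdict["df1"] is df1
```

If the keys turn out to be strings, any lookup has to use the quoted name.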
    
Perhaps you were trying to do this:

    dfdict["dfCM"]
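
Applied to the merge from the question, that could look like the sketch below. The sample data (`original` and its `Amount`, `Status` and `AmountDoubled` columns) is invented for illustration; only `ClaimID`, `dfCM`, `dfdict` and the `_cm` suffix come from the question.

```python
import pandas as pd

# Hypothetical stand-in for the original frame stored in the dictionary.
original = pd.DataFrame({
    "ClaimID": [1, 2, 3],
    "Amount": [100, 200, 300],
    "Status": ["open", "closed", "open"],
})
dfdict = {"dfCM": original}  # keys are strings, values are DataFrames

# dfCM after processing: one row and the Status column removed, one column added.
dfCM = original.loc[original["ClaimID"] != 3, ["ClaimID", "Amount"]].copy()
dfCM["AmountDoubled"] = dfCM["Amount"] * 2

# Look up the stored frame by its string key, not by the DataFrame itself.
merged = pd.merge(dfCM, dfdict["dfCM"], on="ClaimID", how="left", suffixes=("", "_cm"))

# Drop the duplicated columns, i.e. the ones that received the "_cm" suffix.
merged = merged.drop(columns=merged.filter(like="_cm", axis=1).columns)
print(merged)
```

Note that `filter(like="_cm")` on its own only selects columns and returns a new frame; passing its column labels to `drop` is what actually removes them. It also matches any column whose name merely contains `_cm`, so `filter(regex="_cm$")` may be safer if such names exist.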

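Is dfdict a dictionary? If so, and dfCM is a DataFrame, then you can't use it as a dictionary key because it is mutable, as the error tells you.

I tried creating a new df, a = dfdict[dfCM], and then using a in place of dfdict[dfCM], but got the same error. Do you have any suggestions on how to fix this?

Is it a = dfdict[dfCM] that gives you the error?

Oh my gosh! Yes, I missed the quotes! What a silly mistake! Thanks for your help, Dan.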
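
On the original question of a better way to avoid duplicate columns: one option is to merge only the columns that dfCM is missing, so no suffixes are needed at all. This is a minimal sketch with invented sample data (`original`, `Amount`, `Status`, `AmountDoubled`), and it assumes `ClaimID` is unique in the stored frame; otherwise the left merge would duplicate rows.

```python
import pandas as pd

# Hypothetical stand-in for the original frame stored in the dictionary.
original = pd.DataFrame({
    "ClaimID": [1, 2, 3],
    "Amount": [100, 200, 300],
    "Status": ["open", "closed", "open"],
})
dfdict = {"dfCM": original}

# dfCM after processing: one row and the Status column removed, one column added.
dfCM = original.loc[original["ClaimID"] != 3, ["ClaimID", "Amount"]].copy()
dfCM["AmountDoubled"] = dfCM["Amount"] * 2

source = dfdict["dfCM"]

# Columns present in the stored frame but no longer in dfCM.
missing = source.columns.difference(dfCM.columns)

# Merge only the key plus the missing columns, so nothing is duplicated.
restored = dfCM.merge(source[["ClaimID", *missing]], on="ClaimID", how="left")
print(restored)
```

Because only the missing columns are brought across, there is nothing to clean up after the merge.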