Why are some attributes lost after a Python object is pickled and reloaded?


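After dumping an instance to a pickle file and loading it back, I find that some of the instance's attributes are missing. Can anyone help explain why? Thanks.

Here is a concrete example: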

File/directory hierarchy:
-test
 -test_module
  -__init__.py
  -myDataFrameMapper.py
  -mySklearn.py
 -main.py

__init__.py:

from .mySklearn import mySklearn

mySklearn.py

import sklearn_pandas as sk_pd
from .myDataFrameMapper import myDataFrameMapper

class mySklearn:
    @staticmethod
    def initialize():
        # Monkey-patch DataFrameMapper with an extra method named myTransform.
        sk_pd.DataFrameMapper.myTransform = myDataFrameMapper.transform()

myDataFrameMapper.py

import numpy as np
from sklearn_pandas import DataFrameMapper

class myDataFrameMapper:

    @staticmethod
    def transform():
        # Return a function that is later attached to DataFrameMapper as a method.
        def closure(self, df, **kwargs):
            self.addedKey = 'addedValue'  # a new attribute is added here
        return closure

main.py

import pandas as pd
import pickle
import random
from sklearn_pandas import DataFrameMapper
from sklearn.preprocessing import StandardScaler, LabelEncoder

from test_module import mySklearn

mySklearn.initialize()

data = {'pet':["cat", "dog", "dog", "fish", "cat", "dog", "cat", "fish"],
        'children':[4., 6, 3, 3, 2, 3, 5, 4],
        'salary':[90, 24, 44, 27, 32, 59, 36, 27]}

df = pd.DataFrame(data)

column_tuples = [
   ('pet', LabelEncoder()),
   ('children', LabelEncoder()),
   ('salary', LabelEncoder())
]

mapper = DataFrameMapper(column_tuples, input_df=True)
mapper.fit(data)

print('original attributes in mapper:')
print(mapper.__dict__)

mapper.myTransform(df.iloc[[1]])

print('\nafter adding a new attributes \'addedKey\':')
print(mapper.__dict__)

print('\ndump the mapper into a pickle file...')
with open('mapper.pkl', 'wb') as picklefile:
    pickle.dump(mapper, picklefile)

print('\nload the mapper from the pickle file...')
with open('mapper.pkl', 'rb') as picklefile:
    mapper1 = pickle.load(picklefile)

print('\nafter being loaded, the attributes in the mapper are:')
print(mapper1.__dict__)

Running

python3 main.py

produces the following output:

original attributes in mapper:
{'built_default': False, 'sparse': False, 'input_df': True, 'df_out': False, 'features': [('pet', LabelEncoder()), ('children', LabelEncoder()), ('salary', LabelEncoder())], 'default': False, 'built_features': [('pet', LabelEncoder(), {}), ('children', LabelEncoder(), {}), ('salary', LabelEncoder(), {})], 'transformed_names_': []}

after adding a new attributes 'addedKey':
{'built_default': False, 'addedKey': 'addedValue', 'sparse': False, 'input_df': True, 'df_out': False, 'features': [('pet', LabelEncoder()), ('children', LabelEncoder()), ('salary', LabelEncoder())], 'default': False, 'built_features': [('pet', LabelEncoder(), {}), ('children', LabelEncoder(), {}), ('salary', LabelEncoder(), {})], 'transformed_names_': []}

dump the mapper into a pickler file:

load the mapper from the pickle file:

after being loaded, the attributes in the mapper are:
{'built_default': False, 'sparse': False, 'input_df': True, 'df_out': False, 'features': [('pet', LabelEncoder(), {}), ('children', LabelEncoder(), {}), ('salary', LabelEncoder(), {})], 'default': False, 'built_features': [('pet', LabelEncoder(), {}), ('children', LabelEncoder(), {}), ('salary', LabelEncoder(), {})], 'transformed_names_': []}

We can see that when the mapper is loaded back from the pickle file, the attribute

'addedKey': 'addedValue'

is lost.

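sklearn_pandas.DataFrameMapper has a custom __setstate__ method that tries to keep pickles compatible with those created by older versions. (This refers to the method as of version 1.8.0.) This __setstate__ is responsible for restoring the state of the unpickled instance, and it completely ignores your added attribute.

Customized pickle implementations are one of the reasons why adding your own attributes to other people's classes is usually a bad idea.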
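To make the mechanism concrete, here is a minimal sketch that uses only the standard library. The Mapper class below is a hypothetical stand-in, not the actual sklearn_pandas code; it only assumes a __setstate__ that restores the attributes it knows about and nothing else.

```python
import pickle


class Mapper:
    """Hypothetical stand-in for a class with a custom __setstate__."""

    def __init__(self, features):
        self.features = features
        self.sparse = False

    def __setstate__(self, state):
        # Restore only the attributes this class knows about;
        # any other key in `state` is silently dropped.
        self.features = state["features"]
        self.sparse = state.get("sparse", False)


mapper = Mapper(["pet", "children"])
mapper.addedKey = "addedValue"  # attribute added from outside the class

# The pickle itself contains addedKey; it is lost while loading.
restored = pickle.loads(pickle.dumps(mapper))

print(sorted(mapper.__dict__))    # ['addedKey', 'features', 'sparse']
print(sorted(restored.__dict__))  # ['features', 'sparse']
```

Note that the extra attribute is written to the pickle; it disappears only because __setstate__ never copies it onto the new instance.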

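Can you provide a fully reproducible example? When I create my own Mapper and add key3 at runtime, I get the expected result (after unpickling I see all three keys). I suspect you are using some kind of ORM framework. You need to say which one, and which classes are involved. Basically, we need a working example we can run.

Sorry for the confusion. I have edited my question to include a concrete working example. Please take another look and let me know if you have any clue.

@ShadowRanger Thanks for the suggestion. I have updated the question with a working example; please see whether you can reproduce the problem. Thanks.

__setstate__ takes values from state and applies them to attributes. Can you point out where the value of state is set?

@QiZhang: The data ultimately comes from the __dict__ of the original pickled instance, although that is customizable too.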
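As a rough illustration of that last point, the sketch below shows where state comes from and one possible way to carry extra attributes through. Base and KeepsExtras are hypothetical names, and Base only imitates a third-party class whose __setstate__ drops unknown keys; whether subclassing is practical for a real DataFrameMapper is an assumption to verify against the installed version.

```python
import pickle


class Base:
    """Hypothetical third-party class whose __setstate__ drops unknown keys."""

    def __init__(self):
        self.features = ["pet"]

    def __setstate__(self, state):
        self.features = state["features"]


class KeepsExtras(Base):
    """Hypothetical subclass that carries extra attributes through pickling."""

    def __getstate__(self):
        # Pickle would use self.__dict__ by default; returning a copy
        # explicitly shows where `state` originates.
        return dict(self.__dict__)

    def __setstate__(self, state):
        super().__setstate__(state)  # let the base class restore what it knows
        for key, value in state.items():
            # Then restore whatever the base class skipped.
            self.__dict__.setdefault(key, value)


obj = KeepsExtras()
obj.addedKey = "addedValue"

restored = pickle.loads(pickle.dumps(obj))
print(restored.__dict__)  # {'features': ['pet'], 'addedKey': 'addedValue'}
```

Keeping the extra data in a subclass you control, rather than patching attributes onto someone else's class, also avoids depending on how that class chooses to implement pickling.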