Why are some attributes lost after a Python object is pickled and reloaded?
After dumping an instance to a pickle file and loading it back, I ran into a problem: some of the instance's attributes are missing. Can anyone help explain this? Thanks.

Here is a concrete example:
File/directory hierarchy:
-test
  -test_module
    -__init__.py
    -myDataFrameMapper.py
    -mySklearn.py
  -main.py
__init__.py:
from .mySklearn import mySklearn
mySklearn.py:
import sklearn_pandas as sk_pd
from .myDataFrameMapper import myDataFrameMapper

class mySklearn:
    @staticmethod
    def initialize():
        # Monkey-patch DataFrameMapper with an extra method, myTransform
        sk_pd.DataFrameMapper.myTransform = myDataFrameMapper.transform()
myDataFrameMapper.py:
import numpy as np
from sklearn_pandas import DataFrameMapper

class myDataFrameMapper:
    @staticmethod
    def transform():
        # Return a function that will be attached to DataFrameMapper as a method
        def closure(self, df, **kwargs):
            self.addedKey = 'addedValue'  # a new attribute is added here
        return closure
main.py
import pandas as pd
import pickle
import random
from sklearn_pandas import DataFrameMapper
from sklearn.preprocessing import StandardScaler, LabelEncoder
from test_module import mySklearn
mySklearn.initialize()
data = {'pet':["cat", "dog", "dog", "fish", "cat", "dog", "cat", "fish"],
'children':[4., 6, 3, 3, 2, 3, 5, 4],
'salary':[90, 24, 44, 27, 32, 59, 36, 27]}
df = pd.DataFrame(data)
column_tuples = [
('pet', LabelEncoder()),
('children', LabelEncoder()),
('salary', LabelEncoder())
]
mapper = DataFrameMapper(column_tuples, input_df=True)
mapper.fit(data)
print('original attributes in mapper:')
print(mapper.__dict__)
mapper.myTransform(df.iloc[[1]])
print('\nafter adding a new attributes \'addedKey\':')
print(mapper.__dict__)
print('\ndump the mapper into a pickle file...')
with open('mapper.pkl', 'wb') as picklefile:
    pickle.dump(mapper, picklefile)
print('\nload the mapper from the pickle file...')
with open('mapper.pkl', 'rb') as picklefile:
    mapper1 = pickle.load(picklefile)
print('\nafter being loaded, the attributes in the mapper are:')
print(mapper1.__dict__)
After running python3 main.py, we observe the following output:
original attributes in mapper:
{'built_default': False, 'sparse': False, 'input_df': True, 'df_out': False, 'features': [('pet', LabelEncoder()), ('children', LabelEncoder()), ('salary', LabelEncoder())], 'default': False, 'built_features': [('pet', LabelEncoder(), {}), ('children', LabelEncoder(), {}), ('salary', LabelEncoder(), {})], 'transformed_names_': []}
after adding a new attributes 'addedKey':
{'built_default': False, 'addedKey': 'addedValue', 'sparse': False, 'input_df': True, 'df_out': False, 'features': [('pet', LabelEncoder()), ('children', LabelEncoder()), ('salary', LabelEncoder())], 'default': False, 'built_features': [('pet', LabelEncoder(), {}), ('children', LabelEncoder(), {}), ('salary', LabelEncoder(), {})], 'transformed_names_': []}
dump the mapper into a pickle file...
load the mapper from the pickle file...
after being loaded, the attributes in the mapper are:
{'built_default': False, 'sparse': False, 'input_df': True, 'df_out': False, 'features': [('pet', LabelEncoder(), {}), ('children', LabelEncoder(), {}), ('salary', LabelEncoder(), {})], 'default': False, 'built_features': [('pet', LabelEncoder(), {}), ('children', LabelEncoder(), {}), ('salary', LabelEncoder(), {})], 'transformed_names_': []}
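We can see that when the mapper is loaded back from the pickle file, the attribute
'addedKey': 'addedValue'
is lost.

sklearn_pandas.DataFrameMapper has a custom __setstate__ method that tries to keep pickles compatible with pickles created by older versions. (This refers to the 1.8.0 version of the method.) This __setstate__ is responsible for restoring the state of an unpickled instance, and it completely ignores the added attribute.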
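The effect can be reproduced without sklearn_pandas. The following is a minimal sketch, not the library's actual code: Mapper is a hypothetical stand-in class whose __setstate__ restores only the attributes it knows about, which is the behavior described above.

```python
import pickle


class Mapper:
    """Hypothetical stand-in for a class with a restrictive __setstate__."""

    def __init__(self, features):
        self.features = features
        self.sparse = False

    def __setstate__(self, state):
        # Restore only the attributes this class knows about;
        # any other key found in `state` is silently dropped.
        self.features = state["features"]
        self.sparse = state.get("sparse", False)


mapper = Mapper(["pet", "children"])
mapper.addedKey = "addedValue"  # attribute added after construction

# Round-trip through pickle; loading calls Mapper.__setstate__
restored = pickle.loads(pickle.dumps(mapper))

print(mapper.__dict__)                 # still contains addedKey
print(restored.__dict__)               # only the known attributes
print(hasattr(restored, "addedKey"))
```

The pickle itself still contains addedKey, because the state that gets dumped is the instance's full __dict__; the attribute is discarded only at load time, when __setstate__ chooses what to copy back.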
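Customized pickle implementations are one of the reasons why trying to add your own attributes to someone else's class is usually a bad idea.

Can you provide a fully reproducible example? When I create a Mapper myself and add key3 at runtime, I get the expected result. (After unpickling, I see all three keys.) I suspect you are using some kind of ORM framework. You need to specify which one, and which classes are involved. Basically, we need a minimal, working example.

Sorry for the confusion. I have edited my question to provide a concrete working example. Please check again and let me know if you have any clue.

@ShadowRanger Thanks for your suggestion. I have updated the question with a working example; please see whether you can reproduce the problem. Thanks.

__setstate__ takes the values from state and applies them to the attributes. Can you point out where the value of state is set?

@QiZhang: The data ultimately comes from the __dict__ of the original pickled instance, although that is customizable too.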
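Since the state handed to __setstate__ is, by default, the pickled instance's __dict__, one possible workaround is a subclass that lets the parent restore what it knows and then copies back whatever was skipped. This is only a sketch under assumptions: Base and KeepExtras are hypothetical names, and Base merely imitates a third-party class with a restrictive __setstate__; it has not been checked against DataFrameMapper itself.

```python
import pickle


class Base:
    """Hypothetical third-party class whose __setstate__ keeps only known keys."""

    def __init__(self):
        self.known = 1

    def __setstate__(self, state):
        self.known = state["known"]


class KeepExtras(Base):
    """Subclass that restores the base state, then re-applies everything else."""

    def __setstate__(self, state):
        super().__setstate__(state)
        # No __getstate__ is defined, so `state` is the original instance's
        # __dict__; copy back any key the base class did not restore.
        for name, value in state.items():
            self.__dict__.setdefault(name, value)


obj = KeepExtras()
obj.addedKey = "addedValue"  # attribute added after construction

restored = pickle.loads(pickle.dumps(obj))
print(restored.__dict__)
```

This only helps if the objects being pickled are instances of the subclass, and it assumes the parent's __getstate__, if it defines one, still returns a dict that includes the extra keys.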