Python JSON serialization of complex objects (accounting for subclassing)

python json python-3.x design-patterns serialization

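What is the best practice for serializing complex Python objects to JSON and deserializing them back, in a way that accounts for subclassing and prevents the same object from being stored multiple times (assuming we know how to distinguish different instances of the same class)?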

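In short, I am writing a small scientific library that I want people to use. After watching Raymond Hettinger's talk, I decided that implementing subclassing-aware behaviour would be a good exercise for me. So far it has gone well, but now I have come to the JSON serialization task.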

So far I have looked around and found the following on JSON serialization in Python:

The two main obstacles I have are accounting for possible subclassing and keeping a single copy per instance.

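After several attempts to solve this in pure Python without any changes to the JSON representation of the object, I came to understand that, at the time of deserializing JSON, there is no way to know which class the serialized instance belonged to. So the class has to be mentioned in the representation, and I ended up with something like this: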

import json


class MyClassJSONEncoder(json.JSONEncoder):
    @classmethod
    def represent_object(cls, obj):
        """
        Serialize built-ins as is and complex objects as their id,
        which is hash(obj) in this implementation.
        """
        if isinstance(obj, (int, float, str, bool)) or obj is None:
            return obj
        elif isinstance(obj, (list, dict, tuple)):
            return cls.represent_iterable(obj)
        else:
            return hash(obj)

    @classmethod
    def represent_iterable(cls, iterable):
        """
        JSON supports lists and dicts, so containers are processed element by element.
        """
        if isinstance(iterable, (list, tuple)):
            return [cls.represent_object(value) for value in iterable]
        elif isinstance(iterable, dict):
            return {cls.represent_object(key): cls.represent_object(value)
                    for key, value in iterable.items()}

    def default(self, obj):
        if isinstance(obj, MyClass):
            result = {"MyClass_id": hash(obj),
                      "py__class__": ":".join([obj.__class__.__module__, obj.__class__.__qualname__])}
            # copy the attributes of the object being encoded, not of the encoder
            for attr, value in obj.__dict__.items():
                result[attr] = self.represent_object(value)
            return result
        return super().default(obj)  # accounting for JSONEncoder subclassing

Here the accounting for subclassing is done in

"py__class__": ":".join([obj.__class__.__module__, obj.__class__.__qualname__])

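As a small illustration of how such a tag can round-trip, the sketch below builds a "module:qualname" string for a class and resolves it back to the class object. The helper names class_tag and resolve_class_tag are hypothetical and not part of the code above; walking the qualified name part by part is an assumption added here so that nested classes would also resolve.

```python
import importlib
from collections import OrderedDict


def class_tag(cls):
    """Build a 'module:qualname' tag for a class (hypothetical helper)."""
    return ":".join([cls.__module__, cls.__qualname__])


def resolve_class_tag(tag):
    """Import the module named in the tag, then walk the qualified name."""
    module_name, qualname = tag.split(":")
    target = importlib.import_module(module_name)
    # a qualified name may contain dots for nested classes, e.g. "Outer.Inner"
    for part in qualname.split("."):
        target = getattr(target, part)
    return target


tag = class_tag(OrderedDict)
resolved = resolve_class_tag(tag)
```

Note that resolving a tag imports whatever module the JSON names, so this only seems reasonable for JSON that comes from a trusted source.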
The JSONDecoder is implemented as follows:

import importlib
import json


class MyClassJSONDecoder(json.JSONDecoder):
    def decode(self, data):
        if isinstance(data, str):
            data = super().decode(data)
        if "py__class__" in data:
            module_name, class_name = data["py__class__"].split(":")
            object_class = getattr(importlib.import_module(module_name), class_name)
        else:
            object_class = MyClass
        # drop the bookkeeping keys before calling the constructor
        data = {key: value for key, value in data.items()
                if not key.endswith("_id") and key != "py__class__"}
        return object_class(**data)

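As can be seen, here we account for possible subclassing through the "py__class__" attribute in the JSON representation of the object. If no such attribute is present (which can happen if the JSON was generated by another program, say in C++, that just wants to pass information about a plain MyClass object and does not really care about inheritance), the default approach of creating an instance of MyClass is taken. This, by the way, is the reason why a single JSONDecoder cannot be created for all objects: it has to have a default class to instantiate if no py__class__ is specified.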
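One possible way around the per-class decoder, sketched here under stated assumptions, is to pass the default class in as a parameter. The function from_json, the classes Base and Child, and the REGISTRY mapping are all hypothetical names for illustration; the sketch also swaps the module import for an explicit registry lookup, which is a different technique from the decoder above.

```python
import json


class Base:
    def __init__(self, **attrs):
        self.__dict__.update(attrs)


class Child(Base):
    pass


# hypothetical registry: tag stored in the JSON -> class to instantiate
REGISTRY = {"Base": Base, "Child": Child}


def from_json(text, default_class, registry):
    """Decode a flat JSON object, falling back to default_class when untagged."""
    data = json.loads(text)
    tag = data.pop("py__class__", None)
    cls = default_class if tag is None else registry[tag]
    return cls(**data)


tagged = from_json('{"py__class__": "Child", "x": 1}', Base, REGISTRY)
untagged = from_json('{"x": 2}', Base, REGISTRY)
```

With this shape the default class is an argument of the call rather than a property of the decoder, so one function could serve several class hierarchies.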

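As for a single copy per instance, this is achieved by serializing the object with a special JSON key, MyClass_id, and serializing all attribute values as primitives (lists, tuples, dicts and built-ins are preserved, while for a complex object that is the value of some attribute only its hash is stored). Storing object hashes this way allows each object to be serialized exactly once; then, knowing the structure of the object being decoded from its JSON representation, the decoder can look up the corresponding objects and assign them. To illustrate this simply, consider the following example: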

class MyClass(object):
    json_encoder = MyClassJSONEncoder()
    json_decoder = MyClassJSONDecoder()

    def __init__(self, attr1):
        self.attr1 = attr1
        # placeholders: instances of some ComplexObject class
        self.attr2 = [complex_object_1, complex_object_2]

    def to_json(self, top_level=None):
        if top_level is None:
            top_level = {}
        top_level["my_class"] = self.json_encoder.encode(self)
        top_level["complex_objects"] = [obj.to_json(top_level=top_level) for obj in self.attr2]
        return top_level

    @classmethod
    def from_json(cls, data, class_specific_data=None):
        if isinstance(data, str):
            data = json.loads(data)
        if class_specific_data is None:
            # the JSON layout is flat, and the key this class is stored under is known
            class_specific_data = data["my_class"]
        result = cls.json_decoder.decode(class_specific_data)
        # repopulate complex valued attributes with real python objects
        # rather than their id aliases
        complex_objects = {co_data["ComplexObject_id"]: ComplexObject.from_json(data, class_specific_data=co_data)
                           for co_data in data["complex_objects"]}
        result.attr2 = [c_o for c_o_id, c_o in complex_objects.items() if c_o_id in result.attr2]
        # finish such repopulation
        return result
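The "each object stored once, referenced by id" idea can also be shown in isolation. The following is a minimal sketch, not the implementation above: Node, dump_graph and load_graph are hypothetical names, and it uses per-dump integer references instead of hash(obj), since two equal objects may share a hash.

```python
import json


class Node:
    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []


def dump_graph(root):
    """Serialize every reachable Node exactly once; children become references."""
    refs = {}      # id(node) -> index of its record
    records = []   # one JSON-ready record per distinct object

    def visit(node):
        if id(node) in refs:
            return refs[id(node)]
        # register the reference before descending, so cycles terminate
        ref = refs[id(node)] = len(records)
        records.append({"name": node.name, "children": []})
        records[ref]["children"] = [visit(child) for child in node.children]
        return ref

    return json.dumps({"root": visit(root), "objects": records})


def load_graph(text):
    """Rebuild the objects first, then replace references with real objects."""
    data = json.loads(text)
    nodes = [Node(record["name"]) for record in data["objects"]]
    for node, record in zip(nodes, data["objects"]):
        node.children = [nodes[ref] for ref in record["children"]]
    return nodes[data["root"]]


shared = Node("shared")
root = Node("root", [Node("a", [shared]), Node("b", [shared])])
text = dump_graph(root)
restored = load_graph(text)
```

Here the shared node appears once in the "objects" list, and after loading both parents point at the same restored object.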

Is this the right way to go? Is there a more robust way? Have I missed some programming pattern that fits this very particular situation?

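I am just trying to understand what the most correct and Pythonic way is to implement JSON serialization that accounts for subclassing and prevents storing multiple copies of the same object.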