Python recursive setattr()-like function for working with nested dictionaries

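There are plenty of getattr()-like functions for parsing nested dictionary structures.

I would like to make a parallel setattr(). Essentially, given:

cmd = 'f[0].a'
val = 'whatever'
x = {"a":"stuff"}

I would like to produce a function such that I can assign:

x['f'][0]['a'] = val

More or less, this would work the same way as:

setattr(x,'f[0].a',val)

to yield:

>>> x
{"a":"stuff","f":[{"a":"whatever"}]}

For now I am calling this setByDot().

One problem is that if a key in the middle does not exist, the function has to check for it and create the intermediate key, i.e. for the above: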

>>> x = {"a":"stuff"}
>>> x['f'][0]['a'] = val
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'f'
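
Another is that keying when the next item is a list differs from keying when the next item is a string, i.e.:

>>> x = {"a":"stuff"}
>>> x['f']=['']
>>> x['f'][0]['a']=val
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object does not support item assignment

This probably means the algorithm has to generate intermediate values, such as:

>>> x['f']=[{},{},{},{},{},{},{},{},{},{},{}]
>>> x['f'][10]['a']=val

yielding

>>> x 
{"a":"stuff","f":[{},{},{},{},{},{},{},{},{},{},{"a":"whatever"}]}

so that this is the setter associated with the getter

>>> getByDot(x,"f[10].a")
"whatever"

More importantly, the intermediates should not overwrite values that already exist.

Below is the junky idea I have so far. I can identify lists versus dicts and other data types, and create them where they do not exist. However, I do not see (a) where to put the recursive call, (b) how to "build" the deep object as I iterate through the list, or (c) how to distinguish the probing I am doing while building the deep object from the setting I have to do when I reach the end of the stack.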

def setByDot(obj,ref,newval):
    ref = ref.replace("[",".[")
    cmd = ref.split('.')
    numkeys = len(cmd)
    count = 0
    for c in cmd:
        count = count+1
        while count < numkeys:
            if c.find("["):
                idstart = c.find("[")
                numend = c.find("]")
                try:
                    deep = obj[int(idstart+1:numend-1)]
                except:
                    obj[int(idstart+1:numend-1)] = []
                    deep = obj[int(idstart+1:numend-1)]
            else:
                try:
                    deep = obj[c]
                except:
                    if obj[c] isinstance(dict):
                        obj[c] = {}
                    else:
                        obj[c] = ''
                    deep = obj[c]
        setByDot(deep,c,newval)
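
This looks tricky because, if you are making placeholders, you have to look ahead to check the type of the next object, and you have to look behind to build up a path as you go.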
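
The look-ahead and path-building described above can be done iteratively, without recursion. What follows is only a minimal Python 3 sketch, not a definitive implementation. The names `parse_path` and `set_by_dot` are hypothetical, paths are assumed to contain only `name`, `.name` and `[index]` segments, and missing list slots are padded with empty placeholders of whatever type the next key calls for.

```python
import re

# Either "[digits]" (a list index) or a run of characters that is a dict key.
_TOKEN = re.compile(r"\[(\d+)\]|([^.\[\]]+)")


def parse_path(path):
    """Split 'f[10].a' into ['f', 10, 'a']; ints are list indexes."""
    return [int(index) if index else name
            for index, name in _TOKEN.findall(path)]


def set_by_dot(obj, path, value):
    """Assign value at path, creating missing dicts and lists on the way."""
    keys = parse_path(path)
    for key, next_key in zip(keys, keys[1:]):
        # Look ahead: an int key next means this slot must hold a list.
        placeholder = list if isinstance(next_key, int) else dict
        if isinstance(key, int):
            while len(obj) <= key:          # pad the list up to the index
                obj.append(placeholder())
        elif key not in obj:
            obj[key] = placeholder()
        obj = obj[key]                      # existing values are reused as-is
    last = keys[-1]
    if isinstance(last, int):
        while len(obj) <= last:
            obj.append(None)
    obj[last] = value


x = {"a": "stuff"}
set_by_dot(x, "f[0].a", "whatever")
print(x)  # {'a': 'stuff', 'f': [{'a': 'whatever'}]}
```

Existing intermediates are never replaced. If one of them has an incompatible type (a string where a dict is needed, say), the final assignment raises TypeError rather than overwriting it.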

UPDATE

I recently had this question answered, too, which might be relevant or helpful.

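>>> class D(dict):
...     def __missing__(self, k):
...         ret = self[k] = D()
...         return ret
... 
>>> x=D()
>>> x['f'][0]['a'] = 'whatever'
>>> x
{'f': {0: {'a': 'whatever'}}}

You can hack something together by fixing two problems:

  • A list that grows automatically when accessed out of bounds (PaddedList)
  • A way to delay the decision of what to create (list or dict) until you access it for the first time (DictOrList)

So the code will look like this: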

    import collections
    
    class PaddedList(list):
        """ List that grows automatically up to the max index ever passed"""
        def __init__(self, padding):
            self.padding = padding
    
        def __getitem__(self, key):
            if  isinstance(key, int) and len(self) <= key:
                self.extend(self.padding() for i in xrange(key + 1 - len(self)))
            return super(PaddedList, self).__getitem__(key)
    
    class DictOrList(object):
        """ Object proxy that delays the decision of being a List or Dict """
        def __init__(self, parent):
            self.parent = parent
    
        def __getitem__(self, key):
            # Type of the structure depends on the type of the key
            if isinstance(key, int):
                obj = PaddedList(MyDict)
            else:
                obj = MyDict()
    
            # Update the parent so that it references the selected object
            keys = (self.parent if isinstance(self.parent, dict)
                    else xrange(len(self.parent)))
            for i in keys:
                if self.parent[i] is self:
                    self.parent[i] = obj
                    break
    
            return obj[key]
    
    
    class MyDict(collections.defaultdict):
        def __missing__(self, key):
            ret = self[key] = DictOrList(self)
            return ret
    
    def pprint_mydict(d):
        """ Helper to print MyDict as dicts """
        print d.__str__().replace('defaultdict(None, {', '{').replace('})', '}')
    
    x = MyDict()
    x['f'][0]['a'] = 'whatever'
    
    y = MyDict()
    y['f'][10]['a'] = 'whatever'
    
    pprint_mydict(x)
    pprint_mydict(y)
    
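The trick is to create a defaultdict of objects that can be either a dict or a list, depending on how you access them. So when you have the assignment x['f'][10]['a'] = 'whatever', it works the following way:

  • Get x['f']. It does not exist, so it returns a DictOrList object for the index 'f'
  • Get x['f'][10]. DictOrList.__getitem__ is called with an integer index, so the DictOrList object replaces itself in the parent collection with a PaddedList
  • Accessing the 11th element of the PaddedList grows it by 11 elements and returns the MyDict element in that position
  • Assign "whatever" to x['f'][10]['a']

Both PaddedList and DictOrList are a bit hacky, but after all the assignments there is no more magic: you have a structure of dicts and lists.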
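
The steps above can be traced with a simplified Python 3 sketch. This is an assumption-laden variant rather than the code from the answer. `AutoDict` and `Undecided` are hypothetical names, and the placeholder remembers the key it was stored under, so it can replace itself without scanning the parent.

```python
class PaddedList(list):
    """List that grows on read access, filling new slots with padding()."""

    def __init__(self, padding):
        super().__init__()
        self.padding = padding

    def __getitem__(self, key):
        if isinstance(key, int) and key >= len(self):
            # The range is computed once, before the list starts growing.
            self.extend(self.padding() for _ in range(key + 1 - len(self)))
        return super().__getitem__(key)


class Undecided:
    """Placeholder that turns into a list or a dict on first subscript."""

    def __init__(self, parent, key):
        self.parent, self.key = parent, key

    def _become(self, key):
        # An integer key means a list was wanted; anything else means a dict.
        real = PaddedList(AutoDict) if isinstance(key, int) else AutoDict()
        self.parent[self.key] = real    # replace the placeholder in the parent
        return real

    def __getitem__(self, key):
        return self._become(key)[key]

    def __setitem__(self, key, value):
        self._become(key)[key] = value


class AutoDict(dict):
    """Dict that stores an Undecided placeholder for every missing key."""

    def __missing__(self, key):
        placeholder = self[key] = Undecided(self, key)
        return placeholder


x = AutoDict()
x['f'][10]['a'] = 'whatever'
print(len(x['f']))   # 11
print(x['f'][10])    # {'a': 'whatever'}
```

One caveat of this design is that merely reading a missing key leaves an `Undecided` placeholder behind in the dict, so it suits write-heavy construction better than general lookups.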

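I would separate this into two steps. In the first step, the query string is broken down into a series of instructions. This way the problem is decoupled, we can view the instructions before running them, and there is no need for recursive calls.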

    def build_instructions(obj, q):
        """
        Breaks down a query string into a series of actionable instructions.
    
        Each instruction is a (_type, arg) tuple.
        arg -- The key used for the __getitem__ or __setitem__ call on
               the current object.
        _type -- Used to determine the data type for the value of
                 obj.__getitem__(arg)
    
        If a key/index is missing, _type is used to initialize an empty value.
        In this way _type provides the ability to
        """
        arg = []
        _type = None
        instructions = []
        for i, ch in enumerate(q):
            if ch == "[":
                # Begin list query
                if _type is not None:
                    arg = "".join(arg)
                    if _type == list and arg.isalpha():
                        _type = dict
                    instructions.append((_type, arg))
                    _type, arg = None, []
                _type = list
            elif ch == ".":
                # Begin dict query
                if _type is not None:
                    arg = "".join(arg)
                    if _type == list and arg.isalpha():
                        _type = dict
                    instructions.append((_type, arg))
                    _type, arg = None, []
    
                _type = dict
            elif ch.isalnum():
                if i == 0:
                    # Query begins with alphanum, assume dict access
                    _type = type(obj)
    
                # Fill out args
                arg.append(ch)
            elif ch == "]":
                # A closing bracket carries no information; skip it
                pass
            else:
                raise TypeError("Unrecognized character: {}".format(ch))
    
        if _type is not None:
            # Finish up last query
            instructions.append((_type, "".join(arg)))
    
        return instructions
    
For your example:

    >>> x = {"a": "stuff"}
    >>> print(build_instructions(x, "f[0].a"))
    [(<type 'dict'>, 'f'), (<type 'list'>, '0'), (<type 'dict'>, 'a')]
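
One way a setter could consume such an instruction list is to walk it with a one-step look-ahead, since the next instruction says which container must exist at the current key. The following is a minimal Python 3 sketch under that assumption. `apply_instructions` is a hypothetical name, and it takes the instruction list directly, in the same `(type, arg)` format as the output above, so that it stands alone.

```python
def apply_instructions(obj, instructions, value):
    """Walk (container_type, key) pairs, creating containers, then assign."""
    last = len(instructions) - 1
    for i, (container_type, key) in enumerate(instructions):
        if container_type is list:
            key = int(key)  # list instructions carry the index as a string
        # Look ahead: the next instruction names what must live at obj[key].
        child_type = instructions[i + 1][0] if i < last else None
        if isinstance(obj, list):
            while len(obj) <= key:  # pad so that the index exists
                obj.append(child_type() if child_type else None)
        elif child_type is not None and key not in obj:
            obj[key] = child_type()
        if i == last:
            obj[key] = value
            return
        if not isinstance(obj[key], child_type):
            # Refuse to overwrite an existing value of the wrong type.
            raise TypeError("cannot descend into %r at %r" % (obj[key], key))
        obj = obj[key]


x = {"a": "stuff"}
apply_instructions(x, [(dict, "f"), (list, "0"), (dict, "a")], "whatever")
print(x)  # {'a': 'stuff', 'f': [{'a': 'whatever'}]}
```

Raising instead of replacing an incompatible intermediate keeps the rule from the question that existing values are not overwritten.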
    
And because we can, here is a _getattr function:

    def _getattr(obj, query):
        """
        Very similar to _setattr. Instead of setting attributes they will be
        returned. As expected, an error will be raised if a __getitem__ call
        fails.
        """
        instructions = build_instructions(obj, query)
        for i, (_, arg) in enumerate(instructions[:-1]):
            _type = instructions[i + 1][0]
            obj = _get(obj, _type, arg)
    
        _type, arg = instructions[-1]
        return _get(obj, _type, arg)
    
    
    def _get(obj, _type, arg):
        """
        Helper function for calling obj.__getitem__(arg).
        """
        if isinstance(obj, dict):
            obj = obj[arg]
        elif isinstance(obj, list):
            arg = int(arg)
            obj = obj[arg]
        return obj
    
In action:

    >>> x = {"a": "stuff"}
    >>> _setattr(x, "f[0].a", "test")
    >>> print x
    {'a': 'stuff', 'f': [{'a': 'test'}]}
    >>> print _getattr(x, "f[0].a")
    "test"
    
    >>> x = ["one", "two"]
    >>> _setattr(x, "3[0].a", "test")
    >>> print x
    ['one', 'two', [], [{'a': 'test'}]]
    >>> print _getattr(x, "3[0].a")
    "test"
    
Now for something cool. Unlike Python, our _setattr function can set unhashable dict keys:

    x = []
    _setattr(x, "1.4", "asdf")
    print x
    [{}, {'4': 'asdf'}]  # A list, which isn't hashable
    
    >>> y = {"a": "stuff"}
    >>> _setattr(y, "f[1.4]", "test")  # We're indexing f with 1.4, which is a list!
    >>> print y
    {'a': 'stuff', 'f': [{}, {'4': 'test'}]}
    >>> print _getattr(y, "f[1.4]")  # Works for _getattr too
    "test"
    
We are not really using unhashable dict keys, but it looks like we are in our query language, so who cares, right!


Finally, you can run multiple _setattr calls on the same object; just give it a try yourself.

You can synthesize recursively setting items/attributes by overriding __getitem__ to return a proxy that can set a value in the original object.
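
As a rough illustration of that idea, here is a minimal Python 3 sketch. It is a simplified variant, not the implementation given in this answer, and `PathProxy` and `RecursiveDict` are hypothetical names. A missing key returns a proxy that only records the path; nothing is created in the root until a value is actually assigned.

```python
class PathProxy:
    """Records the keys used so far; writes into the root only on assignment."""

    def __init__(self, root, path):
        self._root = root
        self._path = path

    def __getitem__(self, key):
        # Reading goes one level deeper without touching the root.
        return PathProxy(self._root, self._path + (key,))

    def __setitem__(self, key, value):
        node = self._root
        for part in self._path:
            # Create intermediate dicts only now that a value is being set.
            node = node.setdefault(part, RecursiveDict())
        node[key] = value


class RecursiveDict(dict):
    """Dict whose missing keys yield a PathProxy instead of raising KeyError."""

    def __missing__(self, key):
        return PathProxy(self, (key,))


d = RecursiveDict()
d['a']['b']['c'] = 1
print(d)  # {'a': {'b': {'c': 1}}}
```

Because the proxy holds a reference to the root and a path, a chain of lookups that is never assigned to leaves the dictionary unchanged.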

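I happen to be working on a library that does a few things similar to this, so I was working on a class that can dynamically assign its own subclasses at instantiation. It makes this sort of thing easier to work with, but if that kind of hacking makes you squeamish, you can get similar behavior by creating a ProxyObject similar to the one I create, and by creating the individual classes used by the ProxyObject dynamically in a function. Something like: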

    class ProxyObject(object):
        ... #see below
    
    def instanciateProxyObjcet(val):
       class ProxyClassForVal(ProxyObject,val.__class__):
           pass
       return ProxyClassForVal(val)
    
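You can use a dictionary like the one I use in FlexibleObject below, which makes that implementation much more efficient if that is the way you implement it. The code I will be providing uses FlexibleObject, though. Right now it only supports classes that, like almost all of Python's builtin classes, can be generated by taking an instance of themselves as the sole argument to their __init__/__new__. In the next week or two, I will add support for anything pickleable and link to a github repository that contains it. Here is the code: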

    class FlexibleObject(object):
        """ A FlexibleObject is a baseclass for allowing type to be declared
            at instantiation rather than in the declaration of the class.
    
            Usage:
            class DoubleAppender(FlexibleObject):
                def append(self,x):
                    super(self.__class__,self).append(x)
                    super(self.__class__,self).append(x)
    
            instance1 = DoubleAppender(list)
            instance2 = DoubleAppender(bytearray)
        """
        classes = {}
        def __new__(cls,supercls,*args,**kws):
            if isinstance(supercls,type):
                supercls = (supercls,)
            else:
                supercls = tuple(supercls)
            if (cls,supercls) in FlexibleObject.classes:
                return FlexibleObject.classes[(cls,supercls)](*args,**kws)
            superclsnames = tuple([c.__name__ for c in supercls])
            name = '%s%s' % (cls.__name__,superclsnames)
            d = dict(cls.__dict__)
            d['__class__'] = cls
            if cls == FlexibleObject:
                d.pop('__new__')
            try:
                d.pop('__weakref__')
            except:
                pass
            d['__dict__'] = {}
            newcls = type(name,supercls,d)
            FlexibleObject.classes[(cls,supercls)] = newcls
            return newcls(*args,**kws)
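
The core idea of FlexibleObject, choosing the base class at instantiation time and caching the generated class, can be shown in isolation with the three-argument form of the built-in `type()`. This is a minimal Python 3 sketch under simplifying assumptions, not a replacement for the class above. `with_base`, `_class_cache` and the mixin-style `DoubleAppender` are hypothetical names.

```python
_class_cache = {}


def with_base(mixin, base):
    """Return a class combining mixin's methods with base, built only once."""
    key = (mixin, base)
    if key not in _class_cache:
        name = "%s_%s" % (mixin.__name__, base.__name__)
        # type(name, bases, namespace) creates a new class at runtime.
        _class_cache[key] = type(name, (mixin, base), {})
    return _class_cache[key]


class DoubleAppender:
    """Mixin: append every item twice to whatever container follows in the MRO."""

    def append(self, x):
        super().append(x)
        super().append(x)


items = with_base(DoubleAppender, list)()
items.append(1)
print(items)  # [1, 1]

buf = with_base(DoubleAppender, bytearray)()
buf.append(65)
print(buf)    # bytearray(b'AA')
```

The cache plays the same role as the `classes` dictionary in FlexibleObject: each (class, base) pair is generated once and reused on later calls.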
    
Then, to use this to synthesize looking up attributes and items of a dictionary-like object, you can do something like this:

    class ProxyObject(FlexibleObject):
        @classmethod
        def new(cls,obj,quickrecdict,path,attribute_marker):
            self = ProxyObject(obj.__class__,obj)
            self.__dict__['reference'] = quickrecdict
            self.__dict__['path'] = path
            self.__dict__['attr_mark'] = attribute_marker
            return self
        def __getitem__(self,item):
            path = self.__dict__['path'] + [item]
            ref = self.__dict__['reference']
            return ref[tuple(path)]
        def __setitem__(self,item,val):
            path = self.__dict__['path'] + [item]
            ref = self.__dict__['reference']
            ref.dict[tuple(path)] = ProxyObject.new(val,ref,
                    path,self.__dict__['attr_mark'])
        def __getattribute__(self,attr):
            if attr == '__dict__':
                return object.__getattribute__(self,'__dict__')
            path = self.__dict__['path'] + [self.__dict__['attr_mark'],attr]
            ref = self.__dict__['reference']
            return ref[tuple(path)]
        def __setattr__(self,attr,val):
            path = self.__dict__['path'] + [self.__dict__['attr_mark'],attr]
            ref = self.__dict__['reference']
            ref.dict[tuple(path)] = ProxyObject.new(val,ref,
                    path,self.__dict__['attr_mark'])
    
    class UniqueValue(object):
        pass
    
    class QuickRecursiveDict(object):
        def __init__(self,dictionary=None):
            # Avoid sharing one mutable default dict between instances
            self.dict = {} if dictionary is None else dictionary
            self.internal_id = UniqueValue()
            self.attr_marker = UniqueValue()
        def __getitem__(self,item):
            if item in self.dict:
                val = self.dict[item]
                try:
                    if val.__dict__['path'][0] == self.internal_id:
                        return val
                    else:
                        raise TypeError
                except:
                    return ProxyObject.new(val,self,[self.internal_id,item],
                            self.attr_marker)
            try:
                if item[0] == self.internal_id:
                    return ProxyObject.new(KeyError(),self,list(item),
                            self.attr_marker)
            except TypeError:
                pass #Item isn't iterable
            return ProxyObject.new(KeyError(),self,[self.internal_id,item],
                        self.attr_marker)
        def __setitem__(self,item,val):
            self.dict[item] = val
    
The details of the implementation will vary depending on what you need. Obviously, it is much easier to override only __g...
    
    >>> qrd = QuickRecursiveDict()
    >>> qrd[0][13] # returns an instance of a subclass of KeyError
    >>> qrd[0][13] = 9
    >>> qrd[0][13] # 9
    >>> qrd[0][13]['forever'] = 'young'
    >>> qrd[0][13] # 9
    >>> qrd[0][13]['forever'] # 'young'
    >>> qrd[0] # returns an instance of a subclass of KeyError
    >>> qrd[0] = 0
    >>> qrd[0] # 0
    >>> qrd[0][13]['forever'] # 'young'