Python: converting a nested dict/json into django ORM models without hard-coding the data structure


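I want to import data from a json file into a django database. The json contains nested objects.

The current steps are:

  • Set up the django object models to match the json schema (done manually; see the models.py file below)
  • Import the json file into a python dict using mydict = json.loads(file.read()) (done)
  • Convert the dict into django models (done, but the solution is not ideal)

Is there a way to convert the nested dict into django models (i.e. step 3) without hard-coding the data structure into the logic?

Bonus points for automatically generating the django models (i.e. the models.py file) from a sample json file.

Thanks in advance.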

How I'm currently doing it

Step 3 is easy if the dict contains no nested dicts:

    MyModel.objects.create(**mydict)

However, since my json/dict contains nested objects, I'm currently doing step 3 like this:

    # read the json file into a python dict
    d = json.loads(myfile.read())
    
    # construct top-level object using the top-level dict
    # (excluding nested lists of dicts called 'judges' and 'contestants')
    c = Contest.objects.create(**{k:v for k,v in d.items() if k not in ('judges', 'contestants')})
    
    # construct nested objects using the nested dicts
    for judge in d['judges']:
        c.judge_set.create(**judge)
    for contestant in d['contestants']:
        ct = c.contestant_set.create(**{k:v for k,v in contestant.items() if k not in ('singers', 'songs')})
        # all contestants sing songs
        for song in contestant['songs']:
            ct.song_set.create(**song)
        # not all contestants have a list of singers
        if 'singers' in contestant:
            for singer in contestant['singers']:
                ct.singer_set.create(**singer)
    
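This works, but it requires hard-coding the data structure into the logic:

  • The names of the nested dicts to exclude must be hard-coded when calling create() (passing a nested dict to create() raises a TypeError). I thought of using **{k:v for k,v in contestant.items() if not hasattr(v, 'pop')} instead to exclude lists and dicts, but I suspect that won't work 100% of the time
  • The logic that iterates over the nested objects and creates them must be hard-coded
  • The logic that handles nested objects which are not always present must be hard-coded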
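
The exclusion in the first bullet can be made independent of key names by checking value types rather than probing for a `pop` attribute. A minimal sketch, assuming nested objects always arrive as a dict or a list; `split_fields` is a hypothetical helper name, not part of Django:

```python
def split_fields(record):
    """Split a dict into (scalar fields, nested children).

    Assumption: nested objects are dicts or lists; every other value
    is treated as a plain model field.
    """
    scalars, nested = {}, {}
    for key, value in record.items():
        if isinstance(value, (dict, list)):
            nested[key] = value
        else:
            scalars[key] = value
    return scalars, nested


contestant = {
    "name": "Monkey Magic",
    "rank": "1",
    "songs": [{"title": "Previous", "m": "465", "s": "462", "p": "454"}],
}
fields, children = split_fields(contestant)
# `fields` could be passed to create(**fields); `children` still has
# to be walked separately.
```

This only removes the hard-coded key names; which model each nested key maps to still has to come from somewhere.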
Data structure

The sample json looks like this:

    {
      "assoc": "THE BRITISH ASSOCIATION OF BARBERSHOP SINGERS",
      "contest": "QUARTET FINAL (NATIONAL STREAM)",
      "location": "CHELTENHAM",
      "year": "2007/08",
      "date": "25/05/2008",
      "type": "quartet final",
      "filename": "BABS/2008QF.pdf",
      "judges": [
        {"cat": "m", "name": "Rod"},
        {"cat": "m", "name": "Bob"},
        {"cat": "p", "name": "Pat"},
        {"cat": "p", "name": "Bob"},
        {"cat": "s", "name": "Mark"},
        {"cat": "s", "name": "Barry"},
        {"cat": "a", "name": "Phil"}
      ],
      "contestants": [
        {
          "prev_tot_score": "1393",
          "tot_score": "2774",
          "rank_m": "1",
          "rank_s": "1",
          "rank_p": "1",
          "rank": "1", "name": "Monkey Magic",
          "pc_score": "77.1",
          "songs": [
            {"title": "Undecided Medley","m": "234","s": "226","p": "241"},
            {"title": "What Kind Of Fool Am I","m": "232","s": "230","p": "230"},
            {"title": "Previous","m": "465","s": "462","p": "454"}
          ],
          "singers": [
            {"part": "tenor","name": "Alan"},
            {"part": "lead","name": "Zac"},
            {"part": "bari","name": "Joe"},
            {"part": "bass","name": "Duncan"}
          ]
        },
        {
          "prev_tot_score": "1342",
          "tot_score": "2690",
          "rank_m": "2",
          "rank_s": "2",
          "rank_p": "2",
          "rank": "2", "name": "Evolution",
          "pc_score": "74.7",
          "songs": [
            {"title": "It's Impossible","m": "224","s": "225","p": "218"},
            {"title": "Come Fly With Me","m": "225","s": "222","p": "228"},
            {"title": "Previous","m": "448","s": "453","p": "447"}
          ],
          "singers": [
            {"part": "tenor","name": "Tony"},
            {"part": "lead","name": "Michael"},
            {"part": "bari","name": "Geoff"},
            {"part": "bass","name": "Stuart"}
          ]
        }
      ]
    }
    
My models.py file:

    from django.db import models
    
    # Create your models here.
    
    class Contest(models.Model):
        assoc = models.CharField(max_length=100)
        contest = models.CharField(max_length=100)
        date = models.DateField()
        filename = models.CharField(max_length=100)
        location = models.CharField(max_length=100)
        type = models.CharField(max_length=20)
        year = models.CharField(max_length=20)
    
    
    class Judge(models.Model):
        contest = models.ForeignKey(Contest, on_delete=models.CASCADE)
        name = models.CharField(max_length=60)
        cat = models.CharField('Category', max_length=2)
    
    
    class Contestant(models.Model):
        contest = models.ForeignKey(Contest, on_delete=models.CASCADE)
        name = models.CharField(max_length=100)
        tot_score = models.IntegerField('Total Score')
        rank_m = models.IntegerField()
        rank_s = models.IntegerField()
        rank_p = models.IntegerField()
        rank = models.IntegerField()
        pc_score = models.DecimalField(max_digits=4, decimal_places=1)
        # optional fields
        director = models.CharField(max_length=100, blank=True, null=True)
        size = models.IntegerField(blank=True, null=True)
        prev_tot_score = models.IntegerField(blank=True, null=True)
    
    
    class Song(models.Model):
        contestant = models.ForeignKey(Contestant, on_delete=models.CASCADE)
        title = models.CharField(max_length=100)
        m = models.IntegerField('Music')
        s = models.IntegerField('Singing')
        p = models.IntegerField('Performance')
    
    class Singer(models.Model):
        contestant = models.ForeignKey(Contestant, on_delete=models.CASCADE)
        name = models.CharField(max_length=100)
        part = models.CharField('Category', max_length=5)
    

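You can walk the json object recursively and use a key-to-class mapping to instantiate the models dynamically. Here is an idea (not a working solution!):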
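
A minimal sketch of that recursive walk, under stated assumptions: `KEY_TO_MODEL`, `import_nested` and `record_create` are hypothetical names, the mapping values mirror the foreign-key field names in the models.py above, and a stand-in `create` callback replaces the ORM so the sketch runs without a Django project.

```python
# Hypothetical mapping: json key -> (child model name, foreign-key field on the child).
KEY_TO_MODEL = {
    "judges": ("Judge", "contest"),
    "contestants": ("Contestant", "contest"),
    "songs": ("Song", "contestant"),
    "singers": ("Singer", "contestant"),
}


def import_nested(record, model_name, create, parent=None, fk_name=None):
    """Create one object for `record`, then recurse into its nested values."""
    fields = {k: v for k, v in record.items() if not isinstance(v, (dict, list))}
    if parent is not None:
        fields[fk_name] = parent  # link the child to the object created above it
    obj = create(model_name, fields)
    for key, value in record.items():
        if not isinstance(value, (dict, list)):
            continue
        # An unmapped nested key raises KeyError on purpose.
        child_model, child_fk = KEY_TO_MODEL[key]
        children = value if isinstance(value, list) else [value]
        for child in children:
            import_nested(child, child_model, create, parent=obj, fk_name=child_fk)
    return obj


# Stand-in for the ORM. With Django, `create` could instead look up the
# model class by name and call Model.objects.create(**fields).
created = []


def record_create(model_name, fields):
    created.append((model_name, fields.get("name") or fields.get("title")))
    return model_name  # placeholder for the created instance


data = {
    "contest": "QUARTET FINAL (NATIONAL STREAM)",
    "judges": [{"cat": "m", "name": "Rod"}],
    "contestants": [
        {"name": "Monkey Magic", "songs": [{"title": "Undecided Medley", "m": "234"}]},
    ],
}
import_nested(data, "Contest", record_create)
```

Because the walk only visits keys that are present, optional nested lists such as `singers` need no special case. The structure is still written down once, in the mapping, rather than spread through the import logic.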


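In the end, though, I would discourage this: you are using schema-based storage (SQL), so your code should enforce that the input matches the schema (you cannot dynamically handle anything different anyway). If you don't care about having a schema at all, go for a NoSQL solution and you won't have this problem. Or use a hybrid such as PostgreSQL.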

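By definition, an object model contains a description of the data model. If you don't know how your data is structured, I doubt you can model it. Your best bet is probably to extract the nested data into a separate model using an ID and then import it by matching on that ID, but that would be quite a lot of work.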
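
One possible reading of that suggestion is to flatten the json into per-table rows first, giving every row an ID and a reference to its parent's ID, and to import the rows afterwards. A rough sketch, assuming synthetic integer IDs, that nested lists contain only dicts, and that no record already uses the keys `id` or `parent_id`; `flatten` and `tables` are hypothetical names:

```python
from collections import defaultdict
from itertools import count


def flatten(record, table, tables, ids, parent_id=None):
    """Append `record` to tables[table] as a flat row and recurse into children."""
    row_id = next(ids)
    row = {"id": row_id, "parent_id": parent_id}
    for key, value in record.items():
        if isinstance(value, list):
            for child in value:
                flatten(child, key, tables, ids, parent_id=row_id)
        elif isinstance(value, dict):
            flatten(value, key, tables, ids, parent_id=row_id)
        else:
            row[key] = value
    tables[table].append(row)
    return row_id


tables = defaultdict(list)
flatten(
    {
        "contest": "QUARTET FINAL (NATIONAL STREAM)",
        "contestants": [
            {"name": "Monkey Magic", "singers": [{"part": "lead", "name": "Zac"}]},
        ],
    },
    "contests",
    tables,
    count(1),
)
```

Each entry in `tables` is then a flat list of rows that can be matched to its parent through `parent_id` at import time.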