Python: generating random JSON structure permutations for a dataset


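I want to generate many different permutations of JSON structure as representations of the same dataset, preferably without having to hard-code an implementation. For example, given the following JSON: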

{"name": "smith", "occupation": "agent", "enemy": "humanity", "nemesis": "neo"}
many different permutations should be produced, such as:

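  • Name changes:
    {"name": "smith"} -> {"last_name": "smith"}
  • Order changes:
    {"name": "...", "occupation": "..."} -> {"occupation": "...", "name": "..."}
  • Arrangement changes:
    {"name": "...", "occupation": "..."} -> {"smith": {"occupation": "..."}}
  • Template changes:
    {"name": "...", "occupation": "..."} -> {"status": 200, "data": {"name": "...", "occupation": "..."}}
  • and so on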
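Currently, the implementation is as follows:

I am using itertools.permutations and OrderedDict() to range over the possible key and corresponding value combinations, as well as the order in which they are returned.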

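import itertools
import json
from collections import OrderedDict

# SchemaLike is the asker's own class; its arguments are elided in the original.
key_permutations = SchemaLike(...).permutate()

all_simulacrums = []
for key_permutation in key_permutations:
    simulacrums = OrderedDict(key_permutation)
    all_simulacrums.append(simulacrums)

# all_simulacrums is a list, so permute the items of each simulacrum in turn.
for simulacrum in all_simulacrums:
    for p in itertools.permutations(simulacrum.items()):
        test_data = json.dumps(OrderedDict(p))
        print(test_data)
        assert json.loads(test_data) == simulacrum, 'Oops! {} != {}'.format(test_data, simulacrum)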
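My problem occurs when I try to implement the permutations of arrangement and of templates.
I don't know how best to implement this functionality. Any suggestions?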

For ordering, just use ordered dicts:

>>> import itertools, json
>>> from collections import OrderedDict
>>> data = OrderedDict(foo='bar', bacon='eggs', bar='foo', eggs='bacon')
>>> for p in itertools.permutations(data.items()):
...     test_data = json.dumps(OrderedDict(p))
...     print(test_data)
...     assert json.loads(test_data) == data, 'Oops! {} != {}'.format(test_data, data)

{"foo": "bar", "bacon": "eggs", "bar": "foo", "eggs": "bacon"}
{"foo": "bar", "bacon": "eggs", "eggs": "bacon", "bar": "foo"}
{"foo": "bar", "bar": "foo", "bacon": "eggs", "eggs": "bacon"}
{"foo": "bar", "bar": "foo", "eggs": "bacon", "bacon": "eggs"}
{"foo": "bar", "eggs": "bacon", "bacon": "eggs", "bar": "foo"}
{"foo": "bar", "eggs": "bacon", "bar": "foo", "bacon": "eggs"}
{"bacon": "eggs", "foo": "bar", "bar": "foo", "eggs": "bacon"}
{"bacon": "eggs", "foo": "bar", "eggs": "bacon", "bar": "foo"}
{"bacon": "eggs", "bar": "foo", "foo": "bar", "eggs": "bacon"}
{"bacon": "eggs", "bar": "foo", "eggs": "bacon", "foo": "bar"}
{"bacon": "eggs", "eggs": "bacon", "foo": "bar", "bar": "foo"}
{"bacon": "eggs", "eggs": "bacon", "bar": "foo", "foo": "bar"}
{"bar": "foo", "foo": "bar", "bacon": "eggs", "eggs": "bacon"}
{"bar": "foo", "foo": "bar", "eggs": "bacon", "bacon": "eggs"}
{"bar": "foo", "bacon": "eggs", "foo": "bar", "eggs": "bacon"}
{"bar": "foo", "bacon": "eggs", "eggs": "bacon", "foo": "bar"}
{"bar": "foo", "eggs": "bacon", "foo": "bar", "bacon": "eggs"}
{"bar": "foo", "eggs": "bacon", "bacon": "eggs", "foo": "bar"}
{"eggs": "bacon", "foo": "bar", "bacon": "eggs", "bar": "foo"}
{"eggs": "bacon", "foo": "bar", "bar": "foo", "bacon": "eggs"}
{"eggs": "bacon", "bacon": "eggs", "foo": "bar", "bar": "foo"}
{"eggs": "bacon", "bacon": "eggs", "bar": "foo", "foo": "bar"}
{"eggs": "bacon", "bar": "foo", "foo": "bar", "bacon": "eggs"}
{"eggs": "bacon", "bar": "foo", "bacon": "eggs", "foo": "bar"}
The same principle applies to key/value permutations:

>>> for p in itertools.permutations(data.keys()):
...     test_data = json.dumps(OrderedDict(zip(p, data.values())))
...     print(test_data)
...
{"foo": "bar", "bacon": "eggs", "bar": "foo", "eggs": "bacon"}
{"foo": "bar", "bacon": "eggs", "eggs": "foo", "bar": "bacon"}
{"foo": "bar", "bar": "eggs", "bacon": "foo", "eggs": "bacon"}
{"foo": "bar", "bar": "eggs", "eggs": "foo", "bacon": "bacon"}
{"foo": "bar", "eggs": "eggs", "bacon": "foo", "bar": "bacon"}
{"foo": "bar", "eggs": "eggs", "bar": "foo", "bacon": "bacon"}
{"bacon": "bar", "foo": "eggs", "bar": "foo", "eggs": "bacon"}
{"bacon": "bar", "foo": "eggs", "eggs": "foo", "bar": "bacon"}
{"bacon": "bar", "bar": "eggs", "foo": "foo", "eggs": "bacon"}
{"bacon": "bar", "bar": "eggs", "eggs": "foo", "foo": "bacon"}
{"bacon": "bar", "eggs": "eggs", "foo": "foo", "bar": "bacon"}
{"bacon": "bar", "eggs": "eggs", "bar": "foo", "foo": "bacon"}
{"bar": "bar", "foo": "eggs", "bacon": "foo", "eggs": "bacon"}
{"bar": "bar", "foo": "eggs", "eggs": "foo", "bacon": "bacon"}
{"bar": "bar", "bacon": "eggs", "foo": "foo", "eggs": "bacon"}
{"bar": "bar", "bacon": "eggs", "eggs": "foo", "foo": "bacon"}
{"bar": "bar", "eggs": "eggs", "foo": "foo", "bacon": "bacon"}
{"bar": "bar", "eggs": "eggs", "bacon": "foo", "foo": "bacon"}
{"eggs": "bar", "foo": "eggs", "bacon": "foo", "bar": "bacon"}
{"eggs": "bar", "foo": "eggs", "bar": "foo", "bacon": "bacon"}
{"eggs": "bar", "bacon": "eggs", "foo": "foo", "bar": "bacon"}
{"eggs": "bar", "bacon": "eggs", "bar": "foo", "foo": "bacon"}
{"eggs": "bar", "bar": "eggs", "foo": "foo", "bacon": "bacon"}
{"eggs": "bar", "bar": "eggs", "bacon": "foo", "foo": "bacon"}
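And so on... If you don't need every combination, you can use just a predefined set of keys/values. You can also use a for loop with random.choice to flip a coin and skip some combinations, or apply random.shuffle to the list of combinations and take the first few, which avoids repeated combinations.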
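As a rough sketch of the shuffle-based idea above: the helper name sample_orderings and its k and seed parameters are assumptions made for illustration, not part of the original answer. It materializes every ordering first, so it is only practical for records with a handful of keys.

```python
import itertools
import json
import random
from collections import OrderedDict


def sample_orderings(data, k, seed=None):
    """Return up to k distinct JSON strings, each a random key ordering of data."""
    rng = random.Random(seed)  # a seed makes the generated test data reproducible
    orderings = list(itertools.permutations(data.items()))
    rng.shuffle(orderings)  # shuffle once, so any prefix contains no repeats
    return [json.dumps(OrderedDict(p)) for p in orderings[:k]]


data = OrderedDict(foo='bar', bacon='eggs', bar='foo', eggs='bacon')
samples = sample_orderings(data, k=5, seed=0)
for s in samples:
    print(s)
```

Shuffling the full list once and slicing it is what keeps the samples distinct; calling random.shuffle on the items repeatedly could return the same ordering twice.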

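For templates, I guess you will have to create a list of different templates (or a list of lists if you want nested structures) and then iterate over it to create your data. To give better advice, we would need a stricter specification.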
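One possible way to read "a list of templates" is a list of small functions, each taking a flat record and returning a restructured one. The sketch below is only an illustration under that assumption; the names flat, envelope, keyed_by and TEMPLATES are hypothetical, and the two non-trivial shapes are taken from the examples in the question.

```python
import json


def flat(record):
    """Identity template: the record as-is."""
    return dict(record)


def envelope(record):
    """Wrap the record in a status/data envelope, as in the question."""
    return {"status": 200, "data": dict(record)}


def keyed_by(field):
    """Build a template that nests the rest of the record under the value of `field`."""
    def template(record):
        rest = {k: v for k, v in record.items() if k != field}
        return {record[field]: rest}
    return template


# Hypothetical template list; iterate over it to render every shape.
TEMPLATES = [flat, envelope, keyed_by("name")]

record = {"name": "smith", "occupation": "agent"}
rendered = [json.dumps(t(record)) for t in TEMPLATES]
for r in rendered:
    print(r)
```

Because each template is just a callable, it can be combined with the ordering permutations above by applying the template to every permuted record.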

Note that there are several libraries for generating test data in Python:

>>> from faker import Faker
>>> faker = Faker()
>>> faker.credit_card_full().strip().split('\n')
['VISA 13 digit', 'Jerry Gutierrez', '4885274641760 04/24', 'CVC: 583']

There are several schemas available, and it is easy to create your own custom fake-data providers.

Since shuffling the dict order has already been answered, I will skip it.

I will add to this answer as new things come to mind.

from random import randint
from collections import OrderedDict

# Randomly swaps key and value for each pair of a dictionary.
# Note: values must be hashable, and duplicate values can overwrite earlier entries.
def random_dict_items(input_dict):
    new_dict = OrderedDict()
    for key, value in input_dict.items():
        if randint(0, 1) == 0:
            new_dict[key] = value  # keep the pair as-is
        else:
            new_dict[value] = key  # swap key and value
    return new_dict
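The name changes from the question could be sketched in a similar spirit. This is a minimal illustration, not part of the original answer: the alias table ALIASES, the alternative names in it ("last_name", "job") and the helper rename_variants are all assumptions. It uses itertools.product to enumerate every combination of allowed key names.

```python
import itertools

# Hypothetical alias table: the alternative names each key may take.
ALIASES = {
    "name": ["name", "last_name"],
    "occupation": ["occupation", "job"],
}


def rename_variants(record, aliases):
    """Yield one dict per combination of allowed key names; values are unchanged."""
    keys = list(record)
    # Keys without an entry in the alias table keep their original name.
    options = [aliases.get(k, [k]) for k in keys]
    for names in itertools.product(*options):
        yield dict(zip(names, record.values()))


record = {"name": "smith", "occupation": "agent", "enemy": "humanity"}
variants = list(rename_variants(record, ALIASES))
for v in variants:
    print(v)
```

Listing the valid names explicitly keeps the variants meaningful; the number of variants is the product of the number of aliases per key, so it grows quickly as the table gets larger.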

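Python dicts are unordered collections (insertion order is only guaranteed from Python 3.7 on; JSON objects are unordered too, but I guess that is what you want to test). Use collections.OrderedDict instead of plain dicts.

dict is unordered in Python, and json objects are implemented similarly to dicts.

No, I want to be able to dynamically generate many different permutations of JSON structure as representations of the same dataset, preferably without having to hard-code an implementation.

Thanks for your answer. Regarding the name changes, how would the valid options be specified?

Thank you very much for your answer. How would you then use itertools.permutations, based on a dict (or by some other method), to change the field names, templates and arrangement specified above, respectively?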