Need to fetch a set of numbers in cyclic order - Python
I am writing a Python script to generate JSON. I have managed to build the JSON, but I am stuck on picking numbers in cyclic order. Say I have the list 1, 2, 3, 4, 5. I need to pick the first 4 numbers (1, 2, 3, 4) for the first item, 2, 3, 4, 5 for the second item, 3, 4, 5, 1 for the third item, and so on for 30 items.
import json
import random

json_dict = {}
brokers = [1, 2, 3, 4, 5]
json_dict["version"] = "1"
# random.choice() picks one broker at random; the goal is a cyclic group of 4
json_dict["partitions"] = [
    {"topic": "topic1", "name": i, "replicas": random.choice(brokers)}
    for i in range(0, 30)
]
with open("output.json", "w") as outfile:
    json.dump(json_dict, outfile, indent=4)
Expected output:
"version": "1",
"partitions": [
    {
        "topic": "topic1",
        "name": 0,
        "replicas": 1,2,3,4
    },
    {
        "topic": "topic1",
        "name": 1,
        "replicas": 2,3,4,5
    },
    {
        "topic": "topic1",
        "name": 2,
        "replicas": 3,4,5,1
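How can I achieve this?
To get elements from the brokers list in cyclic order, you can use deque from the collections module and call deque.rotate(-1), as in the following example: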
from collections import deque

def grouper(iterable, elements, rotations):
    # Nothing to yield if more elements are requested than are available
    if elements > len(iterable):
        return []
    b = deque(iterable)
    for _ in range(rotations):
        yield list(b)[:elements]
        b.rotate(-1)  # shift everything one position to the left

brokers = [1,2,3,4,5]
# Pick 4 elements from brokers and yield 30 cycles
cycle = list(grouper(brokers, 4, 30))
print(cycle)
Output:
[[1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 1], [4, 5, 1, 2], [5, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 1], [4, 5, 1, 2], [5, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 1], [4, 5, 1, 2], [5, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 1], [4, 5, 1, 2], [5, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 1], [4, 5, 1, 2], [5, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 1], [4, 5, 1, 2], [5, 1, 2, 3]]
In addition, here is one way to apply this solution to your final dict:
# in this example i'm using only 5 cycles
cycles = grouper(brokers, 4, 5)
partitions = [{"topic": "topic1", "name": i, "replicas": cycle_elem} for i, cycle_elem in zip(range(5), cycles)]
final_dict = {"version": "1", "partitions": partitions}
print(final_dict)
Output:
{'partitions': [{'name': 0, 'replicas': [1, 2, 3, 4], 'topic': 'topic1'}, {'name': 1, 'replicas': [2, 3, 4, 5], 'topic': 'topic1'}, {'name': 2, 'replicas': [3, 4, 5, 1], 'topic': 'topic1'}, {'name': 3, 'replicas': [4, 5, 1, 2], 'topic': 'topic1'}, {'name': 4, 'replicas': [5, 1, 2, 3], 'topic': 'topic1'}], 'version': '1'}
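To tie this back to the question, here is a minimal sketch that builds all 30 partitions and serializes them as JSON. It assumes the version value is the string "1", as in the expected output. The helper name rotating_windows is hypothetical and simply restates the deque rotation idea from this answer.

```python
import json
from collections import deque


def rotating_windows(items, size, count):
    """Yield `count` windows of `size` items, rotating left by one each time."""
    window = deque(items)
    for _ in range(count):
        yield list(window)[:size]
        window.rotate(-1)


brokers = [1, 2, 3, 4, 5]
partitions = [
    {"topic": "topic1", "name": i, "replicas": replicas}
    for i, replicas in enumerate(rotating_windows(brokers, 4, 30))
]
json_dict = {"version": "1", "partitions": partitions}

# json.dumps returns a string; json.dump(json_dict, outfile, indent=4)
# would write the same content to an open file, as in the question.
json_text = json.dumps(json_dict, indent=4)
```

Each "replicas" value is a real JSON array here, so the file stays valid JSON.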
This is a pretty cool problem, and I think I have a pretty cool solution:
items = [1, 2, 3, 4, 5]
[(items * 2)[x:x+4] for i in range(30) for x in [i % len(items)]]
gives
[[1, 2, 3, 4],
[2, 3, 4, 5],
[3, 4, 5, 1],
[4, 5, 1, 2],
[5, 1, 2, 3],
[1, 2, 3, 4],
[2, 3, 4, 5],
[3, 4, 5, 1],
[4, 5, 1, 2],
[5, 1, 2, 3],
[1, 2, 3, 4],
[2, 3, 4, 5],
[3, 4, 5, 1],
[4, 5, 1, 2],
[5, 1, 2, 3],
[1, 2, 3, 4],
[2, 3, 4, 5],
[3, 4, 5, 1],
[4, 5, 1, 2],
[5, 1, 2, 3],
[1, 2, 3, 4],
[2, 3, 4, 5],
[3, 4, 5, 1],
[4, 5, 1, 2],
[5, 1, 2, 3],
[1, 2, 3, 4],
[2, 3, 4, 5],
[3, 4, 5, 1],
[4, 5, 1, 2],
[5, 1, 2, 3]]
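What it does is append your list to itself (items * 2 -> [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]), then pick a starting point (x) by taking the loop counter (i) modulo the number of items we have (i % len(items)). The slice [x:x+4] then takes 4 consecutive items from the doubled list.
Here is a purely procedural solution that also adds the flexibility to pick any number of groups, of any size (even larger than the original brokers list), with any offset: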
def get_subgroups(groups, base, size, offset=1):
    # cover the group size > len(base) case by expanding the base
    # this step is completely optional if your group size will never be bigger
    # (base = base * n builds a new list, so the caller's list is not modified)
    base = base * -(-size // len(base))
    result = []  # storage for our groups
    base_size = len(base)  # no need to call len() all the time
    current_offset = 0  # tracking current cycle offset
    for i in range(groups):  # use xrange() on Python 2.x instead
        tail = current_offset + size  # end index for our current slice
        end = min(tail, base_size)  # normalize to the base size
        group = base[current_offset:end] + base[:tail - end]  # get our slice
        result.append(group)  # append it to our result storage
        current_offset = (current_offset + offset) % base_size  # increase our current offset
    return result

brokers = [1, 2, 3, 4, 5]
print(get_subgroups(5, brokers, 4)) # 5 groups of size 4, with default offset
# prints: [[1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 1], [4, 5, 1, 2], [5, 1, 2, 3]]
print(get_subgroups(3, brokers, 7, 2)) # 3 groups of size 7, with offset 2
# prints: [[1, 2, 3, 4, 5, 1, 2], [3, 4, 5, 1, 2, 3, 4], [5, 1, 2, 3, 4, 5, 1]]
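It does this in O(N) time with a single loop.
If you plan to produce a very large number of groups, you can turn the get_subgroups() function into a generator by dropping the result collection and doing yield group instead of result.append(group). That way you can call it in a loop, for group in get_subgroups(30, brokers, 4):, and store the groups in whatever structure you need.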
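The generator variant described above could look roughly like this. It is a sketch rather than the answer's own code: the name iter_subgroups is hypothetical, and it assumes base is non-empty and size is at least 1.

```python
def iter_subgroups(groups, base, size, offset=1):
    """Lazily yield `groups` cyclic slices of length `size` taken from `base`."""
    # Repeat the base so that one wrap-around always covers size > len(base).
    # This builds a new list, so the caller's list is left untouched.
    base = base * -(-size // len(base))
    base_size = len(base)
    current_offset = 0
    for _ in range(groups):
        tail = current_offset + size   # end index of the current slice
        end = min(tail, base_size)     # clamp to the end of the base
        # Take the part up to the end, then wrap around to the start
        yield base[current_offset:end] + base[:tail - end]
        current_offset = (current_offset + offset) % base_size


brokers = [1, 2, 3, 4, 5]
first_three = []
for group in iter_subgroups(3, brokers, 4):
    first_three.append(group)
```

Because groups are produced one at a time, only the current group is held in memory unless the caller chooses to collect them.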
UPDATE
If memory is not an issue, we can optimize this further (processing-wise) by expanding the whole base (or brokers in your case) to fit the entire set:
def get_subgroups(groups, base, size, offset=1):  # warning, heavy memory usage!
    # expand a copy of the base so every slice fits without wrapping around
    base = base * -(-(offset * groups + size) // len(base))
    result = []  # storage for our groups
    current_offset = 0  # tracking current cycle offset
    for i in range(groups):  # use xrange() on Python 2.x instead
        result.append(base[current_offset:current_offset+size])
        current_offset += offset
    return result

Or, if we don't need the ability to turn it into a generator, we can make it even faster with a list comprehension:
def get_subgroups(groups, base, size, offset=1):  # warning, heavy memory usage!
    # expand a copy of the base so every slice fits without wrapping around
    base = base * -(-(offset * groups + size) // len(base))
    # as previously mentioned, use xrange() on Python 2.x instead
    return [base[i:i+size] for i in range(0, groups * offset, offset)]

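Comments:
You don't need to double items on every iteration; do it once, outside the list comprehension.
I'm happy with @Chiheb Nexus's answer. Thanks, man.
You're welcome. If you need more help or don't understand how this solution works, don't hesitate to comment on this answer. Happy coding :-)
Your solution is about 3 times faster than mine. Nice approach, +1.
itertools is usually a big hog and doesn't leave much room for optimization, but the loss in speed can be made up in productivity by not having to reinvent the wheel ;). By the way, if memory is not an issue, simply expanding the brokers array to cover all the cycles and then just slicing through it would be very fast... I'll update my answer for posterity ;)
The first easy approach is to use itertools.cycle. In my answer I used deque because it is simple to use and more explicit than dealing with itertools.cycle. Your approach works well too. Oops, I thought your approach used recursion to do this data cycling. My bad.
@ChihebNexus - if you still have the timing code/data set, check the updated example; it should kill it performance-wise...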
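For reference, the itertools.cycle approach mentioned in the comments might be sketched as follows. This is an illustration, not code from any of the answers; the name cyclic_windows is hypothetical, and no claim is made here about how it compares in speed.

```python
from itertools import cycle, islice


def cyclic_windows(items, size, count):
    """Return `count` windows of `size` items, each starting one step later."""
    n = len(items)
    windows = []
    for start in range(count):
        # Skip `start % n` items of the endless cycle, then take `size` items
        first = start % n
        windows.append(list(islice(cycle(items), first, first + size)))
    return windows


brokers = [1, 2, 3, 4, 5]
groups = cyclic_windows(brokers, 4, 30)
```

Since cycle() repeats the input endlessly, this also handles a window size larger than the list without any extra work.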