Using jq with a filter to split an array of strings evenly into subarrays
Suppose I have the following JSON:
[
"/home/test-spa/src/components/modals/super-admin/tests/integration/index.test.tsx",
"/home/test-spa/src/components/modals/delete-user/tests/index.test.tsx",
"/home/test-spa/src/components/modals/edit-admin/tests/integration/index.test.tsx",
"/home/test-spa/src/components/modals/delete-admin/tests/index.test.tsx",
"/home/test-spa/src/components/modals/add-user/tests/integration/index.test.tsx",
"/home/test-spa/src/components/modals/add-admin/tests/integration/index.test.tsx",
"/home/test-spa/src/components/modals/edit-user/tests/index.test.tsx",
"/home/test-spa/src/components/modals/change-user/tests/index.test.tsx",
"/home/test-spa/src/other-directory/modals/tests/index.test.ts",
"/home/test-spa/src/directory/modals/tests/index.test.ts"
]
jq -cM '[_nwise(length / 4 | floor)]'
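So I am looking for output along these lines: as long as the integration tests are split as evenly as possible, the other strings can be filled in evenly, and order does not matter.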
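If the number of buckets is predetermined
A generic round-robin function can be written so that it distributes the "has" and "has not" strings efficiently (that is, without concatenating any arrays).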
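The call further down has the shape roundrobin(stream; $n). A minimal sketch of a filter with that shape, assuming it simply appends each item of the stream to bucket i % $n. The body, the state keys i and buckets, and the sample input are illustrative assumptions, not the answerer's code:

```shell
# Sketch only: the body of roundrobin is an assumption; only the call shape
# roundrobin(stream; $n) is taken from the answer.
result=$(printf '%s' '["a","b","c","d","e"]' | jq -c '
  # Distribute the items of the stream over $n buckets in round-robin order.
  def roundrobin(stream; $n):
    reduce stream as $s ({i: 0, buckets: []};
      .buckets[.i % $n] += [$s] | .i += 1)
    | .buckets;
  roundrobin(.[]; 3)
')
echo "$result"   # prints [["a","d"],["b","e"],["c"]]
```

Because each item is appended in place, the "has" strings can be passed first and the remaining strings after them, as two streams separated by a comma, without building a combined array.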
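If the number of buckets is data-driven
If the number of buckets depends on the number of occurrences of the specified string, then a round-robin filter of the kind described above, called as roundrobin(stream; $n), gives a reasonably efficient solution: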
# First exclude the unwanted elements:
map(select(test("(other-)?directory")|not))
# Form an array of the strings with the specified substring
| map(select(index("integration"))) as $has
# Perform the required round-robin:
| roundrobin( $has[], ((.-$has)[]); $has|length)
Here is what I came up with to split the input into N buckets:
def bucket_shift($n):
  # loop through all input, shift each elem into bucket
  reduce .[] as $elem ( { count: 0, rv: [] };
    (.rv[(.count % $n)] += [$elem] | .count += 1))
  | .rv ;

# get rid of everything with directory or other-directory
[ .[] | select(test("directory|other-directory") | not) ]
# grab all lines with "integration" in an array
| [ ([ .[] | select(test("integration")) ]),
# grab all lines without "integration" into a second array
    ([ .[] | select(test("integration") | not) ]) ]
# flatten and divide into buckets (arg passed in)
| flatten | bucket_shift($num_buckets|tonumber)
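I labeled each line of the input so that they are easier to track, and added two extra lines so that the count does not divide evenly by the number of buckets you want, to check that the result still balances well. Lines I and J (labeled IX and JX below) should be filtered out.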
<~> $ jq . /tmp/so.json
[
"A/home/test-spa/src/components/modals/super-admin/tests/integration/index.test.tsx",
"B/home/test-spa/src/components/modals/delete-user/tests/index.test.tsx",
"C/home/test-spa/src/components/modals/edit-admin/tests/integration/index.test.tsx",
"D/home/test-spa/src/components/modals/delete-admin/tests/index.test.tsx",
"E/home/test-spa/src/components/modals/add-user/tests/integration/index.test.tsx",
"F/home/test-spa/src/components/modals/add-admin/tests/integration/index.test.tsx",
"G/home/test-spa/src/components/modals/edit-user/tests/index.test.tsx",
"H/home/test-spa/src/components/modals/change-user/tests/index.test.tsx",
"IX/home/test-spa/src/other-directory/modals/tests/index.test.ts",
"JX/home/test-spa/src/directory/modals/tests/index.test.ts",
"K/home/test-spa/src/components/modals/change-user/tests/index.test.tsx",
"L/home/test-spa/src/components/modals/add-admin/tests/integration/index.test.tsx"
]
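Where does the "4" come from? Is it a number chosen independently of the data, or is it based on the number of occurrences of "integration" in the relevant strings?
@peak The number is needed because it ultimately matches the desired number of threads to run. In my example, 4 is the preference, but the number can change, and it is not based on the number of occurrences of "integration".
I assume $has can be changed so that the strings are split using the preferred number, rather than splitting each integration test into its own array?
Thanks for clarifying. Please see the update.
@peak - I am proud to see that your solution is basically similar to mine, because it shows that I am finally starting to understand how to do things in jq (at least simple things)! You did it more elegantly, though, which is what I would expect. One thing I do not understand is the difference between our functions and how they are used: you treat the input as a stream, whereas I build an array of arrays, which I then need to flatten before passing it as a filter to the defined function. I guess what I am asking is: does
def foo() ...; [x] | flatten | foo
equal
def foo(s) ...; foo(x)
? And does
def foo() ...; x | foo
equal
def foo(f) ...; foo(x)
? Are these just a question of which one looks better, or are there subtle (or not so subtle) differences that make one usage preferable to the other? Thanks. (Someday I will remember that map(x) and [.[]|x] are the same, and that the former looks much better.)
@JoeCasadonte - Yes, my roundrobin is just a stream-oriented version of your bucket_shift. There is a lot to say here, but in brief, it is best to start with a stream-oriented def and then, if convenient, use it to define an array-oriented version. As for "flatten": it certainly has its place, but it usually limits the applicability of a solution, and as a rule of thumb it is best avoided unless the requirements essentially call for it.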
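The stream-versus-array point in the last comment could be sketched as follows. This assumes a roundrobin(stream; $n) whose body is illustrative only, and redefines bucket_shift as a thin array-oriented wrapper over it; neither definition is taken verbatim from the answers.

```shell
# Sketch only: both definitions below are assumptions made for illustration.
result=$(jq -nc '
  # Stream-oriented: takes its items from the stream argument.
  def roundrobin(stream; $n):
    reduce stream as $s ({i: 0, buckets: []};
      .buckets[.i % $n] += [$s] | .i += 1)
    | .buckets;
  # Array-oriented wrapper: takes its items from the input array.
  def bucket_shift($n): roundrobin(.[]; $n);
  # Two streams passed directly, with no intermediate array and no flatten.
  roundrobin(("a","b"), ("x","y","z"); 2),
  # The same items supplied as a single array.
  (["a","b","x","y","z"] | bucket_shift(2))
')
echo "$result"   # prints [["a","x","z"],["b","y"]] twice, one per line
```

The stream-oriented form accepts items from any generator, while the wrapper only works once everything has been collected into one array, which is why the array version needed flatten in the first place.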
<~> $ cat /tmp/so.jq
def bucket_shift($n):
  # loop through all input, shift each elem into bucket
  reduce .[] as $elem ( { count: 0, rv: [] };
    (.rv[(.count % $n)] += [$elem] | .count += 1))
  | .rv ;

# get rid of everything with directory or other-directory
[ .[] | select(test("directory|other-directory") | not) ]
# grab all lines with "integration" in an array
| [ ([ .[] | select(test("integration")) ]),
# grab all lines without "integration" into a second array
    ([ .[] | select(test("integration") | not) ]) ]
# flatten and divide into buckets (arg passed in)
| flatten | bucket_shift($num_buckets|tonumber)
<~> $ jq --arg num_buckets 4 -f /tmp/so.jq /tmp/so.json
[
[
"A/home/test-spa/src/components/modals/super-admin/tests/integration/index.test.tsx",
"L/home/test-spa/src/components/modals/add-admin/tests/integration/index.test.tsx",
"H/home/test-spa/src/components/modals/change-user/tests/index.test.tsx"
],
[
"C/home/test-spa/src/components/modals/edit-admin/tests/integration/index.test.tsx",
"B/home/test-spa/src/components/modals/delete-user/tests/index.test.tsx",
"K/home/test-spa/src/components/modals/change-user/tests/index.test.tsx"
],
[
"E/home/test-spa/src/components/modals/add-user/tests/integration/index.test.tsx",
"D/home/test-spa/src/components/modals/delete-admin/tests/index.test.tsx"
],
[
"F/home/test-spa/src/components/modals/add-admin/tests/integration/index.test.tsx",
"G/home/test-spa/src/components/modals/edit-user/tests/index.test.tsx"
]
]
<~> $ jq --arg num_buckets 3 -f /tmp/so.jq /tmp/so.json
[
[
"A/home/test-spa/src/components/modals/super-admin/tests/integration/index.test.tsx",
"F/home/test-spa/src/components/modals/add-admin/tests/integration/index.test.tsx",
"D/home/test-spa/src/components/modals/delete-admin/tests/index.test.tsx",
"K/home/test-spa/src/components/modals/change-user/tests/index.test.tsx"
],
[
"C/home/test-spa/src/components/modals/edit-admin/tests/integration/index.test.tsx",
"L/home/test-spa/src/components/modals/add-admin/tests/integration/index.test.tsx",
"G/home/test-spa/src/components/modals/edit-user/tests/index.test.tsx"
],
[
"E/home/test-spa/src/components/modals/add-user/tests/integration/index.test.tsx",
"B/home/test-spa/src/components/modals/delete-user/tests/index.test.tsx",
"H/home/test-spa/src/components/modals/change-user/tests/index.test.tsx"
]
]
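To fall back to 4 buckets when --arg num_buckets is not supplied, the final call can instead be written as:
bucket_shift($ARGS.named["num_buckets"] // 4|tonumber)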
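A quick way to see the fallback in isolation, assuming jq 1.6 or later (where $ARGS is available); the shell variable names are illustrative:

```shell
# With no --arg, $ARGS.named["num_buckets"] is null, so // supplies 4.
fallback=$(jq -n '$ARGS.named["num_buckets"] // 4 | tonumber')
# With --arg, the value arrives as the string "3" and tonumber converts it.
given=$(jq -n --arg num_buckets 3 '$ARGS.named["num_buckets"] // 4 | tonumber')
echo "$fallback $given"   # prints: 4 3
```

Since | has the lowest precedence, the expression groups as ($ARGS.named["num_buckets"] // 4) | tonumber, so tonumber is applied to whichever value is chosen.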