Evenly split an array of strings into sub-arrays using a filter with jq


Suppose I have the following JSON:

[
    "/home/test-spa/src/components/modals/super-admin/tests/integration/index.test.tsx",
    "/home/test-spa/src/components/modals/delete-user/tests/index.test.tsx",
    "/home/test-spa/src/components/modals/edit-admin/tests/integration/index.test.tsx",
    "/home/test-spa/src/components/modals/delete-admin/tests/index.test.tsx",
    "/home/test-spa/src/components/modals/add-user/tests/integration/index.test.tsx",
    "/home/test-spa/src/components/modals/add-admin/tests/integration/index.test.tsx",
    "/home/test-spa/src/components/modals/edit-user/tests/index.test.tsx",
    "/home/test-spa/src/components/modals/change-user/tests/index.test.tsx",
    "/home/test-spa/src/other-directory/modals/tests/index.test.ts",
    "/home/test-spa/src/directory/modals/tests/index.test.ts"
]

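I want to exclude any string that contains directory or other-directory.

Then I want to split the array into 4 arrays, but I want anything with integration in the string to be spread evenly, i.e. I don't want all the integration tests to end up in one array. Any other strings can then be split across the 4 arrays.

I want to use jq to perform this filtering. The code below lets me split the JSON into 4 parts, but it does not do the filtering described above: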

jq -cM '[_nwise(length / 4 | floor)]'
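So I am looking for output in which the integration tests are split as evenly as possible across the arrays; the other strings can fill the arrays evenly, and the order does not matter.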
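As a quick check of the attempt in the question: _nwise(k) is an undocumented jq helper that cuts an array into chunks of k elements, so the number of chunks is not fixed at 4. The sketch below uses ten placeholder numbers in place of the ten paths. The ceiling variant is only an assumption about one way to cap the chunk count at 4; it is not something the question proposed, and it still does nothing to balance the integration tests.

```shell
# 10 items, chunk size floor(10 / 4) = 2, which yields 5 chunks rather than 4.
floor_chunks=$(jq -nc '[range(10)] | [_nwise(length / 4 | floor)]')
echo "$floor_chunks"   # [[0,1],[2,3],[4,5],[6,7],[8,9]]

# Assumed variant: chunk size ceil(10 / 4) = 3, which yields at most 4 chunks.
ceil_chunks=$(jq -nc '[range(10)] | [_nwise(length / 4 | ceil)]')
echo "$ceil_chunks"    # [[0,1,2],[3,4,5],[6,7,8],[9]]
```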
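If the number of buckets is predetermined

A generic "round-robin" function, roundrobin, can be written so that the distribution of the "has" and "has not" strings is performed efficiently (i.e., without concatenating any arrays).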
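A minimal sketch of what such a function might look like, assuming the two-argument form roundrobin(stream; $n) that the call later in this answer implies. The body of the definition is an assumption rather than the answerer's code, and the sample paths are shortened placeholders, not the original data.

```shell
# Hypothetical sample input with shortened paths (not the original data).
input='["a/integration/1","b/2","c/integration/3","d/4","e/integration/5","f/integration/6","g/7","h/8","x/other-directory/9","y/directory/10"]'

roundrobin_result=$(printf '%s' "$input" | jq -c '
  # Assumed definition: deal the items of `stream` into $n buckets in turn.
  def roundrobin(stream; $n):
    reduce stream as $x ({i: 0, buckets: []};
        .buckets[.i % $n] += [$x] | .i += 1)
    | .buckets;

  # Drop the unwanted paths, then deal the "integration" strings first
  # and the remaining strings afterwards, into a fixed number of buckets.
  map(select(test("(other-)?directory") | not))
  | map(select(index("integration"))) as $has
  | roundrobin($has[], (. - $has)[]; 4)
')
echo "$roundrobin_result"
# [["a/integration/1","b/2"],["c/integration/3","d/4"],["e/integration/5","g/7"],["f/integration/6","h/8"]]
```

Because the "integration" strings enter the stream first, they are dealt across the buckets before any other string is placed, which is what keeps them from piling up in one bucket.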
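If the number of buckets is data-driven

If the number of buckets depends on the number of occurrences of the specified string, then a reasonably efficient solution can be written with the roundrobin filter described above, as follows: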

# First exclude the unwanted elements:
map(select(test("(other-)?directory")|not))
# Form an array of the strings with the specified substring
| map(select(index("integration"))) as $has
# Perform the required round-robin:
| roundrobin( $has[], ((.-$has)[]); $has|length)

Here is what I came up with, to split into N buckets:

def bucket_shift($n):
  # loop through all input, shift each elem into bucket
  reduce .[] as $elem
    ( { count: 0, rv: [] };
      (.rv[(.count % $n)] += [$elem] | .count += 1))
  | .rv
;

# get rid of everything with directory or other-directory
[ .[] | select(test("directory|other-directory") | not) ]
# grab all lines with "integration" in an array
| [ ([ .[] | select(test("integration")) ]),
# grab all lines without "integration" into a second array
    ([ .[] | select(test("integration") | not) ]) ]
# flatten and divide into buckets (arg passed in)
| flatten
| bucket_shift($num_buckets|tonumber)
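I labeled each line of the input to make the lines easier to track, and added two extra lines so that the number of results is not evenly divisible by the number of buckets you want, to make sure it still balances well. Lines I and J should be filtered out.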

<~> $ jq . /tmp/so.json
[
  "A/home/test-spa/src/components/modals/super-admin/tests/integration/index.test.tsx",
  "B/home/test-spa/src/components/modals/delete-user/tests/index.test.tsx",
  "C/home/test-spa/src/components/modals/edit-admin/tests/integration/index.test.tsx",
  "D/home/test-spa/src/components/modals/delete-admin/tests/index.test.tsx",
  "E/home/test-spa/src/components/modals/add-user/tests/integration/index.test.tsx",
  "F/home/test-spa/src/components/modals/add-admin/tests/integration/index.test.tsx",
  "G/home/test-spa/src/components/modals/edit-user/tests/index.test.tsx",
  "H/home/test-spa/src/components/modals/change-user/tests/index.test.tsx",
  "IX/home/test-spa/src/other-directory/modals/tests/index.test.ts",
  "JX/home/test-spa/src/directory/modals/tests/index.test.ts",
  "K/home/test-spa/src/components/modals/change-user/tests/index.test.tsx",
  "L/home/test-spa/src/components/modals/add-admin/tests/integration/index.test.tsx"
]


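Where does the "4" come from? Is it a number chosen independently of the data, or is it based on the number of occurrences of "integration" in the relevant strings?

@peak The number is required because it ultimately lines up with the desired number of threads to run. In my case 4 is the preference, but the number can change, and it is not based on the number of occurrences of integration.

I assume $has can be changed to use the preferred number when splitting the strings, rather than splitting each integration into its own array.

Thanks for clarifying. Please see the update.

@peak - I'm proud to see that your solution is basically similar to mine, since it shows I'm finally starting to understand how to do things in jq (the simple things, at least)! Yours is more elegant, though, which is what I would expect. There is one thing I don't understand, however: what is the difference between our functions and the way they are used? You treat the input as a stream, whereas I build an array of arrays, which I then need to flatten before passing it to the defined function as a filter. I guess what I'm asking is: does
def foo: ...; [x] | flatten | foo
equal
def foo(s): ...; foo(x)
? And does
def foo: ...; x | foo
equal
def foo(f): ...; foo(x)
? Is it all a matter of which looks better, or are there subtle (or not so subtle) differences that favor one usage over the other? Thanks! (Someday I'll remember that
map(x)
and
[.[]|x]
are the same, and that the former looks much better.)

@JoeCasadonte - Yes, my roundrobin is just a stream-oriented version of your bucket_shift. There is a lot that could be said here, but in brief, it is best to start with a stream-oriented def and then, if convenient, use it to define an array-oriented version. As for "flatten" - it certainly has its place, but it often limits the applicability of a solution, so as a rule of thumb it is best avoided unless the requirements inherently call for it.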
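The advice in the last comment can be illustrated with a small sketch: a stream-oriented def accepts any generator expression, and an array-oriented version can be written as a thin wrapper around it. The body of roundrobin here is an assumed definition (only its name and its use with two arguments come from the discussion), and bucket_shift is redefined as a hypothetical wrapper purely for comparison.

```shell
stream_vs_array=$(jq -nc '
  # Stream-oriented (assumed definition): the first argument is any generator.
  def roundrobin(stream; $n):
    reduce stream as $x ({i: 0, buckets: []};
        .buckets[.i % $n] += [$x] | .i += 1)
    | .buckets;

  # Array-oriented wrapper, defined in terms of the stream version.
  def bucket_shift($n): roundrobin(.[]; $n);

  [ # Both styles give the same buckets for the same items ...
    ((["a","b","c","d","e"] | bucket_shift(2)) == roundrobin("a","b","c","d","e"; 2)),
    # ... but the stream version also works without building an array first.
    roundrobin(range(5); 2) ]
')
echo "$stream_vs_array"   # [true,[[0,2,4],[1,3]]]
```

With the stream-oriented form, the caller decides how items are produced, so no intermediate array of arrays and no flatten step are needed.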
The bucket_shift program shown earlier is saved as /tmp/so.jq and run with the number of buckets passed in via --arg num_buckets:

<~> $ jq --arg num_buckets 4 -f /tmp/so.jq /tmp/so.json
[
  [
    "A/home/test-spa/src/components/modals/super-admin/tests/integration/index.test.tsx",
    "L/home/test-spa/src/components/modals/add-admin/tests/integration/index.test.tsx",
    "H/home/test-spa/src/components/modals/change-user/tests/index.test.tsx"
  ],
  [
    "C/home/test-spa/src/components/modals/edit-admin/tests/integration/index.test.tsx",
    "B/home/test-spa/src/components/modals/delete-user/tests/index.test.tsx",
    "K/home/test-spa/src/components/modals/change-user/tests/index.test.tsx"
  ],
  [
    "E/home/test-spa/src/components/modals/add-user/tests/integration/index.test.tsx",
    "D/home/test-spa/src/components/modals/delete-admin/tests/index.test.tsx"
  ],
  [
    "F/home/test-spa/src/components/modals/add-admin/tests/integration/index.test.tsx",
    "G/home/test-spa/src/components/modals/edit-user/tests/index.test.tsx"
  ]
]

<~> $ jq --arg num_buckets 3 -f /tmp/so.jq /tmp/so.json
[
  [
    "A/home/test-spa/src/components/modals/super-admin/tests/integration/index.test.tsx",
    "F/home/test-spa/src/components/modals/add-admin/tests/integration/index.test.tsx",
    "D/home/test-spa/src/components/modals/delete-admin/tests/index.test.tsx",
    "K/home/test-spa/src/components/modals/change-user/tests/index.test.tsx"
  ],
  [
    "C/home/test-spa/src/components/modals/edit-admin/tests/integration/index.test.tsx",
    "L/home/test-spa/src/components/modals/add-admin/tests/integration/index.test.tsx",
    "G/home/test-spa/src/components/modals/edit-user/tests/index.test.tsx"
  ],
  [
    "E/home/test-spa/src/components/modals/add-user/tests/integration/index.test.tsx",
    "B/home/test-spa/src/components/modals/delete-user/tests/index.test.tsx",
    "H/home/test-spa/src/components/modals/change-user/tests/index.test.tsx"
  ]
]

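To fall back to 4 buckets when --arg num_buckets is not given, the last line of the program can instead read:

bucket_shift($ARGS.named["num_buckets"] // 4|tonumber)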
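A short check of how that expression behaves on its own, assuming jq 1.6 or later (where $ARGS is available). Referring to $num_buckets directly would fail to compile when the argument is not supplied, whereas a lookup in $ARGS.named simply yields null, which the // operator replaces with 4.

```shell
# No --arg given: the lookup is null, so the default 4 is used.
default_buckets=$(jq -n '$ARGS.named["num_buckets"] // 4 | tonumber')

# --arg passes the value as the string "3"; tonumber converts it.
given_buckets=$(jq -n --arg num_buckets 3 '$ARGS.named["num_buckets"] // 4 | tonumber')

echo "$default_buckets $given_buckets"   # 4 3
```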