How to extract text from JSON involving a parent's sibling and a substring?
Tags: json, substring, jq, export-to-pdf, pubmed
Using the jq command above, I was able to extract the following information from the URL above:
377 387
562 579
584 602
659 676
681 699
919 936
941 959
1032 1049
1054 1072
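What I actually need, however, is the output shown below. The last column is simply the substring of text that runs from begin+1 to end (assuming the characters of text are indexed starting at 1).
I don't know how to extract this using jq alone, because it means reaching a sibling of the parent element (sourceid) and taking a substring of another sibling of the parent (text). Can anyone show me how to produce output in this format? Thanks.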
32818866 377 387 silica gel
32818866 562 579 7-methoxycoumarin
32818866 584 602 8-prenylkaempferol
32818866 659 676 7-methoxycoumarin
32818866 681 699 8-prenylkaempferol
32818866 919 936 7-methoxycoumarin
32818866 941 959 8-prenylkaempferol
32818866 1032 1049 7-methoxycoumarin
32818866 1054 1072 8-prenylkaempferol
The JSON text is included here so that the post is complete:
[
{
"project": "BERN",
"sourcedb": "PubMed",
"sourceid": "32818866",
"text": "Identification of two bitter components in Zanthoxylum bungeanum Maxim. and exploration of their bitter taste mechanism through receptor hTAS2R14. Bitterness is an inherent organoleptic characteristic affecting the flavor of Zanthoxylum bungeanum Maxim. In this study, the vital bitter components of Z. bungeanum were concentrated through solvent extraction, sensory analysis, silica gel chromatography, and thin-layer chromatographic techniques and subsequently identified by UPLC-Q-TOF-MS. Two components with the highest bitterness intensities (BIs), such as 7-methoxycoumarin and 8-prenylkaempferol were selected. The bitter taste perceived thresholds of 7-methoxycoumarin and 8-prenylkaempferol were 0.062 mmol/L and 0.022 mmol/L, respectively. Moreover, the correlation between the contents of the two bitter components and the BIs of Z. bungeanum were proved. The results of siRNA and flow cytometry showed that 7-methoxycoumarin and 8-prenylkaempferol could activate the bitter receptor hTAS2R14. The results concluded that 7-methoxycoumarin and 8-prenylkaempferol contribute to the bitter taste of Z. bungeanum.",
"denotations": [
{
"id": [
"NCBI:txid328401"
],
"span": {
"begin": 43,
"end": 64
},
"obj": "species"
},
{
"id": [
"CUI-less"
],
"span": {
"begin": 128,
"end": 145
},
"obj": "gene"
},
{
"id": [
"NCBI:txid328401"
],
"span": {
"begin": 225,
"end": 246
},
"obj": "species"
},
{
"id": [
"NCBI:txid328401"
],
"span": {
"begin": 300,
"end": 312
},
"obj": "species"
},
{
"id": [
"MESH:D058428",
"BERN:315272203"
],
"span": {
"begin": 377,
"end": 387
},
"obj": "drug"
},
{
"id": [
"CHEBI:5679",
"BERN:4597103"
],
"span": {
"begin": 562,
"end": 579
},
"obj": "drug"
},
{
"id": [
"MESH:C532177",
"BERN:280529003"
],
"span": {
"begin": 584,
"end": 602
},
"obj": "drug"
},
{
"id": [
"CHEBI:5679",
"BERN:4597103"
],
"span": {
"begin": 659,
"end": 676
},
"obj": "drug"
},
{
"id": [
"MESH:C532177",
"BERN:280529003"
],
"span": {
"begin": 681,
"end": 699
},
"obj": "drug"
},
{
"id": [
"NCBI:txid328401"
],
"span": {
"begin": 841,
"end": 853
},
"obj": "species"
},
{
"id": [
"CHEBI:5679",
"BERN:4597103"
],
"span": {
"begin": 919,
"end": 936
},
"obj": "drug"
},
{
"id": [
"MESH:C532177",
"BERN:280529003"
],
"span": {
"begin": 941,
"end": 959
},
"obj": "drug"
},
{
"id": [
"CUI-less"
],
"span": {
"begin": 979,
"end": 994
},
"obj": "gene"
},
{
"id": [
"CUI-less"
],
"span": {
"begin": 995,
"end": 1003
},
"obj": "gene"
},
{
"id": [
"CHEBI:5679",
"BERN:4597103"
],
"span": {
"begin": 1032,
"end": 1049
},
"obj": "drug"
},
{
"id": [
"MESH:C532177",
"BERN:280529003"
],
"span": {
"begin": 1054,
"end": 1072
},
"obj": "drug"
},
{
"id": [
"NCBI:txid328401"
],
"span": {
"begin": 1107,
"end": 1119
},
"obj": "species"
}
],
"timestamp": "Wed Oct 28 21:43:04 +0000 2020",
"logits": {
"disease": [],
"gene": [
[
{
"start": 128,
"end": 145,
"id": "CUI-less"
},
0.7066106796264648
],
[
{
"start": 979,
"end": 994,
"id": "CUI-less"
},
0.9999749660491943
],
[
{
"start": 995,
"end": 1003,
"id": "CUI-less"
},
0.9052715301513672
]
],
"drug": [
[
{
"start": 377,
"end": 387,
"id": "MESH:D058428\tBERN:315272203"
},
0.999982476234436
],
[
{
"start": 562,
"end": 579,
"id": "CHEBI:5679\tBERN:4597103"
},
0.9999980926513672
],
[
{
"start": 584,
"end": 602,
"id": "MESH:C532177\tBERN:280529003"
},
0.9999980926513672
],
[
{
"start": 659,
"end": 676,
"id": "CHEBI:5679\tBERN:4597103"
},
0.9999980926513672
],
[
{
"start": 681,
"end": 699,
"id": "MESH:C532177\tBERN:280529003"
},
0.9999980330467224
],
[
{
"start": 919,
"end": 936,
"id": "CHEBI:5679\tBERN:4597103"
},
0.9999980926513672
],
[
{
"start": 941,
"end": 959,
"id": "MESH:C532177\tBERN:280529003"
},
0.9999980926513672
],
[
{
"start": 1032,
"end": 1049,
"id": "CHEBI:5679\tBERN:4597103"
},
0.9999980926513672
],
[
{
"start": 1054,
"end": 1072,
"id": "MESH:C532177\tBERN:280529003"
},
0.9999980926513672
]
],
"species": [
[
{
"start": 43,
"end": 64,
"id": "NCBI:txid328401"
},
0.9999997615814209
],
[
{
"start": 225,
"end": 246,
"id": "NCBI:txid328401"
},
0.9999998211860657
],
[
{
"start": 300,
"end": 312,
"id": "NCBI:txid328401"
},
0.9999998211860657
],
[
{
"start": 841,
"end": 853,
"id": "NCBI:txid328401"
},
0.9999998211860657
],
[
{
"start": 1107,
"end": 1119,
"id": "NCBI:txid328401"
},
0.9999998211860657
]
]
},
"elapsed_time": {
"tmtool": 0.991,
"ner": 0.453,
"normalization": 0.172,
"total": 1.617
}
}
]
Assuming the first column of the desired output is "sourceid", your solution can be adapted as follows:
.[]
| .sourceid as $id
| .text as $text
| .denotations[]
| select(.obj=="drug")
| .span
| [$id, .begin, .end, $text[.begin : .end] ]
| @tsv
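The filter works because `$id` and `$text` are captured at the record level before descending into `.denotations[]`, and because the slice `$text[.begin : .end]` is 0-based with an exclusive end, which selects the same characters as "begin+1 to end" in 1-based, inclusive terms. As a cross-check of that logic, here is a minimal Python sketch of the same steps. The sample record, its text, and its offsets are made up for illustration (they are not taken from the PubMed data above), and `drug_rows` and `to_tsv` are hypothetical helper names.

```python
# Hypothetical miniature record with the same shape as the JSON above;
# the sourceid, text, and offsets are invented for illustration only.
SAMPLE = [
    {
        "sourceid": "12345",
        "text": "Use silica gel and 7-methoxycoumarin.",
        "denotations": [
            {"id": ["X:1"], "span": {"begin": 0, "end": 3}, "obj": "species"},
            {"id": ["X:2"], "span": {"begin": 4, "end": 14}, "obj": "drug"},
            {"id": ["X:3"], "span": {"begin": 19, "end": 36}, "obj": "drug"},
        ],
    }
]


def drug_rows(records):
    """Yield (sourceid, begin, end, substring) for each "drug" denotation."""
    for record in records:
        # Keep the record-level fields in scope, like `as $id` / `as $text`.
        source_id = record["sourceid"]
        text = record["text"]
        for denotation in record["denotations"]:
            if denotation["obj"] != "drug":
                continue  # same role as select(.obj=="drug")
            begin = denotation["span"]["begin"]
            end = denotation["span"]["end"]
            # 0-based, end-exclusive slice: the same characters as
            # positions begin+1..end when counting from 1.
            yield source_id, begin, end, text[begin:end]


def to_tsv(rows):
    """Join rows into tab-separated lines (no escaping, unlike jq's @tsv)."""
    return "\n".join("\t".join(str(field) for field in row) for row in rows)


if __name__ == "__main__":
    print(to_tsv(drug_rows(SAMPLE)))
```

When running the jq filter itself, the `-r` (`--raw-output`) option prints the `@tsv` result as plain text rather than as a JSON-quoted string.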
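@peak I added the JSON text to make the post complete.
Thanks, that is clear now. For future reference, though, please remember that the "m" stands for "minimal" :-)
OK, please see the revised answer.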