Grep: one pattern works, but the other doesn't

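I have a tab-delimited file with gene names in one column and the expression values of those genes in another. I want to use grep to remove certain genes from this file, so that this,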

"42261" "SNHG7" "20.2678"
"42262" "SNHG8" "25.3981"
"42263" "SNHG9" "0.488534"
"42264" "SNIP1" "7.35454"
"42265" "SNN"   "2.05365"
"42266" "snoMBII-202"   "0"
"42267" "snoMBII-202"   "0"
"42268" "snoMe28S-Am2634"   "0"
"42269" "snoMe28S-Am2634"   "0"
"42270" "snoR26"    "0"
"42271" "SNORA1"    "0"
"42272" "SNORA1"    "0"
becomes this:

"42261" "SNHG7" "20.2678"
"42262" "SNHG8" "25.3981"
"42263" "SNHG9" "0.488534"
"42264" "SNIP1" "7.35454"
"42265" "SNN"   "2.05365"
I used the following command, which I pieced together with my limited knowledge of the terminal:

grep -iv sno* <input.text> | grep -iv rp* | grep -iv U6* | grep -iv 7SK* > <output.txt>
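With this command, my output file is missing the genes that start with sno, u6 and 7sk, but somehow grep removed every gene that starts with "r" instead of only those that start with "rp". I am quite confused by this. Any idea why sno* works and rp* doesn't?

Thanks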

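Although this doesn't directly answer your question, there is one thing in your sample command line that you may want to be careful about: whenever you use a special shell metacharacter (such as "*"), you need to escape or quote it. So your command line should look more like: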

grep -iv 'sno*' <input.text> | grep -iv 'rp*' | grep -iv 'U6*' | grep -iv '7SK*' > <output.txt>
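Usually shells are smart: if no file matches the glob, they use the text as-is (so if you type "foo*" but there is no file name starting with "foo", the string "foo*" is passed to the command).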
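The quoting point can be checked directly. The following is only a small sketch, assuming a POSIX-like shell with default globbing settings; the scratch directory and the file name snoRNA_notes.txt are made up for the demonstration.

```shell
# Work in a scratch directory so the demo does not touch real files.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Hypothetical file whose name happens to match the glob sno*.
touch snoRNA_notes.txt

# Unquoted: the shell replaces sno* with the matching file name
# before the command ever sees it.
unquoted=$(printf '%s\n' sno*)

# Quoted: the pattern reaches the command unchanged.
quoted=$(printf '%s\n' 'sno*')

echo "unquoted -> $unquoted"   # unquoted -> snoRNA_notes.txt
echo "quoted   -> $quoted"     # quoted   -> sno*
```

With grep in place of printf, the unquoted form would search for the file name rather than for the pattern, which is why quoting is the safer habit.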

Test:

kent$  cat b
"42261" "SNHG7" "20.2678"
"42262" "SNHG8" "25.3981"
"42263" "SNHG9" "0.488534"
"42264" "SNIP1" "7.35454"
"42265" "SNN"   "2.05365"
"42266" "snoMBII-202"   "0"
"42267" "snoMBII-202"   "0"
"42268" "snoMe28S-Am2634"   "0"
"42269" "snoMe28S-Am2634"   "0"
"42270" "snoR26"    "0"
"42271" "SNORA1"    "0"
"42272" "SNORA1"    "0"

kent$  grep -iEv "sno|rp|U6|7SK" b
"42261" "SNHG7" "20.2678"
"42262" "SNHG8" "25.3981"
"42263" "SNHG9" "0.488534"
"42264" "SNIP1" "7.35454"
"42265" "SNN"   "2.05365"

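The grep command uses regular expressions, not glob patterns.

The pattern 'rp*' means "'r' followed by zero or more 'p'", so it matches every line that contains an "r", and with -v all of those lines are dropped. What you really want is 'rp.*', or better yet, '"rp.*' with the opening double quote of the field included (or just '"rp'; there is no point in trying to match anything after the "rp"). Likewise, 'sno*' means "'sn' followed by zero or more 'o'". Again, you would want 'sno.*', '"sno.*' (or even just '"sno').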
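To see the difference between the two readings of the pattern, here is a short sketch. The sample rows (including the gene names RPL10, RAB1A and BRCA1 and their values) are invented for illustration and are not from the question's file; the helper function names is likewise hypothetical. Note that '"rp' matches the opening quote of any column, not only the gene-name column.

```shell
# Build a small tab-delimited sample in the same quoted layout.
sample=$(mktemp)
printf '"%s"\t"%s"\t"%s"\n' \
  1 SNHG7 20.2678 \
  2 snoR26 0 \
  3 RPL10 5.1 \
  4 RAB1A 3.2 \
  5 BRCA1 1.0 > "$sample"

# Helper: reduce grep output to a comma-separated list of gene names.
names() { cut -f2 | tr -d '"' | paste -sd, -; }

# 'rp*' is "r followed by zero or more p": any line containing r or R is dropped.
loose=$(grep -iv 'rp*' "$sample" | names)

# '"rp' requires the opening quote right before rp: only RPL10 is dropped.
anchored=$(grep -iv '"rp' "$sample" | names)

# All four prefixes in a single extended regular expression.
combined=$(grep -iEv '"(sno|rp|U6|7SK)' "$sample" | names)

echo "loose:    $loose"      # loose:    SNHG7
echo "anchored: $anchored"   # anchored: SNHG7,snoR26,RAB1A,BRCA1
echo "combined: $combined"   # combined: SNHG7,RAB1A,BRCA1
```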

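Thanks Adam, I will definitely keep that in mind, but even when I use quotes I still lose the genes that start with r (or any other gene with an r in it).
Can you paste some example input and the expected output here?
Done!
You should consider it.
Is it true that you only need the rows where the third column == 0?
Oh, no, it is just a coincidence that the rows starting with sno have 0 in the third column. I want to filter based on the second column.
Thanks KAK, that seems to be the culprit. I need to learn more about regular expressions.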
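Since the goal stated in the comments is to filter on the second column only, one possible alternative is awk rather than grep. This is a sketch under assumptions, not something proposed in the thread: it assumes tab-separated fields wrapped in double quotes, and the sample rows (SERPINA1, RPL10, a gene named U6 and their values) are invented for illustration.

```shell
# Hypothetical sample; SERPINA1 contains "rp" in the middle of its name.
sample=$(mktemp)
printf '"%s"\t"%s"\t"%s"\n' \
  1 SNHG7 20.2678 \
  2 snoR26 0 \
  3 RPL10 5.1 \
  4 SERPINA1 2.4 \
  5 U6 0 > "$sample"

# Helper: reduce output to a comma-separated list of gene names.
names() { cut -f2 | tr -d '"' | paste -sd, -; }

# Unanchored alternation: matches anywhere in the line, so SERPINA1 is dropped too.
unanchored=$(grep -iEv 'sno|rp|U6|7SK' "$sample" | names)

# awk: strip the quotes from column 2, lowercase it, and test only its prefix.
kept=$(awk -F'\t' '{
  name = $2
  gsub(/"/, "", name)
  if (tolower(name) !~ /^(sno|rp|u6|7sk)/) print
}' "$sample" | names)

echo "unanchored grep: $unanchored"   # unanchored grep: SNHG7
echo "awk on column 2: $kept"         # awk on column 2: SNHG7,SERPINA1
```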