Removing inline comments with sed
I want to use sed to remove all comments from a text file. Assume a comment starts with the "%" character and ends at the newline character. I want to delete everything from the "%" to the end of the line, including the newline character. However, I do not want to delete comments that start with "%%".
Sample input:
%% comment to do not delete
% comment to delete
% another comment to delte
%% comment to do not delete
Some text % comment to delete
and some more text %% comment to do not delete
Expected output:
%% comment to do not delete
%% comment to do not delete
Some text and some more text %% comment to do not delete
Try this:
$ perl -pe '/^[^%]*%%/ && next; s/%.*\n//g' file.txt
Note
If you need to change the file in place, add the -i switch (after testing).
Maybe this (second update):
$ sed -e '/^%[^%]/d' -e 's/ %[^%]*$/@/' -e :a -e '/@/N; s/\n//; ta' input | sed 's/@/ /g'
%% comment to do not delete
%% comment to do not delete
Some text and some more text %% comment to do not delete
Edited to add a change so that it also works on the last line of the file...
Try:
Tested with this input:
%% comment to do not delete
% comment to delete
% another comment to delte
%
%% comment to do not delete
Some text % comment to delete
Some more text % more comment to delete
and some more text %% comment to do not delete
fdgdfgdgdgd %
gfdgd
some text followed by %% comment to not delete that contains a % somewhere
some text followed by % comment to delete that contains %% somewhere
hello there
Output:
%% comment to do not delete
%% comment to do not delete
Some text Some more text and some more text %% comment to do not delete
fdgdfgdgdgd gfdgd
some text followed by %% comment to not delete that contains a % somewhere
some text followed by hello there
Use Expression Ordering with Sed
With sed, the order of the instructions can matter. For example:
$ sed -ne '/^% /d; /[^%]%.*/ {s/%.*//; n}; p' /tmp/corpus
%% comment to do not delete
%% comment to do not delete
and some more text %% comment to do not delete
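In this example, the sed script performs its tasks in the following order:
- Delete every line that starts with a single % followed by a space.
- On a line where a % follows a non-% character, strip everything from the first % to the end of the line, then read the next input line into the pattern space.
- Print the pattern space.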
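This script works with the corpus you provided in your question. It is not guaranteed to work with any other corpus without modification, and it explicitly does not work if the next line read into the pattern space contains comment characters.
A perfect application for Perl's negative lookbehind (and lookahead) assertions: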
perl -pe 's/(?<!%)%(?!%).*$//s' << END
%% comment to do not delete
% comment to delete
% another comment to delte
%% comment to do not delete
Some text % comment to delete
and some more text %% comment to do not delete
END
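The s flag makes the dot match the newline as well, which implements the requested "including the newline character" behavior.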
This kind of regular expression matching can cause problems, for example if you have a line like
The date is `date +%Y%m%d` % this is a comment
you will end up with
The date is `date +
If your actual comments require whitespace around the comment character, you can use this regular expression:
(^| )%( .*|)$
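That is:
- the beginning of the line or a space
- followed by the comment character
- followed by either (a space and zero or more characters) or nothing
- and then the end of the line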
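It removed all comments, even those starting with "%%", and it did not remove the newline character.
This one-liner seems to work fine. However, I would prefer a sed version. Thanks a lot.
+1 Nice and simple. By using /^[^%]*%%/ instead of /%%/, it will handle comments that need to be deleted but contain %% (the second-to-last line of my test input).
It does not work for more complex input. See the edited question for an example. Thanks.
@MichalPietras - how about now? Do you need the lines joined, rather than having a comment at the start of the next line? Apart from that, my sed gives…
Yes, I also need to remove the newline character. If a comment starts at the beginning of a line, the whole line should be deleted. Otherwise, the line should be joined with the next one. Just like in the example.
It deletes too much on my computer. Maybe I have a different version of sed. I use OS X.
Looks good at first glance, but it fails on the last line of this example: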