Bash: delete multiple lines containing a string and concatenate


I am new to Bash/Perl and am trying to delete multiple lines from a text file wherever a string occurs. So far, to delete one line, I have:

perl -ne '/somestring/ or print' /usr/file.txt > /usr/file1.tmp
To remove the second line, I use:

perl -ne '/anotherstring/ or print' /usr/file.txt > /usr/file2.tmp
How do I join file and file2.tmp?

Or, how can I modify the command to delete the multiple lines where somestring and anotherstring occur?

How do I join file and file2.tmp?

This can be done with:

cat file file2.tmp >> file3.tmp
or, if by file you mean file1.tmp:

cat file1.tmp file2.tmp >> file3.tmp

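However, this is not what you describe in the rest of your question (that is, removing every line in which either of the two patterns occurs). That can be done by chaining your commands: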

perl -ne '/somestring/ or print' /usr/file.txt > /usr/file1.tmp
perl -ne '/anotherstring/ or print' /usr/file1.tmp > /usr/file2.tmp
You can eliminate the intermediate file file1.tmp by using a pipe:

perl -ne '/somestring/ or print' /usr/file.txt | perl -ne '/anotherstring/ or print' > /usr/file2.tmp
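To see why concatenating the two filtered files differs from chaining the filters, here is a small sketch. It assumes a hypothetical four-line sample file in a temporary directory, and it uses sed '/pattern/d' as a stand-in for the perl filters (both print only the lines that do not match the pattern).

```shell
#!/bin/sh
# Hypothetical sample data in a throwaway directory (not from the question).
workdir=$(mktemp -d)
printf '%s\n' 'keep me' 'has somestring' 'has anotherstring' 'keep me too' > "$workdir/file.txt"

# Filter each pattern separately, then concatenate the two results.
sed '/somestring/d' "$workdir/file.txt" > "$workdir/file1.tmp"
sed '/anotherstring/d' "$workdir/file.txt" > "$workdir/file2.tmp"
cat "$workdir/file1.tmp" "$workdir/file2.tmp" > "$workdir/joined.tmp"

# Chain the filters instead: the second filter reads the output of the first.
sed '/somestring/d' "$workdir/file.txt" | sed '/anotherstring/d' > "$workdir/chained.tmp"

# Count lines in each result (tr strips the padding some wc versions add).
joined_count=$(wc -l < "$workdir/joined.tmp" | tr -d ' ')
chained_count=$(wc -l < "$workdir/chained.tmp" | tr -d ' ')
echo "joined: $joined_count lines, chained: $chained_count lines"
```

On this sample the joined file has 6 lines (both unwanted lines plus each wanted line twice), while the chained result has only the 2 wanted lines.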
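This can also be done with grep (assuming your strings do not use any Perl-specific regular-expression features):

grep -v 'somestring' /usr/file.txt | grep -v 'anotherstring' > /usr/file2.tmp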

Finally, you can combine the filtering into a single command/regex:

perl -ne '/somestring|anotherstring/ or print' /usr/file.txt > /usr/file2.tmp
or, using grep:

grep -v 'somestring\|anotherstring' /usr/file.txt > /usr/file2.tmp
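As far as I know, the \| alternation in a basic regular expression is an extension that not every grep supports, so the command above may not work on every system. A sketch of two other spellings, assuming a hypothetical sample file: repeated -e options, and -E for extended regular expressions.

```shell
#!/bin/sh
# Hypothetical sample data (not from the question).
sample=$(mktemp)
printf '%s\n' 'alpha' 'has somestring' 'beta' 'has anotherstring' > "$sample"

# Several -e patterns: a line is dropped if it matches any of them.
with_e=$(grep -v -e 'somestring' -e 'anotherstring' "$sample")

# Extended regular expressions: | means alternation without a backslash.
with_E=$(grep -vE 'somestring|anotherstring' "$sample")

printf '%s\n' "$with_e"
```

Both variables end up holding the same two lines, alpha and beta.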

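I took an interest in your problem and wrote a highly dynamic Perl program. It finds the lines of any user-defined file that match, or do not match, a set of words, and then prints the requested lines (matching or non-matching) both to the screen and to a new user-defined output file.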

We will analyze this file, iris_dataset.csv:

"Sepal.Length","Sepal.Width","Petal.Length","Petal.Width","Species"
5.1,3.5,1.4,0.2,"setosa"
4.9,3,1.4,0.2,"setosa"
4.8,3,1.4,0.3,"setosa"
5.1,3.8,1.6,0.2,"setosa"
4.6,3.2,1.4,0.2,"setosa"
7,3.2,4.7,1.4,"versicolor"
6.4,3.2,4.5,1.5,"versicolor"
6.9,3.1,4.9,1.5,"versicolor"
6.6,3,4.4,1.4,"versicolor"
5.5,2.4,3.7,1,"versicolor"
6.3,3.3,6,2.5,"virginica"
5.8,2.7,5.1,1.9,"virginica"
7.1,3,5.9,2.1,"virginica"
6.3,2.9,5.6,1.8,"virginica"
5.9,3,5.1,1.8,"virginica"
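This is a comma-separated values file, with columns separated by commas. If you view the file in a spreadsheet, you can see each column more clearly. What we will be looking for is the species, so the possible items to match are "setosa", "versicolor", and "virginica".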

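My program first asks you for the file to read from. In this example it is iris_dataset.csv, although it can be any file. Then you give the name of the file to write to. I called it new_iris.csv, but you can call it anything.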

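Then we tell the program how many items we are looking for. With 3 items I can enter setosa, versicolor, and virginica, in any order. With two, I type only two items, and with one, I type only one of setosa, versicolor, or virginica for this example file.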

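Then we are asked whether we want to keep the lines that match our items or remove them. If we keep the matches, the lines matching those items are printed to the screen and to the output file. If we choose remove, the lines that do not match the items are printed to the screen and to the file. If we choose neither keep nor remove, we get an error message, and the new, empty output file is deleted because it contains nothing.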

#!/usr/bin/env perl
# Program: perl_matching.pl
use strict; # Means that we have to explicitly declare our variables with "my", "our" or "local" as we want their scope defined. 
use warnings; # We want to know if and where problems show up in our program. 
use feature 'say'; # Like print, but with automatic ending newline.
use feature 'switch'; # Perl given:when switch statement. 
no warnings 'experimental'; # Perl has something against switch. 

########### This block of code right here is basically equivalent to a Unix ls command ##############
opendir(DIR, "."); # Opens the current working directory 
my @files = readdir(DIR); # Reads all files in the current working directory into an array @files. 
closedir(DIR); # Now that we have the array of files, we can close our current working directory.
say "Here are the list of files in your current working directory";
foreach(@files){print "$_\t";} # $_ is the default variable for each item in an array.
########### This block is not critical to running the program ####################  

say "\nGive me your filename to read from, extensions and all ..."; # It would be a good idea to have your file in your working directory.
chomp(my $file_read = <STDIN>); # This makes the filename dynamic from user input. 
say "Give me your filename to write to, extensions and all ...";
chomp(my $file_write = <STDIN>); # results will be printed to this file, and standard output. # chomp removes newlines from standard input.

# ' < ' to read from, and '>', to write to ... 
# Opening your file to read from: 
open(my $filehandle_read, '<', $file_read) or die "Problem reading file $file_read because $!";
# Open your file to write to. 
open(my $filehandle_write, '>', $file_write) or die "Problem writing file $file_write because $!";

say "How many matches are you going to give me?";
my $match_num = <STDIN>;
say "Okay give me the matches now, pressing Enter key between each match.";

my $i = 1; # This is our incrementer between matches. 
my $matches; # This is each match presented line by line. 
my @match_list; # This is our array (list) of $matches
while($i <= $match_num)
{
    $matches = <STDIN>; # One match at a time from standard input. 
    push @match_list, $matches; # Pushes all individual $matches into a list @match_list
    $i = $i + 1; # Increase the incrementer by one so this loop doesn't last forever. 
}
chomp(@match_list);

undef($matches); # I am clearing each match, so that I can redefine this variable. 

$matches = join('|', @match_list); # " | " is part of a regular expression which means "or" for each item in this scalar matches. 
say "This is what your redefined matches variable looks like: $matches"; 

say "Now you get a choice for your matches"; 
say "KEEP or REMOVE?"; # if you type Keep (case insensitive) you print only the matches to the new file. If you type Remove (case insensitive) you print only the lines to the newfile which do not contain the matches.  
chomp(my $choice = <STDIN>);

my @lines_all = <$filehandle_read>; # The filehandle contains everything in the file, so we can pull all lines of the file to read into an array, where each item in the array is each line of the file opened for reading. 
close $filehandle_read; # we can now close the filehandle for the file for reading since we just pulled all the information from it. 
# We grep for the matching " =~ " lines of our file to read. 
my @lines_matching = grep{$_ =~ m/$matches/} @lines_all;
# We grep for the non-matching " !~ " lines of our file to read.
# Note: $_ is a default variable for every item in the array.   
my @lines_not_matching = grep{$_ !~ m/$matches/} @lines_all;


# This is a Perl style switch statement.
# Note: A given::when::when::default switch statement 
# is basically equivalent to ...
# an if::elsif::else statement. 

# In this switch statement only one choice is performed,
# which one depends on if you said "Keep" or "Remove" in your choice. 
given($choice)
{
    when($choice =~ m/Keep/i) # "i" is for case-insensitive, so Keep, KEEP, kEeP, etc are valid. 
    {
    say @lines_matching; # Print the matching lines to the screen. 
    print $filehandle_write @lines_matching; # Print the matching lines to the file. 
    close $filehandle_write; # Close the file now that we are done with it. 
    }
    when($choice =~ m/Remove/i) 
    {
    say @lines_not_matching; # Print the lines that do not match to the screen.
    print $filehandle_write @lines_not_matching; # Print the lines that do not match to the file. 
    close $filehandle_write; # Close the file now that we are done with it.
    }
    default 
    {
    say "You must have selected a choice other than Keep or Remove. Don't do that!";
    close $filehandle_write; # Close the file now that we are done with it. 
    unlink($file_write) or warn "Could not unlink file $file_write"; # If you selected neither keep nor remove, we delete the new file to write to as it contains nothing.  
    }
}
As a result, only the lines that do not contain the words setosa or versicolor are printed to our file, new_iris.csv:

"Sepal.Length","Sepal.Width","Petal.Length","Petal.Width","Species"
6.3,3.3,6,2.5,"virginica"
5.8,2.7,5.1,1.9,"virginica"
7.1,3,5.9,2.1,"virginica"
6.3,2.9,5.6,1.8,"virginica"
5.9,3,5.1,1.8,"virginica"
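For comparison, the keep/remove behaviour described above can be sketched non-interactively in shell. This is not the author's program: filter_lines, its argument order, and the shortened three-row sample file are hypothetical. The patterns are joined with | much as the Perl script does with join.

```shell
#!/bin/sh
# filter_lines MODE FILE PATTERN... (hypothetical helper)
# MODE is "keep" (print matching lines) or "remove" (print the others).
filter_lines() {
    mode=$1
    file=$2
    shift 2
    # Join the remaining arguments with "|" to form one ERE alternation.
    pattern=$(IFS='|'; printf '%s' "$*")
    case $mode in
        keep)   grep -E -e "$pattern" "$file" ;;
        remove) grep -vE -e "$pattern" "$file" ;;
        *)      echo "mode must be keep or remove" >&2; return 2 ;;
    esac
}

# Hypothetical, shortened sample rows in the style of iris_dataset.csv.
sample=$(mktemp)
printf '%s\n' '5.1,3.5,"setosa"' '7,3.2,"versicolor"' '6.3,3.3,"virginica"' > "$sample"

removed=$(filter_lines remove "$sample" setosa versicolor)
kept=$(filter_lines keep "$sample" setosa)
printf '%s\n' "$removed"
```

Here removed holds only the virginica row and kept holds only the setosa row. Any mode other than keep or remove returns status 2 without producing output.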
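I absolutely love handling standard input with Perl. You can use my script to print only the lines that contain setosa. (You would ask for just one match.)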

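perl -ne '/somestring|anotherstring/ or print' /usr/file.txt > /usr/file2.tmp, but grep -v is more appropriate here, or egrep -v if you are using regular expressions.

Thanks, but grep -v 'somestring' /usr/file.txt works, whereas grep -v 'somestring|anotherstring' /usr/file.txt does not return any results. That is why I am using Perl.

That is because, by default, grep uses POSIX BRE (basic regular expressions), in which | must be escaped, like this: grep 'somestring\|anotherstring' …. The alternative is to use extended regular expressions (ERE), enabled with the -E flag or (as Chris suggested) with egrep. For example: grep -E 'somestring|anotherstring' ..

@randomir I am using Solaris. I found egrep in /usr/bin/egrep, so your -E solution now works. Thanks for your help.

"It's dynamic" is meaningless. $matches is arguably declared outside its scope. undef $matches is bad style: if you are going to overwrite it on the next line, why do it in the first place? And why use the same variable for two completely different things? Don't use bareword filehandles (in this case, a directory handle). given/when throws warnings for a reason; don't use it in new code, and don't just silence the warnings. The error message for unlink doesn't include $!. I guess what I'm trying to say is that you have written a lot of problematic code that has nothing to do with the question.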
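The difference between basic and extended regular expressions discussed in these comments can be checked on a small example. A sketch, assuming a hypothetical three-line sample file whose third line contains a literal | character, to show what the basic-regex form actually matches:

```shell
#!/bin/sh
# Hypothetical sample data (not from the thread).
sample=$(mktemp)
printf '%s\n' 'has somestring' 'has anotherstring' 'literal somestring|anotherstring' > "$sample"

# Basic regular expressions (the default): an unescaped | is an ordinary
# character, so only the line containing the literal text
# "somestring|anotherstring" matches.
bre_matches=$(grep -c 'somestring|anotherstring' "$sample")

# Extended regular expressions (-E): | separates alternatives,
# so every line containing either string matches.
ere_matches=$(grep -cE 'somestring|anotherstring' "$sample")

echo "BRE: $bre_matches, ERE: $ere_matches"
```

On this sample the basic form counts 1 matching line and the extended form counts 3.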
A sample run of the program:
> perl perl_matching.pl
Here are the list of files in your current working directory
.       ..      iris_dataset.csv        perl_matching.pl
Give me your filename to read from, extensions and all ...
iris_dataset.csv
Give me your filename to write to, extensions and all ...
new_iris.csv
How many matches are you going to give me?
2
Okay give me the matches now, pressing Enter key between each match.
setosa
versicolor
This is what your redefined matches variable looks like: setosa|versicolor
Now you get a choice for your matches
KEEP or REMOVE?
Remove
"Sepal.Length","Sepal.Width","Petal.Length","Petal.Width","Species"
6.3,3.3,6,2.5,"virginica"
5.8,2.7,5.1,1.9,"virginica"
7.1,3,5.9,2.1,"virginica"
6.3,2.9,5.6,1.8,"virginica"
5.9,3,5.1,1.8,"virginica"