How do I use a newline in an s/// substitution in Perl?


The input file contains multiple consecutive newlines and empty tags, as shown below:

<html>
<body>
<title>XXX</title>
<p>text...</p>
<collaboration seq="">
<ce:text></ce:text>
</collaboration>
...


<p>text</p>
<collaboration seq="">
<ce:text>AAA</ce:text>
</collaboration>
<p>text</p>
</body>
</html>



The output file should contain only single newlines, and the empty tags must be removed:

<html>
<body>
<title>XXX</title>
<p>text...</p>
...
<p>text</p>  
<p>text</p>
<collaboration seq="">
<ce:text>AAA</ce:text>
</collaboration>
</body>
</html>



Code I have tried:

print "Enter the file name without extension: ";
chomp($filename=<STDIN>);
open(RED,"$filename.txt") || die "Could not open TXT file";
open(WRIT,">$filename.html");
while(<RED>)
{
  #process in file
  s/<collaboration seq="">\n<ce:text><\/ce:text>\n<\/collaboration>//g;
  s/\n\n//g;
  print WRIT $_;
}
close(RED);
close(WRIT);


The code above does not remove any of the content that needs removing. How can I fix this?

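First, you should actually read the whole file in, rather than processing it one line at a time. The substitution snippet further down assumes the contents are in a single string, $file.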
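One common way to read a whole file into a single string is to undefine the input record separator $/ locally. This is only a minimal sketch, not necessarily what the answerer used: the helper name slurp_file and the file name input.txt are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the entire contents of a file as one string.
sub slurp_file {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Could not open $path: $!";
    # With $/ undefined, a single read returns everything up to end of file.
    local $/;
    my $content = <$fh>;
    close($fh);
    return $content;
}

# Usage (the file name is an assumption):
# my $file = slurp_file('input.txt');
```

Because $/ is changed with local, the default line-by-line behaviour is restored as soon as the function returns.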

You can use XML::Simple to do this:

use XML::Simple;

# use XML::Simple to process the XML
my $xs = XML::Simple->new(
      # remove extra whitespace
      NormaliseSpace => 2,
      # keep root element
      KeepRoot       => 1,
      # force elements to arrays
      ForceArray     => 1,
      # ignore empty elements
      SuppressEmpty  => 1
);
# read in the XML
my $ref = $xs->XMLin($xml);


# print out the XML minus the empty tags
print $xs->XMLout($ref);

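Your script runs line by line, so it cannot match across multiple lines. If you have a string containing the whole text, you need the /m flag for multiline matching.

@Jens: /m only changes what ^ and $ match.

Don't parse HTML/XML with regular expressions.

@Patrick J. S.: I generally agree with that, since Perl does have a lot of good XML extensions. However, that is a separate issue, and it would only unnecessarily obscure what should be a simple answer here.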
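To make the point about /m concrete, here is a small sketch (the sample string is made up). A literal \n in a pattern matches a newline in the target string with or without /m; the modifier only lets ^ and $ match at the start and end of each line inside the string. The line-by-line loop fails simply because each chunk it sees ends at its first newline.

```perl
use strict;
use warnings;

my $text = "<a>\n</a>\nrest\n";

# \n in the pattern matches across lines without any modifier.
my $plain = ($text =~ /<a>\n<\/a>/) ? 1 : 0;

# Without /m, ^ matches only at the very start of the string.
my $anchored = ($text =~ /^rest/) ? 1 : 0;

# With /m, ^ also matches right after each embedded newline.
my $anchored_m = ($text =~ /^rest/m) ? 1 : 0;

# Processed one line at a time, no single chunk contains both tags.
my $per_line = 0;
for my $line (split /(?<=\n)/, $text) {
    $per_line++ if $line =~ /<a>\n<\/a>/;
}

print "plain=$plain anchored=$anchored anchored_m=$anchored_m per_line=$per_line\n";
# prints: plain=1 anchored=0 anchored_m=1 per_line=0
```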

# process the file contents ($file holds the whole file as one string)
$file =~ s/<collaboration seq="">\R<ce:text><\/ce:text>\R<\/collaboration>//g;
$file =~ s/\R{2,}/\n/g; # I'm guessing this is probably what you intended
print WRIT $file;
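As a check, the two substitutions can be run against a string shaped like the sample in the question. This is a sketch under assumptions: the function name clean_markup is made up, the input is an abbreviated copy of the question's sample, and \R needs Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical helper: drop the empty <collaboration> block, then
# collapse runs of line breaks into a single newline.
sub clean_markup {
    my ($text) = @_;
    $text =~ s/<collaboration seq="">\R<ce:text><\/ce:text>\R<\/collaboration>//g;
    $text =~ s/\R{2,}/\n/g;
    return $text;
}

# Abbreviated sample: an empty block followed by extra blank lines.
my $input = join "\n",
    '<p>text...</p>',
    '<collaboration seq="">',
    '<ce:text></ce:text>',
    '</collaboration>',
    '...',
    '',
    '',
    '<p>text</p>',
    '';

print clean_markup($input);
```

A collaboration block whose ce:text element has content does not match the first pattern, so it is left untouched.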
