Regex: splitting a line into two parts (regex, Perl)


I have a scrape of the track listing for a George Michael DVD on Amazon, stored in $str, and the code below, which processes it by splitting on the leading two digits and the rest:

$str = "20 Fastlove 21 Jesus To A Child 22 Spinning the Wheel 23 Older 24 Outside 25 As (with Mary J. Blige) 26 Freeek! 27 Amazing 28 John and Elvis are Dead 29 Flawless (Go To The City) 30 Shoot The Dog 31 Roxanne 32 An Easier Affair 33 If I Told You That (with Whitney Houston) 34 Waltz Away Dreaming 35 Somebody To Love 36 I Can’t Make You Love Me 37 Star People '97 38 You Have Been Loved 39 Killer/ Papa Was A RollIn Stone 40 Round Here";

while ($str =~ /(\d{2}) (\S+)/g) {
        print "$1 $2\n";
}
Result:

20 Fastlove
21 Jesus
22 Spinning
23 Older
24 Outside
25 As
26 Freeek!
27 Amazing
28 John
29 Flawless
30 Shoot
31 Roxanne
32 An
33 If
34 Waltz
35 Somebody
36 I
37 Star
97 38
39 Killer/
40 Round
The above sort of works, but it does not include the full track names. Any suggestions on how to get the result I want? The result I expect is:

20 Fastlove
21 Jesus To A Child
22 Spinning the Wheel
[etc.]

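As Ignacio Vazquez-Abrams said, numbers in song names are a problem, but this should work for every song except "Star People '97".


Note: I'm not a Perl coder, but the regex works fine on rubular.com (apart from the "'97" case mentioned above).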

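As Ignacio said, this can't be done with 100% accuracy, because track names can contain numbers. But since you can probably assume that the track numbers are consecutive, you can get very close to 100%: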

my $str = "20 Fastlove 21 Jesus To A Child 22 Spinning the Wheel 23 Older 24 Outside 25 As (with Mary J. Blige) 26 Freeek! 27 Amazing 28 John and Elvis are Dead 29 Flawless (Go To The City) 30 Shoot The Dog 31 Roxanne 32 An Easier Affair 33 If I Told You That (with Whitney Houston) 34 Waltz Away Dreaming 35 Somebody To Love 36 I Cant Make You Love Me 37 Star People '97 38 You Have Been Loved 39 Killer/ Papa Was A RollIn Stone 40 Round Here";

my ($track) = ($str =~ /^(\d+)/) or die "No initial track number";

my $next;
while ($next = $track + 1 and
       $str =~ s/^\s*             # optional initial whitespace
                 $track \s+       # track number followed by whitespace
                 (\S.*?)          # title begins with non-whitespace
                 (?= \s+ $next \s # title stops at next track #
                     | $ )        # or end-of-string
                //x) {
  print "$track $1\n";
  $track = $next;
}

die "$str left over" if $str =~ /\S/; # sanity check
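This modifies $str, so make a copy first if you still need the original.


This will fail if a track title contains the next track number, but that should be fairly rare. It will also fail if a track is missing or the track numbers are not consecutive.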
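
To make the first failure mode concrete, here is a minimal sketch. The helper name split_tracks and the sample input are made up for illustration; the helper applies the same consecutive-number idea to a copy of the string.

```perl
use strict;
use warnings;

# Hypothetical helper: applies the consecutive-number idea to a copy of the
# input, so the caller's string is left untouched.
sub split_tracks {
    my ($str) = @_;
    my ($track) = $str =~ /^\s*(\d+)/ or return;   # first track number
    my @out;
    while (1) {
        my $next = $track + 1;
        # Strip "<track> <title>" from the front; the title ends just before
        # the next track number or at the end of the string.
        last unless $str =~ s/^\s*$track\s+(\S.*?)(?=\s+$next\s|\z)//;
        push @out, "$track $1";
        $track = $next;
    }
    return @out;
}

# Hypothetical input: track 1 is titled "Take 2", which contains the next
# track number.
print "$_\n" for split_tracks("1 Take 2 2 Something");
```

With this input the title of track 1 is cut at its embedded "2", so the split comes out as "1 Take" and "2 2 Something" instead of "1 Take 2" and "2 Something". Nothing is left over, so a leftover check alone would not flag it.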

A variation on cjm's answer that scans the input string non-destructively:

if ($str =~ /^(\d+)/) {
    my ($current, $next) = ($1, $1 + 1);
    while ($str =~ /\G *$current ((?:(?! *$next).)+)/g) {
        print "$current $1\n";
        ($current, $next) = ($next, $next + 1);
    }
}

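Your best bet is something like the following. However, even this will run into trouble if one of the tracks contains the next track's number.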

#!/usr/bin/perl

use strict;
use warnings;

my $str = "20 Fastlove 21 Jesus To A Child 22 Spinning the Wheel 23 Older 24 Outside 25 As (with Mary J. Blige) 26 Freeek! 27 Amazing 28 John and Elvis are Dead 29 Flawless (Go To The City) 30 Shoot The Dog 31 Roxanne 32 An Easier Affair 33 If I Told You That (with Whitney Houston) 34 Waltz Away Dreaming 35 Somebody To Love 36 I Can’t Make You Love Me 37 Star People '97 38 You Have Been Loved 39 Killer/ Papa Was A RollIn Stone 40 Round Here";

my @parts = split " ", $str;

my %songs;
my $track     = shift @parts;
my $new_track = $track + 1;
my $song      = "";
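while (@parts) {
    my $part = shift @parts;
    unless ($part eq $new_track) {
        # still inside the current title: append the word
        $song .= length($song) ? " $part" : $part;
        next;
    }
    # reached the next track number: store the finished title
    $songs{$track} = $song;
    $song          = "";
    $track         = $new_track;
    $new_track     = $track + 1;
}
# store the last track, which is not followed by another track number
$songs{$track} = $song;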

for my $track (sort { $a <=> $b } keys %songs) {
    print "$track\t$songs{$track}\n";
}
Yet another way:

while($str=~/(?)?
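This assumes that any sequence of one or more digits that is followed by whitespace and not preceded by a non-whitespace character is a track number. That rules out the '97 in the title of track 37, but nothing stops a song title from containing a bare number.

Overall, I think @cjm's consecutive-number idea is probably your best bet.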
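
The pattern in this answer is cut off above, so the following is only a sketch of the idea as described, not the original code. It assumes a negative lookbehind, (?<!\S), to require that a track number is not glued to a preceding non-whitespace character, and it uses a shortened sample string.

```perl
use strict;
use warnings;

# Shortened sample input (an assumption; the full track list has the same shape).
my $str = "36 I Can't Make You Love Me 37 Star People '97 38 You Have Been Loved";

my @tracks;
# (?<!\S)   the number is not preceded by a non-whitespace character
# (\d+)\s+  track number followed by whitespace
# (.*?)     title, as short as possible
# (?=...)   stop before the next whitespace-delimited number, or at the end
while ($str =~ /(?<!\S)(\d+)\s+(.*?)(?=\s+\d+\s|\s*\z)/g) {
    push @tracks, "$1 $2";
}
print "$_\n" for @tracks;
```

Here '97 stays inside the title of track 37 because its digits are preceded by a quote rather than by whitespace. A title containing a bare number, such as a hypothetical "Take 5 Live", would still be split in the wrong place.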

You were so close:

$str = "20 Fastlove 21 Jesus To A Child 22 Spinning the Wheel 23 Older 24 Outside 25 As (with Mary J. Blige) 26 Freeek! 27 Amazing 28 John and Elvis are Dead 29 Flawless (Go To The City) 30 Shoot The Dog 31 Roxanne 32 An Easier Affair 33 If I Told You That (with Whitney Houston) 34 Waltz Away Dreaming 35 Somebody To Love 36 I Can’t Make You Love Me 37 Star People '97 38 You Have Been Loved 39 Killer/ Papa Was A RollIn Stone 40 Round Here";

while ($str =~ /(\d{2}[^\d]*)/g) {
    print "$1\n";
}
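Note the regex: I'm using the [^] syntax to mean "not that character". [^\d] means "not a digit", and the trailing asterisk means zero or more.

By specifying that I want the rest of the string to continue until I find a digit, I can pick up the rest of the name (that is, until Star People '97. Damn. So close).

If you need the number and the title in two separate variables, you can use parentheses: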

$str = "20 Fastlove 21 Jesus To A Child 22 Spinning the Wheel 23 Older 24 Outside 25 As (with Mary J. Blige) 26 Freeek! 27 Amazing 28 John and Elvis are Dead 29 Flawless (Go To The City) 30 Shoot The Dog 31 Roxanne 32 An Easier Affair 33 If I Told You That (with Whitney Houston) 34 Waltz Away Dreaming 35 Somebody To Love 36 I Can’t Make You Love Me 37 Star People '97 38 You Have Been Loved 39 Killer/ Papa Was A RollIn Stone 40 Round Here";

while ($str =~ /(\d{2})([^\d]*)/g) {
    my $number = $1;
    my $title = $2;

    print "$number: $title\n";
}

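I'm still trying to figure out how to get Star People '97 to work. I believe it has to do with the leading single quote. All the track numbers are preceded by a space or sit at the start of the line. I wonder whether the space could be used?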

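I upvoted one of the answers here because I think it answers your specific question well, apart from the "this track name contains the next track's number" problem. Very few albums will have that property.

But I have to say that your problem really stems from the initial format of $str. For example, if you look at the page source, you can easily pull the track names out of the HTML itself, whatever the track names contain.


That is because the HTML clearly delineates the tracks. I don't know whether that information is available to you, but you may want to rethink how you are getting this data in the first place. It might make your life easier. Or, if not easier, at least more accurate :-)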

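I don't see how this can be solved. It is unreliable, because nothing prevents a song name from containing digits.

Instead of scraping the information from Amazon, why not use a track-information database such as CDDB?

Probably because there is no CDDB equivalent for DVD/Blu-ray? If there is one and I've missed it, I'd love to know. Or if there is an Amazon API that can fetch this information without regex "gymnastics", that would be an ideal solution too.

This is a CD. Amazon's DVD track listings aren't organized this way. I couldn't find the OP's DVD, but.

Then I would look for the information elsewhere, e.g. or CDDB () or some other information provider, which should be able to give you a better format. The answer is really just to find a better source for the data, which means you don't have to play "regex gymnastics" with plain text. :-)

I like your term "regex gymnastics". I'll look elsewhere for a better source. Sadly, there is no DVD DB or BRDB yet. :)

I just wanted something I could read as a non-Perl programmer. This does the job nicely, and I like it.

Yes, it can be used. In fact, that is exactly what I did in my answer.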