Perl: how to list the sentences that contain the same word, with that word as the heading


Currently, the script below prints every noun together with its sentence:

#!/usr/bin/perl
use strict;
use warnings FATAL => "all";
my $search_key = "expend";    ## CHANGE "..." to <>

open(my $tag_corpus, '<', "ch13tagged.txt") or die $!;

my @sentences = <$tag_corpus>;    # This breaks up each line into list
my @words;
my %seens = ();
my %seenw = ();

for (my $i = 0; $i <= @sentences; $i++) {
    if (defined($sentences[$i]) and $sentences[$i] =~ /($search_key)_VB.*/i) {
        @words = split /\s/, $sentences[$i];    ## \s is a whitespace
        for (my $j = 0; $j <= @words; $j++) {
            #FILTER if word is noun, and therefore will end with _NN:
            if (defined($words[$j]) and $words[$j] =~ /_NN/) {
                #PRINT word (without _NN) and sentence (without any _ENDING):
                next if $seenw{$words[$j]}++;    ## How to include plural etc
                push @words, $words[$j];
                print "**", split(/_\S+/, $words[$j]), "**", "\n";
                ## next if $seens{ $sentences[$i] }++;
                ## push @sentences, $sentences[$i];
                print split(/_\S+/, $sentences[$i]), "\n"
                ## HOW PRINT bold or specifically word bold?
                #FILTER if word has been output, add sentence under that heading
            }
        }    ## put print sentences here to print each sentence after all the nouns inside
    }
}
close $tag_corpus || die "Can't close $tag_corpus: $!";

Your original:

#!/usr/bin/perl
use strict;
use warnings FATAL => "all";
This is a good start.

my $search_key = "expend";    ## CHANGE "..." to <>
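Reading the file one line at a time, as in the cleaned-up version further down, is largely the same but has less overhead than slurping every line into @sentences.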

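If the line contains the record separator (and it will, unless you chomp it), you will always get a defined line until the end of the file. There is no need to test whether the line is defined.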
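
To illustrate the point about defined lines, here is a minimal sketch of reading line by line. The sample data and the in-memory filehandle are assumptions made so the snippet runs on its own; the real script would open ch13tagged.txt instead.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for ch13tagged.txt.
my $data = "They_PRP expend_VB energy_NN ._.\nWe_PRP rest_VB ._.\n";

# An in-memory filehandle keeps the sketch self-contained.
open(my $fh, '<', \$data) or die "Can't open in-memory file: $!";

my @lines;
# while (<$fh>) is implicitly while (defined($_ = <$fh>)),
# so the loop body never sees an undefined line.
while (<$fh>) {
    chomp;
    push @lines, $_;
}
close $fh or die "Can't close in-memory file: $!";

print scalar(@lines), " lines read\n";
```

Because the loop stops at end of file on its own, there is no index to run past the end of an array, which is what made the defined() checks necessary in the original for loops.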

Also, you don't need the .* after the search term, and capturing $search_key has no effect here.
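
A possible way to write the simplified pattern is sketched below. The name $verb_regex is borrowed from the cleaned-up code later in this answer, but its definition is not shown there, so this definition is an assumption; has_verb is a hypothetical helper used only for the demonstration.

```perl
use strict;
use warnings;

my $search_key = "expend";

# qr// compiles the pattern once. \Q...\E quotes any regex metacharacters
# in the search key. No capture group and no trailing .* are needed,
# because a match only has to find the key followed by _VB.
my $verb_regex = qr/\Q$search_key\E_VB/i;

# Hypothetical helper: true if the sentence contains the tagged verb.
sub has_verb {
    my ($sentence) = @_;
    return $sentence =~ $verb_regex ? 1 : 0;
}

for my $s ("They_PRP expend_VB energy_NN", "We_PRP rest_VB") {
    print has_verb($s) ? "match:    " : "no match: ", $s, "\n";
}
```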

        @words = split /\s/, $sentences[$i];    ## \s is a whitespace
You don't want to split on a single whitespace character. You should use /\s+/, or better still:
        @words = split ' ', $sentences[$i];

But you don't even need this.
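
For reference, the three split forms behave differently on a line with leading or repeated whitespace. The sample line here is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical line with leading and doubled spaces.
my $line = "  They_PRP  expend_VB energy_NN\n";

# One field boundary per whitespace character: empty fields appear.
my @single = split /\s/, $line;

# Runs of whitespace collapse, but leading whitespace still
# produces an empty first field.
my @multi = split /\s+/, $line;

# The special ' ' form skips leading whitespace and collapses runs.
my @awk = split ' ', $line;

printf "%d fields, %d fields, %d fields\n",
    scalar(@single), scalar(@multi), scalar(@awk);
```

Only the ' ' form returns exactly the three tokens, which is why it is the usual choice for word splitting.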

        for (my $j = 0; $j <= @words; $j++) {
            #FILTER if word is noun, and therefore will end with _NN:
            if (defined($words[$j]) and $words[$j] =~ /_NN/) {
                #PRINT word (without _NN) and sentence (without any _ENDING):
Unless you want to reset %seenw after each sentence, you will only process each _NN word once per file.
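
The difference between a file-wide %seenw and one that is reset per sentence can be sketched as follows. The sentences and the new_nouns helper are hypothetical and exist only to show the two behaviours side by side.

```perl
use strict;
use warnings;

# Hypothetical tagged sentences.
my @sentences = (
    "They_PRP expend_VB energy_NN",
    "We_PRP expend_VB energy_NN and_CC time_NN",
);

# Returns, for each sentence, the nouns reported as "new".
# If $reset is true, the seen-hash is emptied before each sentence.
sub new_nouns {
    my ($reset, @lines) = @_;
    my %seenw;
    my @report;
    for my $line (@lines) {
        %seenw = () if $reset;
        my @nouns = grep { !$seenw{$_}++ }
                    map  { /^(\S+)_NN$/ ? $1 : () }
                    split ' ', $line;
        push @report, join(",", @nouns);
    }
    return @report;
}

print join(" | ", new_nouns(0, @sentences)), "\n";   # file-wide uniqueness
print join(" | ", new_nouns(1, @sentences)), "\n";   # per-sentence uniqueness
```

With a file-wide hash, "energy" is reported only for the first sentence; with a reset, it is reported for both.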

                push @words, $words[$j];
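I don't see how this push, which appends the noun back onto the list of words, could serve any possible purpose. Because you did the uniqueness check first, you are saved from an infinite loop if there are any _NN words, but it only means that you end up with all the words in the sentence followed by all the "nouns". Not only that, but you then simply test that each of those is a noun and do nothing with it. And that is before mentioning that you clobber the list with the next sentence.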

                print "**", split(/_\S+/, $words[$j]), "**", "\n";

                ## next if $seens{ $sentences[$i] }++; 
You don't want to do this inside the word loop.

                ## push @sentences, $sentences[$i];
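Again, even if it were uncommented, I don't think you would want to do this here. It seems that everything from two lines back onward belongs outside the word loop, after it has finished.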

                print split(/_\S+/, $sentences[$i]), "\n"
                ## HOW PRINT bold or specifically word bold?
                #FILTER if word has been output, add sentence under that heading
            }
        }    ## put print sentences here to print each sentence after all the nouns inside
    }
}
close $tag_corpus || die "Can't close $tag_corpus: $!";
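No. This will not handle a bad return from close. The || operator binds too tightly: you are closing the result of $tag_corpus || die. Fortunately (or perhaps unfortunately) die will never be called, because if we have got this far, $tag_corpus should be a true value.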
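
The precedence difference can be checked with a small sketch. It uses an in-memory filehandle (an assumption, to keep it self-contained) and closes it twice, since the second close returns false and so gives the error check something to catch.

```perl
use strict;
use warnings;

my $data = "one line\n";
open(my $fh, '<', \$data) or die "Can't open in-memory file: $!";
close $fh or die "Can't close in-memory file: $!";   # first close succeeds

# Closing an already-closed handle returns false.

# Parsed as close($fh || die ...): $fh is true, so die is never
# evaluated and the failed close goes unnoticed.
my $tight_ok = eval { close $fh || die "close failed\n"; 1 };

# Parsed as close($fh) or die ...: the return value of close is tested,
# so the failure triggers die and eval returns undef.
my $loose_ok = eval { close $fh or die "close failed\n"; 1 };

print "with ||: ", ($tight_ok ? "failure missed" : "failure caught"), "\n";
print "with or: ", ($loose_ok ? "failure missed" : "failure caught"), "\n";
```

Writing close($fh) || die ... with explicit parentheses would also work; the low-precedence or simply makes the parentheses unnecessary.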

Here is a sort of cleaned-up version of what you are trying to do, with the parts I could understand left in.

my @sentences;
# We're processing a single line at a time.
while ( <$tag_corpus> ) { 
    # Test if we want to work with the line
    next unless m/$verb_regex/;
    # If we do, then test that we haven't dealt with it before
    # Although I suspect that this may not be needed as much if we're not 
    # pushing to a queue that we're reading from.
    next if    $seens{ $_ }++;

    # split -> split ' ', $_
    # pass through only those words that match _NN at the end and
    # are unique so far. We test on a substitution, because the result
    # still uniquely identifies a noun
    foreach my $noun ( grep { s/_NN$// && !$seenw{ $_ }++ } split ) { 
        print "**$noun**\n";
    }
    # This will omit any adjacent punctuation you have after the word--if 
    # that's a problem.
    print split( /_\S+/ ), "\n";
    # Here we save the sentence.
    push @sentences, $_;
}
close $tag_corpus or die "Can't close ch13tagged.txt: $!";
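
If the goal is the one stated in the title, each noun as a heading with every matching sentence listed under it, one possible approach is to collect the sentences in a hash of arrays and print after the file has been read. This is only a sketch under assumptions: the sample data, the in-memory filehandle, the definition of $verb_regex, and the names %sentences_for and @noun_order are not from the original code.

```perl
use strict;
use warnings;

my $search_key = "expend";
my $verb_regex = qr/\Q$search_key\E_VB/i;   # assumed definition

# Hypothetical tagged sentences standing in for ch13tagged.txt.
my $data = <<'END';
They_PRP expend_VB energy_NN on_IN work_NN ._.
We_PRP rest_VB at_IN night_NN ._.
Plants_NNS expend_VB energy_NN too_RB ._.
END

open(my $tag_corpus, '<', \$data) or die "Can't open in-memory corpus: $!";

my %sentences_for;   # noun => [ plain sentences containing it ]
my @noun_order;      # nouns in order of first appearance

while (my $line = <$tag_corpus>) {
    next unless $line =~ $verb_regex;
    chomp $line;
    # Strip the _TAG suffix from every token to get the plain sentence.
    (my $plain = $line) =~ s/_\S+//g;
    my %in_line;
    for my $token (split ' ', $line) {
        next unless $token =~ /^(.+)_NN$/;
        my $noun = $1;
        next if $in_line{$noun}++;   # list a sentence once per noun
        push @noun_order, $noun unless $sentences_for{$noun};
        push @{ $sentences_for{$noun} }, $plain;
    }
}
close $tag_corpus or die "Can't close in-memory corpus: $!";

# Print each noun as a heading, followed by its sentences.
for my $noun (@noun_order) {
    print "**$noun**\n";
    print "  $_\n" for @{ $sentences_for{$noun} };
}
```

The pattern /_NN$/ deliberately matches only the bare _NN tag. If the corpus uses Penn Treebank style tags, something like /_NNS?$/ would also pick up plurals, but that depends on the tag set actually used in the file.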

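Provide sample data to work with, and use it to point out what you consider the "common word", the "heading", and so on. Your question needs clarification.

"The heading is the word listed below"? Please clarify the title and add a short description, preferably in a normal font size. There is no need to cram everything into the title.

Thanks for the in-depth solution. I can't seem to get it to print the sentences, though; you left it as $sentences[$i].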