Majority voting in Perl?

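I have 5 files containing the same words. I want to read each word in all the files and decide the winning word by detecting the following characters in a word (*, #, $, &), separated by tabs. Then I want to generate an output file. I can only have 2 winners. For example:

File 1

File 2

    we$
    are#
    ...

File 3

File 4

File 5

Output file:

    we$
    are*#

This is how I started: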

#!/usr/local/bin/perl -w

sub read_file_line {
  my $fh = shift;    
  if ($fh and my $line = <$fh>) {    
    chomp($line);    
    return $line;
  }    
  return;    
}

open(my $f1, "words1.txt") or die "Can't";
open(my $f2, "words2.txt") or die "Can't";
open(my $f3, "words3.txt") or die "Can't";
open(my $f4, "words4.txt") or die "Can't";
open(my $f5, "words5.txt") or die "Can't";

my $r1 = read_file_line($f1);
my $r2 = read_file_line($f2);
my $r3 = read_file_line($f3);
my $r4 = read_file_line($f4);
my $r5 = read_file_line($f5);

while ($f5) {

    #What can I do here to decide and write the winning word in the output file?

$r1 = read_file_line($f1);
$r2 = read_file_line($f2);
$r3 = read_file_line($f3);
$r4 = read_file_line($f4);
$r5 = read_file_line($f5);
}

Sounds like a job for a hash of hashes. Untested code:

use strict;
use warnings;
use 5.010;
use autodie;
use List::Util qw( sum reduce );

my %totals;

my @files = map "words$_.txt", 1..5;

for my $file (@files) {
    open my $fh, '<', $file;
    while (<$fh>) {
        chomp;
        my ($word, $sign) = /(\w+)(\W)/;
        $totals{$word}{$sign}++;
    }
}

open my $totals_fh, '>', 'outfile.txt';

# Parentheses are needed: without them, sum() swallows the whole comparison.
my @sorted_words = sort { sum(values %{ $totals{$a} }) <=> sum(values %{ $totals{$b} }) } keys %totals; # Probably something fancier here.

for my $word (@sorted_words[0, 1]) {
    #say {$totals_fh} $word, join('', keys %{$totals{$word}} ), "\t- ", function_to_decide_text($totals{$word});
    say {$totals_fh} $word, reduce {
            $totals{$word}{ substr $a, 0, 1 } == $totals{$word}{$b} ? $a . $b
          : $totals{$word}{ substr $a, 0, 1 } > $totals{$word}{$b} ? $a
          :                                                          $b;
    } keys %{ $totals{$word} };
}
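
To make the hash-of-hashes tally concrete, here is a minimal sketch that works on in-memory votes instead of files. The helper names `tally_votes` and `top_signs` are hypothetical, and the sketch assumes each vote is a run of word characters followed by exactly one of the four signs.

```perl
use strict;
use warnings;

# Tally "word + sign" votes into a hash of hashes: $totals{word}{sign} = count.
# tally_votes() and top_signs() are hypothetical helper names for this sketch.
sub tally_votes {
    my @lines = @_;
    my %totals;
    for my $line (@lines) {
        # Assumes each vote is a run of word characters followed by one sign.
        my ($word, $sign) = $line =~ /^(\w+)([*#\$&])$/ or next;
        $totals{$word}{$sign}++;
    }
    return \%totals;
}

# Return the sign(s) with the highest count for one word; ties are sorted so
# the output is deterministic.
sub top_signs {
    my ($counts) = @_;
    my ($max) = sort { $b <=> $a } values %$counts;
    return join '', sort grep { $counts->{$_} == $max } keys %$counts;
}

my $totals = tally_votes('we$', 'we$', 'we#', 'are*', 'are#', 'are*', 'are#');
print $_, top_signs($totals->{$_}), "\n" for sort keys %$totals;
```

Picking the maximum with `grep` rather than `reduce` keeps tied signs together without string tricks, at the cost of a second pass over the counts.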

EDIT: Forgot about the only-two-winners part. Fixed, somewhat.

EDIT2: Fixed according to the comments.

#!/usr/bin/perl

use strict;
use warnings;

my @files   = qw(file1 file2 file3 file4 file5);
my $symbols = '*#$&'; # no need to escape them as they'll be in a character class
my %words;

foreach my $file (@files) {
   open(my $fh, '<', $file) or die "Cannot open $file: $!";
   while (<$fh>) {
      if (/^(\w+[$symbols])$/) {
         $words{$1} ++; # count the occurrences of each word
      }
   }
   close $fh;
}

my $counter  = 0;
my $previous = -1;

foreach my $word (sort {$words{$b} <=> $words{$a}} keys %words) {

   # make sure you don't exit if two words at the top of the list 
   # have the same number of occurrences
   if ($previous != $words{$word}) {
      last if $counter > 1;
   }
   $counter ++; # count the output
   $previous = $words{$word};

   print "$word occurred $words{$word} times.\n";
}
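
The tie handling in the loop above can be isolated as a small function. This is a sketch under assumptions: `top_with_ties` is a hypothetical name, and the secondary sort by key is added only to make the order of tied entries repeatable.

```perl
use strict;
use warnings;

# top_with_ties() is a hypothetical helper: it returns the $n highest-scoring
# keys of a count hash, plus any further keys tied with the last one kept.
sub top_with_ties {
    my ($counts, $n) = @_;
    my @ranked = sort { $counts->{$b} <=> $counts->{$a} or $a cmp $b } keys %$counts;
    my @kept;
    for my $key (@ranked) {
        # Stop once $n keys are kept and the score differs from the last kept one.
        last if @kept >= $n && $counts->{$key} != $counts->{ $kept[-1] };
        push @kept, $key;
    }
    return @kept;
}

my %counts = ('we$' => 3, 'we#' => 2, 'we*' => 2, 'we&' => 1);
print join(' ', top_with_ties(\%counts, 2)), "\n";   # we$ we# we*
```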

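Works when I try it...

The answer has three parts: a test data generator, the majority voting code, and example output. The output appears to be correct for the test data in the generated files.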

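Revised requirement: example output

The "revised requirement" replaced the '*#$&' markers after the words with a tab and one of the letters 'ABCD'. After a quick negotiation, the question was restored to its original form. This output comes from a suitably modified version of the original code: 3 lines of code changed, 2 in the data generator and 1 in the majority voter. Those changes are not shown; they are trivial.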

we      C       we      D       we      C       we      C       we      D       we      C
are     C       are     D       are     C       are     B       are     A       are     C
the     B       the     D       the     A       the     A       the     D       the     A|D
people  D       people  B       people  A       people  B       people  D       people  B|D
in      D       in      B       in      C       in      B       in      D       in      D|B
charge  C       charge  D       charge  D       charge  D       charge  A       charge  D
and     A       and     B       and     C       and     C       and     B       and     B|C
what    B       what    B       what    B       what    C       what    C       what    B
we      D       we      B       we      D       we      B       we      A       we      B|D
say     D       say     D       say     B       say     D       say     D       say     D
goes    A       goes    C       goes    A       goes    C       goes    A       goes    A

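Revised test generator: for a configurable number of files

Now that the poster has worked out how to handle the revised scenario, this is the data generator code I used, with 5 tags (A-E). Clearly, it would not take much work to make the number of tags configurable on the command line too.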

#!/usr/bin/env perl

use strict;
use warnings;

my $fmax  = scalar(@ARGV) > 0 ? $ARGV[0] : 5;
my $tags  = 'ABCDE';
my $ntags = length($tags);
my $fmt   = sprintf "words$fmax-%%0%0dd.txt", length($fmax);

foreach my $fnum (1..$fmax)
{
    my $file = sprintf $fmt, $fnum;
    open my $fh, '>', $file or die "Failed to open $file for writing ($!)";
    foreach my $w (qw(We Are The People In Charge And What We Say Goes))
    {
        my $suffix = substr($tags, rand($ntags), 1);
        print $fh "$w\t$suffix\n";
    }
}
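
The two-stage `sprintf` that builds the file name format is easy to misread, so here is a small sketch of what it produces. The wrapper name `file_name_format` is hypothetical; the format string itself is the one used in the generator.

```perl
use strict;
use warnings;

# file_name_format() is a hypothetical wrapper around the two-stage sprintf
# used by the generator; it returns the format for a given file count.
sub file_name_format {
    my ($fmax) = @_;
    # "%%" yields a literal "%", and "%0d" inserts the field width as an integer.
    return sprintf "words$fmax-%%0%0dd.txt", length($fmax);
}

for my $fmax (5, 10) {
    my $fmt = file_name_format($fmax);
    printf "%-18s -> %s\n", $fmt, sprintf($fmt, 1);
}
```

With 10 files the format becomes `words10-%02d.txt`, so the generated names sort in numeric order when listed.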

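Revised majority voting code: for an arbitrary number of files

This code works with essentially any number of files. As noted in the (many) comments, it does not check that the words are the same in each file, as the question requires; if the words differ, you may get odd results.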

#!/usr/bin/env perl

use strict;
use warnings;

my @files = scalar @ARGV > 0 ? @ARGV :
            ( "words1.txt", "words2.txt", "words3.txt",
              "words4.txt", "words5.txt"
            );
my $voters = scalar(@files);

my @fh;
{
    my $n = 0;
    foreach my $file (@files)
    {
        open my $f, '<', $file or die "Can't open $file for reading ($!)";
        $fh[$n++] = $f;
    }
}

while (my $r = process_line(@fh))
{
    print "$r\n";
}

sub process_line
{
    my(@fhlist) = @_;
    my %words = ();
    foreach my $fh (@fhlist)
    {
        my $line = <$fh>;
        return unless defined $line;
        chomp $line;
        $words{$line}++;
    }
    return winner(%words);
}

# Get tag X from entry "word\tX".
sub get_tag_from_word
{
    my($word) = @_;
    return (split /\s/, $word)[1];
}

sub winner
{
    my(%words)   = @_;
    my $maxscore = 0;
    my $winscore = int($voters / 2) + 1;    # strict majority, also for odd voter counts
    my $winner   = '';
    my $taglist  = '';
    foreach my $word (sort keys %words)
    {
        return "$word\t$words{$word}" if ($words{$word} >= $winscore);
        if ($words{$word} > $maxscore)
        {
            $winner = $word;
            $winner =~ s/\t.//;
            $taglist = get_tag_from_word($word);
            $maxscore = $words{$word};
        }
        elsif ($words{$word} == $maxscore)
        {
            my $newtag = get_tag_from_word($word);
            $taglist .= "|$newtag";
        }
    }
    return "$winner\t$taglist\t$maxscore";
}
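
The code above does not check that every file holds the same word on a given line. A minimal sketch of such a check follows, assuming one "word<TAB>tag" entry per file; `checked_vote` is a hypothetical name and is not part of the answer's code.

```perl
use strict;
use warnings;

# checked_vote() is a hypothetical variant of the per-line vote: it takes one
# "word\ttag" entry per file and dies if the files disagree on the word.
sub checked_vote {
    my @entries = @_;
    my (%tags, $expected);
    for my $entry (@entries) {
        my ($word, $tag) = split /\t/, $entry;
        $expected //= $word;    # the first file's word is the reference
        die "Word mismatch: '$word' vs '$expected'\n" if $word ne $expected;
        $tags{$tag}++;
    }
    my ($max) = sort { $b <=> $a } values %tags;
    my $winners = join '|', sort grep { $tags{$_} == $max } keys %tags;
    return "$expected\t$winners\t$max";
}

print checked_vote("the\tB", "the\tD", "the\tA", "the\tA", "the\tD"), "\n";
```

Dying on the first mismatch is one possible policy; skipping the line with a warning would also be reasonable, depending on how the input files are produced.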

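In the output from a run with 10 data files, the first column is the word; the second column is the winning tag or tags; the third column (the number) is the top score; the remaining 10 columns are the tags from the 10 data files. As you can see, the first row has two each of 'We A', 'We B', ... 'We E'. I also generated (but did not keep) a result set in which the top score was 7. With enough repetitions, variations like these turn up.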

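That smell, you know that smell. All over the place. It smells like... homework.

Hey, the revised question is a minor variation on the previous one; you should be able to adapt any of the previous solutions to the new situation. Also, changing the question so that the previous answers all become irrelevant is not kosher. Neither is leaving off the homework tag when the question is clearly homework. And not thinking about your revised requirement yourself is plain laziness on your part.

I am really sorry. I changed the question to show you what the second approach was, not to make the answers irrelevant. This is a project I got stuck on, which is why I came here for your help. Sorry.

@aliocee: OK, you have learned. Please remember it in future. Thanks!

@aliocee: You are not storing the lines in the hash, only the count for each word and its sign; the data structure looks like { are => { '$' => 10, '&' => 1 }, we => { '$' => 1, '#' => 11 } }, so the hash is most likely nowhere near that large.

OK, I just tested it and it seems to work fine (it generates the output file with two results and all); you do have to define function_to_decide_text yourself, though.

If you have not changed the $word part, then the error should be in @sorted_words[0,1]: it is not being filled. Try adding use Data::Dumper; say Dumper \@sorted_words; before the for loop.

@aliocee, right. The version I just edited does what you asked, although I am sure I did it in an ugly way.

@aliocee: New requirements, huh. To make this work you need to change the regex to /(\p{Ll})\t(\p{Lu})/ and then change how the string concatenation in the reduce works. You will have to work that out yourself, though.

Thanks, but I only want to create a file exactly like the input files, containing only the winning words. But when I have two winners, I have to ...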

Test data generator (datagenerator.pl):

#!/usr/bin/env perl

use strict;
use warnings;

foreach my $i (1..5)
{
    my $file = "words$i.txt";
    open my $fh, '>', $file or die "Failed to open $file for writing ($!)";
    foreach my $w (qw (we are the people in charge and what we say goes))
    {
        my $suffix = substr('*#$&', rand(4), 1);
        print $fh "$w$suffix\n";
    }
}
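
The generator picks each suffix with `substr` and `rand`. The sketch below isolates that step; `random_suffix` is a hypothetical helper name, and the fixed seed is only there to make repeated runs comparable.

```perl
use strict;
use warnings;

# random_suffix() is a hypothetical helper mirroring the generator's trick:
# substr() truncates the fractional offset from rand() to an integer index.
sub random_suffix {
    my ($symbols) = @_;
    return substr($symbols, rand(length $symbols), 1);
}

srand(42);    # fixed seed so repeated runs give the same sequence
print join(' ', map { 'we' . random_suffix('*#$&') } 1 .. 5), "\n";
```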

Majority voting code (majorityvoter.pl):

#!/usr/bin/env perl

use strict;
use warnings;

my @files = ( "words1.txt", "words2.txt", "words3.txt",
              "words4.txt", "words5.txt"
            );

my @fh;
{
    my $n = 0;
    foreach my $file (@files)
    {
        open my $f, '<', $file or die "Can't open $file for reading ($!)";
        $fh[$n++] = $f;
    }
}

while (my $r = process_line(@fh))
{
    print "$r\n";
}

sub process_line
{
    my(@fhlist) = @_;
    my %words = ();
    foreach my $fh (@fhlist)
    {
        my $line = <$fh>;
        return unless defined $line;
        chomp $line;
        $words{$line}++;
    }

    my $combo = '';
    foreach my $word (keys %words)
    {
        return $word    if ($words{$word} >  2);
        $combo .= $word if ($words{$word} == 2);
    }
    $combo =~ s/(\W)\w+(\W)/$1$2/;
    return $combo;
}
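
The final substitution in `process_line` is what turns two tied entries into one word with both signs. A small sketch of that step alone, with `merge_tied` as a hypothetical helper name:

```perl
use strict;
use warnings;

# merge_tied() is a hypothetical helper isolating the substitution step: given
# two tied entries for the same word, it keeps the word once with both signs.
sub merge_tied {
    my ($combo) = @_;
    # $1 is the first sign, \w+ the repeated word, $2 the second sign.
    $combo =~ s/(\W)\w+(\W)/$1$2/;
    return $combo;
}

print merge_tied('are*are#'), "\n";    # are*#
print merge_tied('we#'), "\n";         # we# (no second entry, unchanged)
```

Because `keys %words` returns entries in no fixed order, the two signs of a tie can come out in either order, which matches the mix of `are*#` and `charge#*` in the sample output.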

Example output:

$ perl datagenerator.pl
$ perl majorityvoter.pl > results.txt
$ paste words?.txt results.txt
we*     we$     we&     we#     we#     we#
are*    are#    are#    are*    are$    are*#
the*    the&    the#    the#    the&    the&#
people& people& people$ people# people# people&#
in#     in*     in$     in*     in*     in*
charge* charge# charge& charge* charge# charge#*
and$    and*    and$    and&    and$    and$
what&   what&   what$   what&   what#   what&
we#     we*     we*     we&     we*     we*
say$    say&    say$    say$    say$    say$
goes$   goes&   goes#   goes#   goes#   goes#
$

Output for 10 data files:

We          A|B|C|D|E   2  B  C  C  E  D  A  D  A  E  B
Are         D           4  C  D  B  A  D  B  D  D  B  E
The         A           5  D  A  B  B  A  A  B  E  A  A
People      D           4  E  D  C  D  B  E  D  D  B  C
In          D           3  E  C  D  D  D  B  C  A  A  B
Charge      A|E         3  E  E  D  A  D  A  B  A  E  B
And         E           3  C  E  D  D  C  A  B  E  B  E
What        A           5  B  C  C  A  A  A  B  A  D  A
We          A           4  C  A  A  E  A  E  C  D  A  E
Say         A|D         4  A  C  A  A  D  E  D  A  D  D
Goes        A           3  D  B  A  C  C  A  A  E  E  B