Suggestions for improving a Perl script?

I wrote a Perl script that reads a file containing numbers, one per line. I want to remove the duplicates and save the new list to a file. Here is my script:

use strict;

my $arg = "<abs path to>\\list.txt";
open (FH, "$arg") or die "\nError trying to open the file $arg : $!";
print "Opened File : $arg\n";
my $line = "";
my @lines = <FH>;
close FH;
my $temp;
my $count = 0;
my $check = 0;
my @list;
my $flag;

for $line (@lines)
{
    $count += 1;
    $check = $count;
    $flag = 1;
    for my $next (@lines)
    {
        $check -= 1;
        if($check < 0)
        {
            if ($line == $next)
            {
                $flag = 0;
            }
        }
    }

    if($flag == 1)
    {
        push (@list, $line);
    }
}

my $newarg = "<abs path to>\\new_list.txt";
open (FWH, ">>$newarg") or die "\nError trying to open the file $newarg for writing : $!";
my $size = @list;
print FWH "\n\n*** Size = $size ***\n\n";
for my $line (@list)
{
    print FWH "$line";
}

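I'm a C++ person trying to learn Perl. Could you suggest some Perl APIs that might reduce the size of the script? I want the script to be readable and easy to understand, hence the spacing.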

Thanks.

Why not simply use another tool, such as awk:

awk '!_[$0]++' your_file

Perl also has a utility for getting the unique elements of an array:

use List::MoreUtils qw/ uniq /;
my @unique = uniq @lines;

If you don't want to use the module above, you can use the following idiom:

my %seen;
my @unique = grep { ! $seen{$_}++ } @lines;
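
For readers coming from C++, the grep idiom above can look dense. Here is a rough sketch of the same logic written out as an explicit loop. The sub name unique_in_order and the sample data are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns its arguments with duplicates removed,
# keeping the first occurrence of each value in its original position.
sub unique_in_order {
    my @items = @_;
    my %seen;      # value => how many times we have met it so far
    my @unique;
    for my $item (@items) {
        # Post-increment returns the old count: 0 (false) the first time,
        # so only the first occurrence of each value is kept.
        push @unique, $item unless $seen{$item}++;
    }
    return @unique;
}

my @sample = (3, 1, 3, 2, 1);      # made-up input
my @result = unique_in_order(@sample);
print "@result\n";                 # prints "3 1 2"
```

Unlike the nested loops in the question, this makes a single pass over the list, and the hash lookup replaces the inner loop.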

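Alternatively, you can get the unique elements with a simple function like the one below (note that going through hash keys does not preserve the original order):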

sub uniq {
    return keys %{{ map { $_ => 1 } @_ }};
}

Call it as:

uniq(@myarray)

There isn't much to add about your coding style; just read the comments:
my $arg = "<abs path to>\\list.txt";

# Use lexical file handles and 3 argument form of open:
open my $FH, '<', $arg or die "\nError trying to open the file $arg : $!";
print "Opened File : $arg\n";

my @lines = <$FH>;
close $FH;

# Define each variable in the tightest scope possible.
my $count = 0;
my @list;

for my $line (@lines)
{
    $count += 1;
    my $check = $count;
    my $flag = 1;
    for my $next (@lines)
    {
        $check -= 1;
        if($check < 0)
        {
            if ($line == $next)
            {
                $flag = 0;
            }
        }
    }

    if ($flag == 1)
    {
        push @list, $line;
    }
}

my $newarg = "<abs path to>\\new_list.txt";
open my $FWH, '>>', $newarg or die "\nError trying to open the file $newarg for writing : $!";
my $size = @list;
print $FWH "\n\n*** Size = $size ***\n\n";
for my $line (@list)
{
    # Double quotes not needed if there is nothing to interpolate.
    print $FWH $line;
}
# You forgot to close the file. For output files, this is important.
close $FWH or die "\nCannot close $newarg: $!";

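That said, this is how I would implement it. Note that this version compares each line only with the one before it, so it relies on duplicate numbers being adjacent (for example, in sorted input):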
#!/usr/bin/perl
use warnings;
use strict;

my $input_file  = 'PATH/TO/FILE.TXT';
my $output_file = "$input_file.out";

open my $IN,  '<', $input_file  or die "Cannot open $input_file: $!\n";
open my $OUT, '>', $output_file or die "Cannot open $output_file: $!\n";

my $previous = 'inf';
while (my $line = <$IN>) {
    print $OUT $line if $previous != $line;
    $previous = $line;
}

close $OUT;
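
Because the loop above only looks at the previous line, unsorted input would have to be sorted first for it to catch every duplicate. Here is a minimal sketch of that variant. It assumes the whole list fits in memory and that reordering the numbers is acceptable. The sub name and sample data are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: sort numerically, then keep a value only when it
# differs from the one immediately before it.
sub sorted_unique {
    my @sorted = sort { $a <=> $b } @_;
    my @unique;
    my $previous;                      # undef until the first value is seen
    for my $number (@sorted) {
        push @unique, $number
            if !defined $previous || $number != $previous;
        $previous = $number;
    }
    return @unique;
}

my @result = sorted_unique(5, 3, 5, 1, 3);   # made-up input
print "@result\n";                           # prints "1 3 5"
```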

So you have a file of numbers and you want to remove the duplicates while preserving order? That's a one-liner in Perl:
perl -ne 'print unless $seen{$_}++' file > newfile

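Or:
If you have lines that don't contain a single number, or you want to print some statistics, or you want better argument handling, or you notice that this won't de-duplicate numbers that differ only in whitespace, then go ahead and adapt it. For example: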
# whitespace/non-numbers tolerant
perl -i.bak -ne 'if (/^\s*(\d+)\s*$/) { print unless $seen{$1}++ } else { print }' file

As a script, with exactly the same key logic:
#! /usr/bin/env perl
use common::sense;
use autodie;

my $silent;
$silent = shift if (@ARGV > 0 and $ARGV[0] eq '-s');
die "usage: $0 [-s] src dest\n" unless @ARGV == 2;

open my $fi, '<', shift;
open my $fo, '>', shift;

my %seen;
while (<$fi>) {
  if (/^\s* (\d+) \s*$/x) {
    print {$fo} $_ unless $seen{$1}++;
    next;
  }
  print {$fo} $_;
}

# Print the statistics only when -s (silent) was not given.
unless ($silent) {
  say '-- de-dup stats --';
  say '-- $count $number --';
  for (sort { $a <=> $b } keys %seen) {
    say "$seen{$_} $_";
  }
}

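Whenever you need to keep track of something, think hash. Hashes have a couple of very nice properties:

Only one of each key can exist: imagine storing all of your numbers in a hash keyed by the number. The list of keys then contains all of your numbers with no duplicates.
Fast key lookup: suppose you store your numbers in a hash, again keyed by the number. Have you seen that number before? Just check whether that key exists. Quick and simple.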
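
The two properties can be seen in a few lines. This is only an illustrative sketch with made-up sample data. The name %number_hash is borrowed from the script further down.

```perl
use strict;
use warnings;

my @numbers = (7, 3, 7, 9, 3);   # made-up sample data

# Property 1: a key can exist only once. Storing the same number again
# just overwrites the existing entry.
my %number_hash;
$number_hash{$_} = 1 for @numbers;
my $distinct = keys %number_hash;        # key count (scalar context)
print "distinct numbers: $distinct\n";   # prints "distinct numbers: 3"

# Property 2: checking for a key is a direct lookup, not a scan.
print "seen 9\n"     if exists $number_hash{9};    # prints "seen 9"
print "not seen 4\n" if !exists $number_hash{4};   # prints "not seen 4"
```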

Here's a quick rework:
#! /usr/bin/env perl
use strict;
use feature qw(say);
use warnings;
use autodie;

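Note that I have use warnings and use strict. I tell people that use strict will catch 90% of their errors. Well, use warnings will catch another 9.99%. Warnings are for things like trying to print an undefined variable, or questionable syntax that could get you into trouble.
use feature qw(say) lets you use say instead of print. With say, the newline is included, so you don't have to keep adding \n. It doesn't sound like much, but it's nice.
use autodie automatically kills your program if a file can't be opened. It turns Perl into a sort of exception-based language. That way, if you forget to test something, your program will let you know.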
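
To illustrate the exception-style behaviour of autodie, here is a small sketch. The path is deliberately one that is assumed not to exist, and wrapping the open in eval is just one possible way to catch the failure.

```perl
use strict;
use warnings;
use autodie;

# Assumed not to exist; chosen only for this demonstration.
my $missing = '/no/such/directory/numbers.txt';

# No "or die" here: with autodie, a failed open throws an exception.
my $opened = eval {
    open my $fh, '<', $missing;
    close $fh;
    1;
};
my $error = $@;   # copy right away, before anything else can reset $@

if (!$opened) {
    # The exception stringifies to a message that names the file.
    print "open failed as expected: $error";
}
```

Without the eval, the same failure would simply end the program with that message, which is the behaviour the scripts here rely on.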
use constant {
    FILE         => '/path/to/file',
    OUTPUT       => '/path/to/output/file',
};

use constant is what you should use when you need something that stays constant.
open my $numfile_fh, "<", FILE;  #No need for die
open my $output_fh, ">", OUTPUT;
my %number_hash;
while ( my $number = <$numfile_fh> ) {
    chomp $number;   #Always chomp after you read
    if ( not exists $number_hash{$number} ) {
        $number_hash{$number} = 1;
        say $output_fh "$number";
    }
}
close $numfile_fh;
close $output_fh;

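Some people would say this is a better way to write the loop logic:
while ( my $number = <$numfile_fh> ) {
    chomp $number;   #Always chomp after you read
    next if exists $number_hash{$number};

    $number_hash{$number} = 1;
    say $output_fh "$number";
}

In this style, you get rid of the exceptional case first (the duplicate number), then handle the default case (print the number you read and save it in the hash).
Note that none of this changes the order of the list. You read a number and, as long as it isn't a duplicate, you print it in the order you read it. If you want the numbers sorted instead, use two loops: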
while ( my $number = <$numfile_fh> ) {
    chomp $number;   #Always chomp after you read
     $number_hash{$number} = 1;
}

for my $number ( sort { $a <=> $b } keys %number_hash ) {   #Numeric sort; plain sort compares as strings
    say $output_fh "$number";
}
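
One detail worth checking when sorting numbers: Perl's sort compares items as strings unless it is given a comparison block. Here is a small sketch with made-up data showing the difference:

```perl
use strict;
use warnings;

my @numbers = (10, 9, 100, 2);   # made-up sample data

my @as_strings = sort @numbers;                 # default: string comparison
my @as_numbers = sort { $a <=> $b } @numbers;   # explicit numeric comparison

print "string order:  @as_strings\n";   # prints "string order:  10 100 2 9"
print "numeric order: @as_numbers\n";   # prints "numeric order: 2 9 10 100"
```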

Note that I don't bother testing whether the number is already in the hash. There's no need, since a hash can hold only one entry per key.
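
Can you explain your awk code?
Can you explain all of your approaches? I'm completely new to Perl…
+1 for keys %{{..}}, but the item order is lost, and it is more confusing than @hash{@_} = ()
Check this awk explanation:
Providing the input and the desired output is simpler than reverse engineering the script.
The script works exactly as required! I'm asking whether it could be better.
It could be better, but it is also very low-level, and it takes more attention to follow than an explanation of what it actually does.
I would suggest using the constant pragma.
@friedo - Const::Fast is nice because it can create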