Perl 如何打印带有特定参考号的范围值？_Perl_Match_Lines

Perl 如何打印带有特定参考号的范围值？

perl

Perl 如何打印带有特定参考号的范围值？,perl,match,lines,Perl,Match,Lines,我有一组数据文件，如下所示。我想通过参考2组数字范围（scoreA和scoreB）得到插值最终值（final，P）。比如说“Eric”，他的分数是35（范围30.00-40.00），分数是48（范围45.00-50.00）。他将得到两组最终值范围，即（22.88,40.90）和（26.99,38.99）。我想得到数据文件中“Eric”和“George”的最终值。“乔治”的得分A=38，得分B=26 经过公式计算，我想得到当他的分数A=35和分数B=45时的精确最终值。让我们假设公式是P=X+Y（

我有一组数据文件，如下所示。我想通过参考2组数字范围（scoreA和scoreB）得到插值最终值（final，P）。比如说“Eric”，他的分数是35（范围30.00-40.00），分数是48（范围45.00-50.00）。他将得到两组最终值范围，即（22.88,40.90）和（26.99,38.99）。我想得到数据文件中“Eric”和“George”的最终值。“乔治”的得分A=38，得分B=26

经过公式计算，我想得到当他的分数A=35和分数B=45时的精确最终值。让我们假设公式是P=X+Y（P是最终值），到目前为止，我一直在尝试下面所示的代码。但是，它无法获得正确的行

如何根据给定的数据得到准确的最终值范围

数据文件

Student_name ("Eric")   
/* This is a junk line */   
scoreA ("10.00, 20.00, 30.00, 40.00")  
scoreB ("15.00, 30.00, 45.00, 50.00, 55.00")     
final (  
"12.23,19.00,37.88,45.98,60.00",\  
"07.00,20.11,24.56,45.66,57.88",\  
"05.00,15.78,22.88,40.90,57.99",\  
"10.00,16.87,26.99,38.99,40.66"\)  

Student_name ("Liy") 
/* This is a junk line */   
scoreA ("5.00, 10.00, 20.00, 60.00")  
scoreB ("25.00, 30.00, 40.00, 55.00, 60.00")     
final (  
"02.23,15.00,37.88,45.98,70.00",\  
"10.00,28.11,34.56,45.66,57.88",\  
"08.00,19.78,32.88,40.90,57.66",\  
"10.00,27.87,39.99,59.99,78.66"\)

Student_name ("Frank") 
/* This is a junk line */   
scoreA ("2.00, 15.00, 25.00, 40.00")  
scoreB ("15.00, 24.00, 38.00, 45.00, 80.00")     
final (  
"02.23,15.00,37.88,45.98,70.00",\  
"10.00,28.11,34.56,45.66,57.88",\  
"08.00,19.78,32.88,40.90,57.66",\  
"10.00,27.87,39.99,59.99,78.66"\)

Student_name ("George") 
/* This is a junk line */   
scoreA ("10.00, 15.00, 20.00, 40.00")  
scoreB ("25.00, 33.00, 46.00, 55.00, 60.00")     
final (  
"10.23,25.00,37.88,45.98,68.00",\  
"09.00,28.11,34.56,45.66,60.88",\  
"18.00,19.78,32.88,40.90,79.66",\  
"17.00,27.87,40.99,59.99,66.66"\)

data();      
sub data() {   
    my $cnt = 0;
    while (my @array = <FILE>) {
        foreach $line(@array) {    
            if ($line =~ /Student_name/) {
                $a = $line;

                if ($a =~ /Eric/ or $cnt > 0 ) {
                    $cnt++;
                }
                if ( $cnt > 1 and $cnt <= 3 ) {
                    print $a;
                }
                if ( $cnt > 2 and $cnt <= 4 ) {
                    print $a;
                }
                if ( $cnt == 5 ) {
                    $cnt  =  0;  
                }
            }
        }
    }
}

Eric    final=42.66  
George  final=24.30

编码

Student_name ("Eric")   
/* This is a junk line */   
scoreA ("10.00, 20.00, 30.00, 40.00")  
scoreB ("15.00, 30.00, 45.00, 50.00, 55.00")     
final (  
"12.23,19.00,37.88,45.98,60.00",\  
"07.00,20.11,24.56,45.66,57.88",\  
"05.00,15.78,22.88,40.90,57.99",\  
"10.00,16.87,26.99,38.99,40.66"\)  

Student_name ("Liy") 
/* This is a junk line */   
scoreA ("5.00, 10.00, 20.00, 60.00")  
scoreB ("25.00, 30.00, 40.00, 55.00, 60.00")     
final (  
"02.23,15.00,37.88,45.98,70.00",\  
"10.00,28.11,34.56,45.66,57.88",\  
"08.00,19.78,32.88,40.90,57.66",\  
"10.00,27.87,39.99,59.99,78.66"\)

Student_name ("Frank") 
/* This is a junk line */   
scoreA ("2.00, 15.00, 25.00, 40.00")  
scoreB ("15.00, 24.00, 38.00, 45.00, 80.00")     
final (  
"02.23,15.00,37.88,45.98,70.00",\  
"10.00,28.11,34.56,45.66,57.88",\  
"08.00,19.78,32.88,40.90,57.66",\  
"10.00,27.87,39.99,59.99,78.66"\)

Student_name ("George") 
/* This is a junk line */   
scoreA ("10.00, 15.00, 20.00, 40.00")  
scoreB ("25.00, 33.00, 46.00, 55.00, 60.00")     
final (  
"10.23,25.00,37.88,45.98,68.00",\  
"09.00,28.11,34.56,45.66,60.88",\  
"18.00,19.78,32.88,40.90,79.66",\  
"17.00,27.87,40.99,59.99,66.66"\)

data();      
sub data() {   
    my $cnt = 0;
    while (my @array = <FILE>) {
        foreach $line(@array) {    
            if ($line =~ /Student_name/) {
                $a = $line;

                if ($a =~ /Eric/ or $cnt > 0 ) {
                    $cnt++;
                }
                if ( $cnt > 1 and $cnt <= 3 ) {
                    print $a;
                }
                if ( $cnt > 2 and $cnt <= 4 ) {
                    print $a;
                }
                if ( $cnt == 5 ) {
                    $cnt  =  0;  
                }
            }
        }
    }
}

Eric    final=42.66  
George  final=24.30

在我的评论中，我说解析相当容易。以下是如何做到这一点。由于该问题缺乏适当的文件格式规范，我将假设如下：

该文件由属性组成，这些属性具有以下值：

值是双引号字符串，包含以逗号分隔的数字或单个单词：

value ::= '"' ( word | number ("," number)* ) '"'

空格、反斜杠和注释是不相关的

这是一个可能的实现；我不会详细解释如何编写一个简单的解析器

package Parser;
use strict; use warnings;

sub parse {
  my ($data) = @_;

  # perform tokenization

  pos($data) = 0;
  my $length = length $data;
  my @tokens;
  while(pos($data) < $length) {
    next if $data =~ m{\G\s+}gc
         or $data =~ m{\G\\}gc
         or $data =~ m{\G/[*].*?[*]/}gc;
    if ($data =~ m/\G([",()])/gc) {
      push @tokens, [symbol => $1];
    } elsif ($data =~ m/\G([0-9]+[.][0-9]+)/gc) {
      push @tokens, [number => 0+$1];
    } elsif ($data =~ m/\G(\w+)/gc) {
      push @tokens, [word => $1];
    } else {
      die "unreckognized token at:\n", substr $data, pos($data), 10;
    }
  }

  return parse_document(\@tokens);
}

sub token_error {
  my ($token, $expected) = @_;
  return "Wrong token [@$token] when expecting [@$expected]";
}

sub parse_document {
  my ($tokens) = @_;
  my @properties;
  push @properties, parse_property($tokens) while @$tokens;
  return @properties;
}

sub parse_property {
  my ($tokens) = @_;
  $tokens->[0][0] eq "word"
    or die token_error $tokens->[0], ["word"];
  my $name = (shift @$tokens)->[1];
  $tokens->[0][0] eq "symbol" and $tokens->[0][1] eq '('
    or die token_error $tokens->[0], [symbol => '('];
  shift @$tokens;
  my @vals;
  VAL: {
    push @vals, parse_value($tokens);
    if ($tokens->[0][0] eq 'symbol' and $tokens->[0][1] eq ',') {
      shift @$tokens;
      redo VAL;
    }
  }
  $tokens->[0][0] eq "symbol" and $tokens->[0][1] eq ')'
    or die token_error $tokens->[0], [symbol => ')'];
  shift @$tokens;
  return [ $name => @vals ];
}

sub parse_value {
  my ($tokens) = @_;
  $tokens->[0][0] eq "symbol" and $tokens->[0][1] eq '"'
    or die token_error $tokens->[0], [symbol => '"'];
  shift @$tokens;

  my $value;

  if ($tokens->[0][0] eq "word") {
    $value = (shift @$tokens)->[1];
  } else {
    my @nums;
    NUM: {
      $tokens->[0][0] eq 'number'
        or die token_error $tokens->[0], ['number'];
      push @nums, (shift @$tokens)->[1];
      if ($tokens->[0][0] eq 'symbol' and $tokens->[0][1] eq ',') {
        shift @$tokens;
        redo NUM;
      }
    }
    $value = \@nums;
  }

  $tokens->[0][0] eq "symbol" and $tokens->[0][1] eq '"'
    or die token_error $tokens->[0], [symbol => '"'];
  shift @$tokens;

  return $value;
}

作为下一步，我们希望将其转换为嵌套哈希，以便获得结构

{
  Eric => {
    scoreA => [...],
    scoreB => [...],
    final  => [[...], ...],
  },
  Liy => {...},
  ...,
}

因此，我们只需通过以下小部件运行它：

sub properties_to_hash {
  my %hash;
  while(my $name_prop = shift @_) {
    $name_prop->[0] eq 'Student_name' or die "Expected Student_name property";
    my $name = $name_prop->[1];
    while( @_ and $_[0][0] ne 'Student_name') {
      my ($prop, @vals) = @{ shift @_ };
      if (@vals > 1) {
        $hash{$name}{$prop} = \@vals;
      } else {
        $hash{$name}{$prop} = $vals[0];
      }
    }
  }
  return \%hash;
}

所以我们有了主代码

my $data = properties_to_hash(Parser::parse( $file_contents ));

现在我们可以进入问题的第二部分：计算你的分数。也就是说，一旦你明确了你需要做什么

编辑：双线性插值设f为在某个坐标处返回值的函数。如果我们在这些坐标上有一个值，我们可以返回它。否则，我们将使用下一个已知值执行双线性插值

双线性插值的公式为：
现在，
scoreA
表示
final
表中数据点在第一轴上的位置，
scoreA
表示第二轴上的位置。我们必须做到以下几点：

断言请求的值
x，y
在范围内

获取下一个较小和下一个较大的位置

执行插值

现在，可以执行根据上述公式的插值，但是可以将已知索引处的每个访问更改为通过物理索引的访问，因此
f（x_1，y_2）
将变为

$final->[$x_i1][$y_i2]

子f的详细说明

sub f{…}
声明了一个名为
f
的sub，尽管这可能是个坏名字<代码>双线性插值可能是一个更好的名称

my（$data，$x，$y）=@
说明我们的sub接受三个参数：

$data
，一个散列引用，包含字段
scoreA
、
scoreB
和
final
，这些字段是数组引用

$x
，沿
scoreA
轴的位置，需要插值

$y
，沿
scoreB
轴的位置，需要插值

接下来，我们要断言
$x
和
$y
的位置在边界内是有效的。
$data->{scoreA}
中的第一个值是最小值；最大值位于最后一个位置（索引
-1
）。为了同时获得这两个数据，我们使用一个数组切片。切片一次访问多个值并返回一个列表，如
@array[1,2]
。因为我们使用的是使用引用的复杂数据结构，所以必须在
$data->{scoreA}
中取消对数组的引用。这使切片看起来像
@{$data->{scoreA}[0，1]
现在我们有了
$x_min
和
$x_max
值，我们抛出并出错，除非请求的值
$x
在最小/最大值定义的范围内。这是真的当

$x_min <= $x && $x <= $x_max
例如，可以抛出如下错误

indices (10, 500) out of range ([20, 30], [25, 57]) at script.pl line 42
在这里我们可以看到，
$x
的值太小，
$y
太大

下一个问题是找到邻近的值。假设
scoreA
保持
[1,2,3,4,5]
，并且
$x
是
3.7
，我们要选择
3
和
4
的值。但是，因为我们可以在稍后使用一些巧妙的技巧，我们宁愿记住相邻值的位置，而不是值本身。因此，在上面的示例中，这将给出
2
和
3
（请记住，箭头是以零为基础的）
我们可以通过循环数组的所有索引来实现这一点。当我们找到一个≤ <代码>$x，我们记得索引。例如，
3
是≤
$x
，因此我们记住了索引
2
。对于下一个更高的值，我们必须有点谨慎：显然，我们可以只取下一个索引，所以
2+1=3
。但是现在假设
$x
是
5
。这通过了边界检查。第一个值是≤
$x
将是值
5
，因此我们可以记住位置
4
。但是，位置
5
没有条目，因此我们可以使用当前索引本身。因为这将导致以后被零除，我们最好记住位置
3
和
4
（值
4
和
5
）
表示为代码，即

my ($x_i1, $x_i2); my @scoreA = @{ $data->{scoreA} }; # shortcut to the scoreA entry for my $i (0 .. $#scores) { # iterate over all indices: `$#arr` is the last idx of @arr if ($scores[$i] <= $x) { # do this if the current value is ≤ $x if ($i != $#scores) { # if this isn't the last index ($x_i1, $x_i2) = ($i, $i+1); } else { # so this is the last index ($x_i1, $x_i2) = ($i-1, $i); } last; # break out of the loop } }

现在我们有了所有周围的值
$x1、$x_2、$y_1、$y_2
，它们定义了要在其中执行双线性插值的矩形。数学公式很容易翻译成Perl：只需选择正确的运算符（
*
，而不是
·
）
die "indices ($x, $y) out of range ([$x_min, $x_max], [$y_min, $y_max])"

indices (10, 500) out of range ([20, 30], [25, 57]) at script.pl line 42

my ($x_i1, $x_i2); my @scoreA = @{ $data->{scoreA} }; # shortcut to the scoreA entry for my $i (0 .. $#scores) { # iterate over all indices: `$#arr` is the last idx of @arr if ($scores[$i] <= $x) { # do this if the current value is ≤ $x if ($i != $#scores) { # if this isn't the last index ($x_i1, $x_i2) = ($i, $i+1); } else { # so this is the last index ($x_i1, $x_i2) = ($i-1, $i); } last; # break out of the loop } }

my ($x_1, $x_2) = @{$data->{scoreA}}[$x_i1, $x_i2];