Parsing Juniper firewall logs into a text file in Perl

I am new to Perl and to programming. I have limited exposure to writing shell scripts in Unix, and I have been using the Camel book (Programming Perl, third edition) along with various Perl tutorials I have come across on the web. I am trying to take the log file that our Juniper firewall creates every night and produce a report on VPN sessions for research purposes. I am writing and modifying a script that reads the log file, parses several variables out of each line, and writes the report to a text file in the following format:

UserID DHCP          Logon    Timeout  Maxsession Logout   Closed   Duration
User1  xxx.xx.xxx.xx 06:23:47                     06:20:45 06:20:45 00:14:33
User2  xxx.xx.xxx.xx 08:01:59          16:01:59            16:01:59 00:57:27
User3  xxx.xx.xxx.xx 09:04:20 09:14:20                     09:14:24 00:10:00
User1  xxx.xx.xxx.xx 17:01:01                     18:05:01 18:05:01 01:04:00

The three cases I am interested in capturing are:
 1. User logs in, user logs out
 2. User logs in, user times out
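 3. User logs in, max session is reached, user times out

I am not sure how to handle the timestamps in order to get a duration for the activities that do not provide one. Sometimes the session duration is provided, but for events that do not provide it I need to figure out how to normalize the applicable timestamps and do the calculation myself. Any ideas or suggestions are greatly appreciated, thanks.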

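Events:

When a user logs in, the following line is generated in the log file:

Nov 30 09:02:45 100.10.10.100 Juniper: 2014-12-08 09:02:02 - ive - [101.10.100.10] DOMAIN\user(myRealm)[myRole] - VPN Tunneling: Session started for user with IPv4 address 100.11.11.123, hostname userHostName

When a user logs out, the following lines are generated in the log file:

Nov 30 14:30:52 100.10.10.100 Juniper: 2014-11-30 14:30:22 - ive - [10.1.100.100] user1(VPN1)[vpn1] - Logout from 100.10.10.100 (session:12345678)

Nov 30 14:30:52 100.10.10.100 Juniper: 2014-11-30 14:30:22 - ive - [10.1.100.100] user1(VPN1)[] - Closed connection to 100.10.10.1 after 1234 seconds, with 1234567 bytes read and 123456789 bytes written

When a user times out, the following lines are generated in the log file:

Nov 30 14:30:52 100.10.10.100 Juniper: 2014-11-30 14:30:22 - ive - [10.1.100.100] user1(VPN1)[] - Closed connection to 100.10.10.1 after 1234 seconds, with 1234567 bytes read and 123456789 bytes written

Nov 30 14:30:52 100.10.10.100 Juniper: 2014-11-30 14:30:22 - ive - [10.1.100.100] user1(vpn1)[vpn1] - Session timed out for user/vpn1 (session:00000000) due to inactivity (last access at 13:43:20 2014/11/30)

When a user reaches the max session timeout, the following lines are generated in the log file:

Nov 30 14:30:52 100.10.10.100 Juniper: 2014-11-30 14:30:22 - ive - [10.1.100.100] user1(VPN1)[vpn1] - Max session timeout for user/vpn1 (session:00000000)

Nov 30 14:30:52 100.10.10.100 Juniper: 2014-11-30 14:30:22 - ive - [10.1.100.100] user1(VPN1)[] - Closed connection to 100.10.10.1 after 1234 seconds, with 1234567 bytes read and 123456789 bytes written

My code so far is: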


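If you are processing a large volume of logs, invoking the regex matching engine many times per line could become a performance issue. There seems to be a lot to gain from first splitting each log line using the hyphen as a delimiter; split is a good option here. From then on, it should be more efficient to run regex matches only on the relevant substrings.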
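A minimal sketch of this split-first idea, assuming the log layout shown in the question (parts separated by " - " after the `Juniper:` marker). The helper name `parse_line` and the keys of the returned hash are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: split one log line into its " - " separated parts,
# then run a small regex only on the message part.
sub parse_line {
    my ($line) = @_;

    # Drop the syslog prefix up to and including "Juniper: ".
    return unless $line =~ s/^.*?Juniper:\s+//;

    # A limit of 4 keeps any " - " inside the message text intact.
    my ($stamp, $host, $who, $message) = split / - /, $line, 4;
    return unless defined $message;

    # $who looks like "[10.1.100.100] user1(VPN1)[vpn1]".
    my ($user) = $who =~ /\]\s+([^(\[\s]+)/;

    # Only the message needs a per-event regex.
    my ($event) = $message =~ /^(Logout|Closed connection|Session timed out|Max session timeout)/;

    return { stamp => $stamp, user => $user, event => $event, message => $message };
}

my $rec = parse_line('Nov 30 14:30:52 100.10.10.100 Juniper: 2014-11-30 14:30:22 - ive - [10.1.100.100] user1(VPN1)[vpn1] - Logout from 100.10.10.100 (session:12345678)');
print "$rec->{stamp} $rec->{user} $rec->{event}\n";
```

The split runs once per line, and the per-event regex is anchored at the start of a short substring instead of scanning the whole line.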


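As for maintaining context across lines that are not necessarily adjacent, I assume you would use the IP address as the identifier for a session. A hash would probably suffice as a first option for keeping track of which messages you have seen for which IPs.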

Use a hash to store the start information, then look it up when you reach the corresponding end line.

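For example, as you read each line, I would parse the date, check for "Juniper:", and immediately discard every line that does not fit the general pattern. You can probably check for the key phrases at the same time, and then do the specific processing each case needs inside the loop: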

my %starts;

while (defined(my $line = <LOGREPORT>)) {
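    # LOGREPORT is assumed to be an already opened handle on the log file.
    # The event phrases are spelled as in the sample log lines from the question.
    if ($line =~ s/^... (?:[\d ]\d) \d\d:\d\d:\d\d \S+ Juniper: (\d{4}-\d\d-\d\d \d\d:\d\d:\d\d) - \S+ - \S+ (\S+) - (Primary authentication successful|Logout|Closed connection|Session timed out|Max session timeout)//) {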
        my($time, $vpn, $state) = ($1, $2, $3);

        # TODO Normalize $time here if you wish

        if ($state eq 'Primary authentication successful') {
            $starts{$vpn} = $time;

        } elsif (defined(my $start = delete $starts{$vpn})) {
            # TODO Process other information needed from the line and output one line...
            # TODO Also you can use the normalized ($time - $start) for your duration if it isn't available on the rest of the line.

        } else {
            warn "No 'Primary authentication successful' found for: $vpn\n";
        }
    }
}
If you can do it with one big regex plus a few simple ones, it will run fast.
Then again, does efficiency really matter here?

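For coding efficiency, I would use regexes to isolate the fields needed from each type of line. I am assuming the report needs to be generated a few times a day and that execution speed is not an issue.

I would use hashes as my data structure. The key of the first hash is the userid, and its value is a reference to a second hash. The key of the second hash is the action (authentication, logout, session close, timeout, maxSession, and so on), and its value is the timestamp of that action. This consolidates all the data for a particular user into one data structure. I am also assuming the computer has enough RAM to hold all the data in memory, and that only the timestamp of each action is needed.

After capturing all the data, I would generate the report by walking the hash in sorted userId order, producing one report line at a time.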
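The nested-hash layout described above might look like the following sketch. The sample events and the names `%sessions`, `record_event` and `report_lines` are made up for illustration; only the shape (userid, then action, then timestamp) comes from the answer.

```perl
use strict;
use warnings;

# Outer key: userid. Inner key: action name. Value: timestamp of that action.
my %sessions;

# Hypothetical helper that files one parsed event under its user.
sub record_event {
    my ($userid, $action, $timestamp) = @_;
    $sessions{$userid}{$action} = $timestamp;   # inner hash is autovivified
}

# Build one report line per user, in sorted userid order.
sub report_lines {
    my @actions = qw(login timeout maxsession logout close);
    my @lines;
    for my $userid (sort keys %sessions) {
        # Actions that never occurred are shown as "--".
        my @cells = map { $sessions{$userid}{$_} // '--' } @actions;
        push @lines, sprintf('%-8s ' . join(' ', ('%-8s') x @actions), $userid, @cells);
    }
    return @lines;
}

# Made-up sample events, as they might come out of the parsing loop.
record_event('user2', 'login',  '08:01:59');
record_event('user1', 'login',  '06:23:47');
record_event('user1', 'logout', '06:38:20');
record_event('user1', 'close',  '06:38:20');

print "$_\n" for report_lines();
```

As written, this keeps only one session per user, so a second login by the same user overwrites the first. Storing an array of such inner hashes per user would be one way around that.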

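Two more thoughts:

I would think carefully about dates and times, so that you catch users who authenticate before midnight and end their session after midnight. I would store the date and time information as seconds since the epoch, which makes calculating durations much easier.

Also, I would find out whether the log file records GMT or local time. You need to know this in order to handle daylight saving time changes correctly.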
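A small sketch of the epoch-seconds idea, using the core `Time::Local` module. It assumes the `YYYY-MM-DD HH:MM:SS` stamp format from the question's log lines and, purely for illustration, that the log is in GMT. The helper names `to_epoch` and `hms` are hypothetical.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Convert "YYYY-MM-DD HH:MM:SS" to seconds since the epoch.
# timegm is used on the ASSUMPTION that the log is in GMT; for local-time
# logs, Time::Local's timelocal takes the same arguments.
sub to_epoch {
    my ($stamp) = @_;
    my ($y, $mo, $d, $h, $mi, $s) =
        $stamp =~ /^(\d{4})-(\d\d)-(\d\d) (\d\d):(\d\d):(\d\d)$/
        or return;
    return timegm($s, $mi, $h, $d, $mo - 1, $y);   # month is 0-based
}

# Format a number of seconds as HH:MM:SS.
sub hms {
    my ($secs) = @_;
    return sprintf '%02d:%02d:%02d', int($secs / 3600), int($secs % 3600 / 60), $secs % 60;
}

# A session that crosses midnight.
my $start = to_epoch('2014-11-30 23:50:00');
my $end   = to_epoch('2014-12-01 00:04:33');
print hms($end - $start), "\n";   # prints 00:14:33
```

Because both stamps become plain integers, the subtraction works the same whether or not the session spans a date change.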


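Hope this helps.

The code below accomplishes what I originally set out to do with my script. I learned a lot about Perl while doing this, and I appreciate everyone's feedback, even where I did not incorporate it. Many thanks.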

#!/usr/bin/perl
use warnings;
use strict;

#This script converts the specified log file to a report showing each user's ID, DHCP address, logon time,
#logout time, timeout time, and max timeout time.

#Arrays needed for script
my @fields;
my @user;
my @dhcp;
my @login;
my @logout;
my @close;
my @timeout;
my @maxtimeout;

#Scalars needed for script
my $localtime = localtime();
my $input = '/home/user/bin/Temp/log.txt';
my $output = '>/home/user/bin/Temp/vpnreport.txt';
my $line;
my $fields;
my $userid;
my $jdate;
my $jtime;
my $dhcpaddr;
my $srcaddr;
my $sessionid;
my $sessiondur;
my $lastacctime;
my $lastaccdate;
my $bytesr;
my $bytesw;
my $timestamp;
my $maxrow = 0;
my $currow = 0;
my $i = 0;

#Open the log file
open (VPNLOG, $input) or die "Unable to open the input file:$!\n";

#Open the file(s) to be written to in clobber mode
open (VPNREPORT, $output) or die "Unable to open the output file:$!\n";

#Setup to while loop to process each line
while ($line = <VPNLOG>) {
chomp $line; #Remove the line breaks

#Strip the log's timestamp and IP
$line =~ s/.*Juniper:\s(.*)$/$1/;

#If line contains "Administrators" or "(Admin Users)" ignore it and move on to the next line
unless ($line =~ m/Administrators|(Admin Users)|System()/) {
#Split the line into the @fields array on every " " encountered
@fields = split (/ /, $line);
$jdate = $fields[0];                     #Juniper datestamp
$jdate =~ s/-//g;                        #Remove any occurrence of "-" from the date stamp
$jtime = $fields[1];                     #Juniper timestamp
$userid = $fields[6];                    #User ID
$userid =~ s/XXXXXXX.|\(.*\)\[(.*)\]//g; #Remove the "XXXXXXX\" preceding the username and the "(Realm)[Role]"
                                         #trailing the username
#Normalize and recombine jtime and jdate here:
$timestamp = "$jdate $jtime";
#Check to see if line contains string "VPN Tunneling: Session started for user"

if ($line =~ m/VPN Tunneling: Session started for user/) {
++$maxrow;
$dhcpaddr = $fields[17];          #Destination IP address
$dhcpaddr =~ s/,//g;              #Remove "," trailing the IP address
$user[$maxrow] = $userid;
$dhcp[$maxrow] = $dhcpaddr;
$login[$maxrow] = $timestamp;
$logout[$maxrow] = "--";
$close[$maxrow] = "--";
$timeout[$maxrow] = "--";
$maxtimeout[$maxrow] = "--";
}

elsif ($line =~m/Logout/) {
$dhcpaddr = $fields[10];           #DHCP IP address
$sessionid = $fields[11];          #Session ID
$sessionid =~ s/\(session:|\)//g;   #Remove the "(session:" and ")" from the session ID
for ($currow = $maxrow; $currow >= 1; $currow--) {
    if ($user[$currow] eq $userid and $logout[$currow] eq "--") {
        $logout[$currow] = $timestamp;
        last;
    }
  }
}

elsif ($line =~m/Closed connection/) {
$dhcpaddr = $fields[11];         #DHCP IP Address
$sessiondur = $fields[13];       #Duration of session in seconds
$bytesr = $fields[16];           #Bytes read
$bytesw = $fields[20];           #Bytes written
for ($currow = $maxrow; $currow >= 1; $currow--) {
     if ($user[$currow] eq $userid and $close[$currow] eq "--") {
         $close[$currow] = $timestamp;
         last;
     }
   }
}

elsif ($line =~m/Session timed out/) {
$sessionid = $fields[13];          #Session ID
$sessionid =~ s/\(session:|\)//g;  #Remove the "(session:" and ")" from the session ID
$lastacctime = $fields[20];        #Last accessed time
$lastaccdate = $fields[21];        #Last accessed date
$lastaccdate =~ s/\).//g;          #Remove the ")" from the last access date
for ($currow = $maxrow; $currow >= 1; $currow--) {
   if ($user[$currow] eq $userid and $timeout[$currow] eq "--") {
       $timeout[$currow] = $timestamp;
       last;
     }
   }
}

elsif ($line =~m/Max session timeout/) {
$sessionid = $fields[13];          #Session ID
$sessionid =~ s/\(session:|\).//g; #Remove the "(session:" and ")" from the session ID
for ($currow = $maxrow; $currow >= 1; $currow--) {
     if ($user[$currow] eq $userid and $maxtimeout[$currow] eq "--") {
         $maxtimeout[$currow] = $timestamp;
         last;
    }
  }
}

}
}

#Define the format then output file(s) using printf
#Print the column headers: UserID, DHCP, Logon, Logout, Timeout, Maxtimeout, Close stamp
printf VPNREPORT ("%-12s %-12s %-18s %-18s %-18s %-18s %-18s\n", "UserID", "DHCP", "Logon", "Logout", "Timeout", "Maxtimeout", "Close stamp");
print VPNREPORT "-" x 120, "\n";

#Newest record at top of report
#for ($i = $maxrow; $i >= 1; $i--) {

#Oldest record at top of report (rows are stored starting at index 1)
for ($i = 1; $i <= $maxrow; $i++) {
printf VPNREPORT ("%-12s %-12s %-18s %-18s %-18s %-18s %-18s\n", $user[$i], $dhcp[$i], $login[$i], $logout[$i], $timeout[$i], $maxtimeout[$i], $close[$i]);
}

#Close the input and output files
close (VPNLOG);
close (VPNREPORT);

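Comments:

Please don't write PERL; according to your link, it should actually be Perl.

I will look into the split command as you suggest. The problem I foresee with using the IP address as the session identifier is that when a user logs out of the VPN and then logs back in, they get the same IP as in their previous session. Would you recommend building my own session identifier from the log, or creating one when "Primary authentication successful" is found? Something like creating a variable after the username, auto-incrementing it with if statements once a session logout/timeout/maxtimeout is found, and checking whether the value exists before creating a new row?

Generating a unique session key (for example, some kind of salt concatenated to the IP) might be an easy way. A very simple way to check whether a particular key exists in a hash is the exists function.

I had never thought of that; I will keep it in mind as I finish this script.

As it turned out, I did not need a hash to get this working, although it would have been helpful. There is a session ID, but it is only reported on timeout and logout/close events, not when the session starts.

That seems like a good idea; I am going to give it a try. Efficiency is not critical at the moment, because the log file is fairly small and the report runs only once a day. In the future the log file will grow as more users are added, so I would like to learn a more efficient way to parse string elements than running a large number of regex matches.

If you don't mind the output being ordered by end time rather than start time, you can do everything in a single pass and emit each output line as you parse the end line. If you want it ordered by start time, you can either sort the output externally, or go to the trouble of collecting all the information in memory, sorting it, and then printing it. You might also consider writing CSV and loading it into a spreadsheet, which makes the output easy to manipulate.

The problem I am facing is that the length and content of the log string differ for each type of event. The other problem is that every line in the log contains the timestamp and date before "Juniper:", so there is not much point in discarding lines on that basis. I am not sure I am explaining this well, but I need to treat all the strings as variable length for each type of event. Perhaps regex matching really is my only option?

Actually, the left part of the log line, up to the keyword, looks fairly uniform to me, perhaps because I have seen much worse. My suggestion is to parse that part off first and then use a separate regex for each kind of right-hand part. To keep things simple, you could start with several independent regexes, as in your original code, but extend each one to parse the whole line. Once that works, check whether it is fast enough. If it is not, you can combine all the regexes into one, possibly factoring out the common pieces and possibly with some monkey business to improve speed.

I see what you mean now; I will give that a go.

Thank you for these ideas. I had not considered some of the things you mention here, especially the time component. What would this look like as code, and where would it go in my current code? I think I am struggling to understand how to feed the data into a hash line by line.

A Google search will turn up many suitable hits, some of which seem relevant to your needs.

I was actually looking at that exact page last night. The problem is that their data is all hard-coded; if I substitute scalars for the strings, will it still populate correctly? Also, they give an example of how to populate a hash structure: would that replace my current split code, or go before or after it? Thanks in advance.

Once you have isolated the information needed to populate the data structure, the structure no longer cares where the data came from, only about the values. The split comes first: it extracts the information and populates the scalars. You then use the scalars in place of the literal strings to populate the data structure. I suggest you first simulate on paper the processing of one line of the file, to teach yourself the correct order of operations. The simulation should cover every step from reading the file to printing the report, including the report header if there is one.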
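The session-key suggestion from the comments could be sketched as follows. This is only an illustration under assumptions: the key is built from the userid plus a per-user counter (rather than a salt on the IP), and the names `start_session`, `end_session`, `%open`, `%count` and `%session` are hypothetical.

```perl
use strict;
use warnings;

my %open;      # userid => key of that user's currently open session
my %count;     # userid => number of sessions seen so far
my %session;   # session key => { login => ..., end => ... }

# Hypothetical helper: called when a "session started" line is parsed.
sub start_session {
    my ($userid, $timestamp) = @_;
    my $key = join '#', $userid, ++$count{$userid};   # e.g. "user1#2"
    $open{$userid} = $key;
    $session{$key} = { login => $timestamp };
    return $key;
}

# Hypothetical helper: called for logout/timeout lines.
sub end_session {
    my ($userid, $timestamp) = @_;
    # exists tells us whether a start line was seen for this user.
    unless (exists $open{$userid}) {
        warn "No open session for $userid\n";
        return;
    }
    my $key = delete $open{$userid};
    $session{$key}{end} = $timestamp;
    return $key;
}

start_session('user1', '06:23:47');
end_session('user1', '06:38:20');
start_session('user1', '17:01:01');   # same user (and likely same IP) again
end_session('user1', '18:05:01');

print "$_: $session{$_}{login} -> $session{$_}{end}\n" for sort keys %session;
```

Because each login gets its own key, a user who reconnects with the same IP produces a second record instead of overwriting the first.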