Open a file, read its contents, use regex to build a list from the contents, then print the list with Python

I am using "import re" and "import sys".

In the terminal, when I type "1.py a.txt", I want it to read "a.txt", which has the following contents:

17:18:42.525964 IP 66.185.85.146.80 > 192.168.0.15.34436: Flags [.], seq 1:1449, ack 2555, win 1320, options [nop,nop,TS val 3551057710 ecr 43002332], length 1448
17:18:42.526623 IP 66.185.85.146.80 > 192.168.0.15.34436: Flags [.], seq 1449:2897, ack 2555, win 1320, options [nop,nop,TS val 3551057710 ecr 43002332], length 1448
17:18:42.526900 IP 192.168.0.15.34436 > 66.185.85.146.80: Flags [.], ack 2897, win 1444, options [nop,nop,TS val 43002448 ecr 3551057710], length 0
17:18:42.527694 IP 66.185.85.146.80 > 192.168.0.15.34436: Flags [.], seq 2897:14481, ack 2555, win 1320, options [nop,nop,TS val 3551057710 ecr 43002332], length 11584
17:18:42.527716 IP 192.168.0.15.34436 > 66.185.85.146.80: Flags [.], ack 14481, win 1444, options [nop,nop,TS val 43002448 ecr 3551057710], length 0
17:18:42.528794 IP 66.185.85.146.80 > 192.168.0.15.34436: Flags [.], seq 14481:23169, ack 2555, win 1320, options [nop,nop,TS val 3551057710 ecr 43002332], length 8688
17:18:42.528813 IP 192.168.0.15.34436 > 66.185.85.146.80: Flags [.], ack 23169, win 1444, options [nop,nop,TS val 43002448 ecr 3551057710], length 0
17:18:42.545191 IP 192.168.0.15.60030 > 52.2.63.29.80: Flags [.], seq 4113773418:4113774866, ack 850072640, win 270, options [nop,nop,TS val 43002452 ecr 9849626], length 1448
Then use regex to remove everything except the IP addresses and the length (total), and print it as:

source: 66.185.85.146 dest: 192.168.0.15 total:1448
source: 66.185.85.146 dest: 192.168.0.15 total:1448
source: 192.168.0.15 dest: 66.185.85.146 total:0
But if there are duplicates, the output should look like this instead, with the totals of the duplicates added together:

source: 66.185.85.146 dest: 192.168.0.15 total:2896
source: 192.168.0.15 dest: 66.185.85.146 total:0
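Also, if I type "-s" in the terminal like this:

"1.py -s a.txt"

it should sort: with a plain -s it sorts and prints the contents, and with -s ip it sorts by IP.

This is what I currently have for each piece, and I would like to know how to use them together: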

#!/usr/bin/python3
import re
import sys

file = sys.argv[1]
a = open(file, "r")

for line in a:
   line = line.rstrip()
   c = re.findall(r'^(?:[0-9]{1,3}\.){3}[0-9]{1,3}$',line) #Yes I know its not the best regex for this, but I am testing it out for now
   d = re.findall(r'\b(\d+)$\b',line)

   if len(c) > 0 and len(d) > 0:
      print("source:", c[0],"\t","dest:",c[1],"\t", "total:",d[0])

This is what I have so far. I don't know how to use "-s" or how to sort, nor how to remove duplicates and add up their totals when removing them.

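To read -s, you may want a library that parses arguments, such as the standard argparse. It lets you specify the arguments your script expects along with their descriptions, then parses them and checks their format.

To sort a list, you can use the sorted(my_list) function.

Finally, to make sure there are no duplicates, you can use a set. This loses the order of the list, but since you will be sorting it afterwards, that should not be a problem.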
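As a rough illustration of the two built-ins mentioned above, here is a minimal sketch. The records list and its values are made up for demonstration (they are assumptions, not output from the question's script). Note that a set only drops exact duplicates; it does not add their lengths together, so the duplicated pair below keeps a total of 1448 rather than 2896.

```python
# Hypothetical sample records: (source, dest, length)
records = [
    ("66.185.85.146", "192.168.0.15", 1448),
    ("192.168.0.15", "66.185.85.146", 0),
    ("66.185.85.146", "192.168.0.15", 1448),  # exact duplicate of the first
]

# set() removes exact duplicates; sorted() returns a new, ordered list.
# Tuples of strings compare element by element, as plain text.
unique_sorted = sorted(set(records))

for source, dest, length in unique_sorted:
    print("source:", source, "dest:", dest, "total:", length)
```

If the totals of duplicates need to be summed, a set is not enough on its own and some form of grouping is required.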

Alternatively, there is Counter, a collection made specifically for adding up grouped values and sorting them:

import re
import sys
from collections import Counter

results = Counter()

with open(sys.argv[1]) as a:
    for line in a:
        line = line.rstrip()
        # IPv4 addresses followed by ".port"; the port is left out of the capture.
        c = re.findall(r'((?:[0-9]{1,3}\.){3}[0-9]{1,3})\.[0-9]+', line)
        # The packet length is the number at the end of the line.
        d = re.findall(r'length (\d+)$', line)

        if len(c) > 1 and len(d) > 0:
            source, destination, length = c[0], c[1], d[0]
            results[(source, destination)] += int(length)

# Print the items sorted by total, largest first.
for (source, destination), length in results.most_common():
    print("source:", source, "\t", "dest:", destination, "\t", "total:", length)
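results.most_common() orders the pairs by total, so the "-s ip" case from the question needs a different sort key. The sketch below is one possible way to do that, under stated assumptions: PACKET_RE, total_lengths and sorted_by_ip are hypothetical names introduced here, and SAMPLE is copied from the first three lines of the question's a.txt. The question's IP pattern is anchored with ^ and $, so it only matches a line that consists of nothing but an IP address; this sketch therefore uses its own pattern. The standard ipaddress module is used so that addresses compare numerically rather than as text.

```python
import ipaddress
import re
from collections import Counter

# Assumed sample input, copied from the tcpdump lines in the question.
SAMPLE = """\
17:18:42.525964 IP 66.185.85.146.80 > 192.168.0.15.34436: Flags [.], seq 1:1449, ack 2555, win 1320, options [nop,nop,TS val 3551057710 ecr 43002332], length 1448
17:18:42.526623 IP 66.185.85.146.80 > 192.168.0.15.34436: Flags [.], seq 1449:2897, ack 2555, win 1320, options [nop,nop,TS val 3551057710 ecr 43002332], length 1448
17:18:42.526900 IP 192.168.0.15.34436 > 66.185.85.146.80: Flags [.], ack 2897, win 1444, options [nop,nop,TS val 43002448 ecr 3551057710], length 0
"""

# Hypothetical pattern: source IP, destination IP (ports skipped), length.
PACKET_RE = re.compile(
    r"IP (\d{1,3}(?:\.\d{1,3}){3})\.\d+ > "
    r"(\d{1,3}(?:\.\d{1,3}){3})\.\d+: .*length (\d+)$"
)


def total_lengths(lines):
    """Sum the packet lengths per (source, destination) pair."""
    totals = Counter()
    for line in lines:
        match = PACKET_RE.search(line.rstrip())
        if match:
            source, dest, length = match.groups()
            totals[(source, dest)] += int(length)
    return totals


def sorted_by_ip(totals):
    """Order the pairs numerically by source IP, then destination IP."""
    return sorted(
        totals.items(),
        key=lambda item: tuple(ipaddress.ip_address(ip) for ip in item[0]),
    )


totals = total_lengths(SAMPLE.splitlines())
for (source, dest), length in sorted_by_ip(totals):
    print("source:", source, "\t", "dest:", dest, "\t", "total:", length)
```

Sorting the address strings directly would place "192.168.0.15" before "66.185.85.146", because "1" sorts before "6" as text; converting to address objects avoids that.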

For the -s argument, what you need is ArgumentParser, for example:

import argparse
...
def main():
    parser = argparse.ArgumentParser()
    # Positional argument for the input file, needed for "1.py a.txt ..."
    parser.add_argument('file', help='file to read')
    parser.add_argument('-s', '--sort', action='append',
                        help='sort specific IP')
    parser.add_argument('-s2', '--sortall', action='store_true',
                        help='sort all the IPs')

    args = parser.parse_args()
    if args.sortall:
        pass  # sort all IPs

    # args.sort is None when -s was never given
    for ip in args.sort or []:
        pass  # sort by ip

if __name__ == '__main__':
    main()
Now you can run the script like this:

1.py a.txt -s 192.168.0.15

Beyond that, putting everything together looks like a homework assignment, so you should read more about Python to work it out.
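The question asks for a plain -s and an "-s ip" form rather than two separate flags. One possible way to model that with argparse, offered only as a sketch, is an option whose value is optional (nargs="?"). The build_parser helper and the "total" and "ip" mode names are assumptions made for this illustration; they do not come from the question or the answer.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(
        description="Summarise packet lengths per IP pair")
    parser.add_argument("file", help="capture file to read, e.g. a.txt")
    # nargs="?" makes the value optional:
    #   no -s at all -> None, bare -s -> const, "-s ip" -> "ip"
    parser.add_argument("-s", "--sort", nargs="?", const="total",
                        choices=["total", "ip"],
                        help="sort the output by total (default) or by ip")
    return parser


parser = build_parser()
# parse_args accepts an explicit list, which is handy for trying things out.
print(parser.parse_args(["a.txt"]).sort)
print(parser.parse_args(["a.txt", "-s"]).sort)
print(parser.parse_args(["a.txt", "-s", "ip"]).sort)
```

With this layout the file name should come before -s. As far as I can tell, "1.py -s a.txt" would make argparse treat a.txt as the value of -s and reject it as an invalid choice.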

ArgumentParser still has to be added. That aside, the code below works fine with an input file path:

import re
from collections import defaultdict

# Text mode, so the str regex patterns below also work on Python 3.
with open(r"C:\ips.txt", 'r') as ip_file:
    txt = ip_file.read()
    # "src.port > dst.port" pairs
    ip = re.findall(r'[0-9.]+[\s]+[>][\s0-9.]+', txt)
    # Drop the trailing port from each address and join as "src>dst"
    ip1 = ['>'.join(re.findall(r'[0-9.]+(?=[.])', i)) for i in ip]
    # Every number that follows "length "
    packs = re.findall(r'(?<=length )[0-9]+', txt)
    data = zip(ip1, packs)
    # Group the lengths by "src>dst" key
    d = defaultdict(list)
    for k, v in data:
        d[k].append(v)
    for i, j in d.items():
        source, destination = i.split('>')[0], i.split('>')[1]
        print("source: {0} destination: {1} total: {2}".format(source, destination, sum(map(int, j))))

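Thanks, trying this out.
This does not add up the lengths. Instead, it will add multiple source/destination combinations that have different lengths, and it will drop source/destination combinations whose length has already been seen.
Good point. Fixing it right away.
I think using a Counter is better suited to the job. I updated my answer to reflect that.
Your assignment line assigns the variables wrongly (or you like confusing variable names ^-). It should be source,destination,length=c[0],c[1],d[0]
I am getting an indentation error at "for ip in args.sort:"
@eLRuLL I tried it without the ports, as you did, but I got an error: "IndexError: list index out of range"
@sislamp please remember to accept an answer that worked for you.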

The -s2 flag from the ArgumentParser answer is used the same way:

1.py a.txt -s2

Output of the defaultdict script for the sample input (line order may vary with the Python version):

source: 192.168.0.15 destination: 66.185.85.146 total: 0
source: 66.185.85.146 destination: 192.168.0.15 total: 23168
source: 192.168.0.15 destination: 52.2.63.29 total: 1448