Parsing a file of IP addresses with Python

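I have a file containing a number of IP addresses: about 900 IPs spread over 4 lines of text. I want the output to have one IP per line. How can I do that? Based on other code I came up with the following, but it fails because there are multiple IPs on a single line: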

import sys
import re

try:
    if sys.argv[1:]:
        print "File: %s" % (sys.argv[1])
        logfile = sys.argv[1]
    else:
        logfile = raw_input("Please enter a log file to parse, e.g /var/log/secure: ")
    try:
        file = open(logfile, "r")
        ips = []
        for text in file.readlines():
           text = text.rstrip()
           regex = re.findall(r'(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})$',text)
           if regex is not None and regex not in ips:
               ips.append(regex)

        for ip in ips:
           outfile = open("/tmp/list.txt", "a")
           addy = "".join(ip)
           if addy is not '':
              print "IP: %s" % (addy)
              outfile.write(addy)
              outfile.write("\n")
    finally:
        file.close()
        outfile.close()
except IOError, (errno, strerror):
        print "I/O Error(%s) : %s" % (errno, strerror)

The findall function returns a list of matches, and your code is not iterating over each match:

regex = re.findall(r'(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})$',text)
if regex is not None:
    for match in regex:
        if match not in ips:
            ips.append(match)

The $ anchor in your expression prevents you from finding anything except the last entry. Remove it, then loop over the list returned by .findall().
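As a minimal sketch of that fix (the sample line and the variable names are made up for illustration), dropping the trailing $ lets re.findall() return every address on a line, and the result can then be looped over:

```python
import re

# Same pattern as in the question, minus the trailing "$" anchor.
IP_PATTERN = re.compile(r'(?:\d{1,3})\.(?:\d{1,3})\.(?:\d{1,3})\.(?:\d{1,3})')

# Hypothetical input line holding several addresses, one of them repeated.
text = "10.0.0.1 192.168.1.20 10.0.0.1 172.16.5.4"

ips = []
for match in IP_PATTERN.findall(text):
    # Keep first-seen order and skip duplicates.
    if match not in ips:
        ips.append(match)

for ip in ips:
    print(ip)
```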

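re.findall() will always return a list, which may be empty.

If you only need unique addresses, use a set instead of a list.

If you need to validate IP addresses (including ignoring private networks and local addresses), consider using a dedicated library for that rather than a regular expression.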
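The library recommended in that last note is not named here. As one possible approach, assuming Python 3 and its standard-library ipaddress module, the sketch below validates candidate strings and drops private and loopback addresses. The helper name public_ips is made up for illustration.

```python
import ipaddress


def public_ips(candidates):
    """Return valid IPv4 strings that are neither private nor loopback."""
    kept = []
    for candidate in candidates:
        try:
            address = ipaddress.IPv4Address(candidate)
        except ipaddress.AddressValueError:
            continue  # not a valid dotted-quad IPv4 address
        if address.is_private or address.is_loopback:
            continue
        kept.append(str(address))
    return kept


print(public_ips(["8.8.8.8", "10.0.0.1", "127.0.0.1", "999.999.999.999"]))
```

Only 8.8.8.8 survives in this sample: the other entries are private, loopback, or out of range.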

Without the re.MULTILINE flag, $ matches only at the end of the string.
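A small demonstration of that behavior, using a made-up three-line string:

```python
import re

data = "1.1.1.1\n2.2.2.2\n3.3.3.3"
pattern = r"\d{1,3}(?:\.\d{1,3}){3}$"

# Without re.MULTILINE, "$" anchors only at the end of the whole string.
last_only = re.findall(pattern, data)

# With re.MULTILINE, "$" also matches just before each newline.
every_line = re.findall(pattern, data, re.MULTILINE)

print(last_only)
print(every_line)
```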

To make debugging easier, split the code into parts that you can test independently:

def extract_ips(data):
    return re.findall(r"\d{1,3}(?:\.\d{1,3}){3}", data)

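The regex filters out some valid IPs. Conversely, it matches invalid strings such as 999.999.999.999. You can validate each candidate instead; you could even split the input into pieces without searching for IPs at all and use validation functions to keep only the valid ones.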
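The validation functions this answer refers to are not preserved here. One way to sketch the idea, assuming whitespace-separated input and the standard-library socket.inet_pton() as the validator (both are assumptions, as are the helper names), is to split the text and keep only the tokens that parse:

```python
import socket


def is_valid_ipv4(token):
    """Return True if token is a strict dotted-quad IPv4 address."""
    try:
        socket.inet_pton(socket.AF_INET, token)
    except (OSError, ValueError):
        return False
    return True


def keep_valid_ips(data):
    """Split on whitespace and keep tokens that are valid IPv4 addresses."""
    return [token for token in data.split() if is_valid_ipv4(token)]


print(keep_valid_ips("1.2.3.4 999.999.999.999 hello 192.168.0.1"))
```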

If the input file is small and you don't need to preserve the original order of the IPs:

with open(filename) as infile, open(outfilename, "w") as outfile:
    outfile.write("\n".join(set(extract_ips(infile.read()))))

Otherwise:

with open(filename) as infile, open(outfilename, "w") as outfile:
    seen = set()
    for line in infile:
        for ip in extract_ips(line):
            if ip not in seen:
                seen.add(ip)
                outfile.write(ip + "\n")  # works on Python 2 and 3

Extracting IP addresses from a file

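I answered a similar question. In short, the solution is based on a project I'm working on for extracting network- and host-based indicators from different types of input data (e.g. strings, files, blog posts, etc.).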

I would import the IPAddresses and Data classes, then use them to accomplish the task as follows:

#!/usr/bin/env python

"""Extract IPv4 Addresses From Input File."""

from Data import CleanData  # Format and Clean the Input Data.
from IPAddresses import ExtractIPs  # Extract IPs From Input Data.


def get_ip_addresses(input_file_path):
    """
    Read contents of input file and extract IPv4 Addresses.
    :param input_file_path: fully qualified path to input file. Expecting str
    :returns: dictionary of IPv4 and IPv4-like Address lists
    :rtype: dict
    """

    input_data = []  # Empty list to house formatted input data.

    input_data.extend(CleanData(input_file_path).to_list())

    results = ExtractIPs(input_data).get_ipv4_results()

    return results

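Now that you have a dictionary of lists, you can easily access the data you need and output it however you like. The example below uses the function above, prints the results to the console, and writes them to the specified output file: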

# Extract the desired data using the aforementioned function.
ipv4_list = get_ip_addresses('/path/to/input/file')

# Open your output file in 'append' mode.
with open('/path/to/output/file', 'a') as outfile:

    # Ensure that the list of valid IPv4 Addresses is not empty.
    if ipv4_list['valid_ips']:

        for ip_address in ipv4_list['valid_ips']:

            # Print to console
            print(ip_address)

            # Write to output file, one address per line.
            outfile.write(ip_address + '\n')

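You are looking for the canonical form of IPv4 addresses. Note that there are other acceptable forms, even for IPv4 addresses. For example, if you are running an HTTP server on localhost port 80, try reaching it by its integer form (2130706433 == 0x7f000001 == 127.0.0.1). Of course, if you control the format of the file, you don't have to worry about these things. However, if you can feasibly support IPv6, it will make your script more robust.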
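To make the equivalence in parentheses concrete, here is a short sketch assuming Python 3's standard-library ipaddress module, which accepts the integer form and also parses IPv6:

```python
import ipaddress

# The same loopback address written as a decimal integer and as hex.
from_decimal = ipaddress.ip_address(2130706433)
from_hex = ipaddress.ip_address(0x7f000001)

print(from_decimal)              # canonical dotted-quad form
print(from_decimal == from_hex)

# ip_address() also handles IPv6 text and reports which version it parsed.
v6 = ipaddress.ip_address("::1")
print(v6.version)
```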

re.findall() always returns a list. It is never None.
