Parsing a file of IP addresses with Python
I have a file containing several IP addresses: about 900 IPs spread over 4 lines of text. I want the output to be one IP per line. How can I do that? Based on other code, I came up with the following, but it fails because there are multiple IPs on a single line:
import sys
import re

try:
    if sys.argv[1:]:
        print "File: %s" % (sys.argv[1])
        logfile = sys.argv[1]
    else:
        logfile = raw_input("Please enter a log file to parse, e.g /var/log/secure: ")
    try:
        file = open(logfile, "r")
        ips = []
        for text in file.readlines():
            text = text.rstrip()
            regex = re.findall(r'(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})$',text)
            if regex is not None and regex not in ips:
                ips.append(regex)
        for ip in ips:
            outfile = open("/tmp/list.txt", "a")
            addy = "".join(ip)
            if addy is not '':
                print "IP: %s" % (addy)
                outfile.write(addy)
                outfile.write("\n")
    finally:
        file.close()
        outfile.close()
except IOError, (errno, strerror):
    print "I/O Error(%s) : %s" % (errno, strerror)
The findall function returns a list of matches, but your code never iterates over the individual matches:
regex = re.findall(r'(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})$',text)
if regex is not None:
    for match in regex:
        if match not in ips:
            ips.append(match)
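To see why appending the result of `re.findall()` directly goes wrong, here is a minimal Python 3 sketch. The sample line and the simplified pattern (written without the `$` anchor so that every address on the line matches) are illustrative assumptions, not the asker's data.

```python
import re

# Simplified pattern without the trailing `$`, so all addresses on a line match.
IP_PATTERN = re.compile(r"\d{1,3}(?:\.\d{1,3}){3}")

line = "10.0.0.1 10.0.0.2 10.0.0.1"  # hypothetical sample line

matches = IP_PATTERN.findall(line)  # a list of strings, one per match

# Appending the whole list nests it; joining it later glues addresses together.
nested = []
nested.append(matches)
glued = "".join(nested[0])

# Iterating over the matches keeps each address as its own entry.
ips = []
for match in matches:
    if match not in ips:
        ips.append(match)

print(glued)
print(ips)
```

The glued string is what the original `"".join(ip)` would produce for a line holding several addresses, which is why the output never ends up with one IP per line.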
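The $ anchor in your expression prevents you from finding anything other than the last entry. Remove it, then use the list returned by re.findall() directly. re.findall() always returns a list, which may be empty.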
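The effect of the anchor can be checked with a short Python 3 sketch. The two-line input string and the names `OCTETS`, `end_only`, `per_line`, and `everything` are made up for illustration.

```python
import re

OCTETS = r"\d{1,3}(?:\.\d{1,3}){3}"
text = "1.1.1.1 2.2.2.2\n3.3.3.3 4.4.4.4"  # hypothetical two-line input

# With `$` and no flags, only an address at the very end of the string matches.
end_only = re.findall(OCTETS + r"$", text)

# With re.MULTILINE, `$` also matches before each newline: one address per line.
per_line = re.findall(OCTETS + r"$", text, flags=re.MULTILINE)

# Without the anchor, every address is found.
everything = re.findall(OCTETS, text)

print(end_only)
print(per_line)
print(everything)
```

Because the question's code strips each line and matches it separately, the anchored pattern behaves like the `per_line` case: only the last address on each line survives.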
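- If you only need unique addresses, use a set instead of a list.
- Without the re.MULTILINE flag, $ matches only at the end of the string.
- To make debugging easier, split the code into several parts that you can test independently:
import re

def extract_ips(data):
    return re.findall(r"\d{1,3}(?:\.\d{1,3}){3}", data)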
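- The regex filters out some valid IP address forms.
- Conversely, the regex matches invalid strings, e.g. 999.999.999.999. You could validate each candidate string instead of trusting the pattern alone. You could even split the input into pieces without searching for IPs at all and use validation functions to keep only the valid ones.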
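The validation idea can be sketched in Python 3 as follows. The function names the answer originally pointed to are not preserved in this text, so the standard-library `ipaddress` module is a substitution chosen here, and `is_valid_ipv4` and `keep_valid_ips` are hypothetical helper names.

```python
import ipaddress


def is_valid_ipv4(candidate):
    """Return True if `candidate` parses as a dotted-quad IPv4 address."""
    try:
        ipaddress.IPv4Address(candidate)
    except ValueError:  # AddressValueError is a subclass of ValueError
        return False
    return True


def keep_valid_ips(text):
    """Split text on whitespace and keep the tokens that are valid IPv4 addresses."""
    return [token for token in text.split() if is_valid_ipv4(token)]


sample = "10.0.0.1 999.999.999.999 foo 192.168.1.20"  # hypothetical input
print(keep_valid_ips(sample))
```

This version assumes the addresses are separated by whitespace. If they are embedded in other text, one option is to run the regex first and pass each match through `is_valid_ipv4`.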
If the input file is small and the order of the addresses does not matter:
with open(filename) as infile, open(outfilename, "w") as outfile:
    outfile.write("\n".join(set(extract_ips(infile.read()))))
Otherwise:
with open(filename) as infile, open(outfilename, "w") as outfile:
    seen = set()
    for line in infile:
        for ip in extract_ips(line):
            if ip not in seen:
                seen.add(ip)
                outfile.write(ip + "\n")  # one address per line
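For readers on Python 3, the order-preserving variant can be tried without touching the filesystem. This is a minimal sketch: `write_unique_ips` is a hypothetical name, and the in-memory `io.StringIO` objects stand in for the real input and output files.

```python
import io
import re

IP_PATTERN = re.compile(r"\d{1,3}(?:\.\d{1,3}){3}")


def write_unique_ips(infile, outfile):
    """Write each distinct address from infile to outfile, one per line, in first-seen order."""
    seen = set()
    for line in infile:
        for ip in IP_PATTERN.findall(line):
            if ip not in seen:
                seen.add(ip)
                outfile.write(ip + "\n")
    return len(seen)


# Hypothetical sample data: several addresses per line, with repeats.
source = io.StringIO("10.0.0.1 10.0.0.2 10.0.0.1\n10.0.0.3 10.0.0.2\n")
target = io.StringIO()
count = write_unique_ips(source, target)
print(target.getvalue(), end="")
```

Any pair of open file objects can be passed in place of the `StringIO` buffers, for example the two handles from a `with open(...)` statement.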
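Extracting IP addresses from a file
I answered a similar question. In short, it is a solution based on a project I am working on for extracting network- and host-based indicators from different kinds of input data (e.g. strings, files, blog posts, etc.).
I would import the IPAddresses and Data classes, then use them to accomplish the task as follows: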
#!/usr/bin/env python
"""Extract IPv4 Addresses From Input File."""
from Data import CleanData  # Format and Clean the Input Data.
from IPAddresses import ExtractIPs  # Extract IPs From Input Data.


def get_ip_addresses(input_file_path):
    """
    Read contents of input file and extract IPv4 Addresses.

    :param input_file_path: fully qualified path to input file. Expecting str
    :returns: dictionary of IPv4 and IPv4-like Address lists
    :rtype: dict
    """
    input_data = []  # Empty list to house formatted input data.
    input_data.extend(CleanData(input_file_path).to_list())
    results = ExtractIPs(input_data).get_ipv4_results()
    return results
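Now that you have a dictionary of lists, you can easily access the data you need and output it however you like. The example below uses the function above, prints the results to the console, and writes them to a specified output file: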
# Extract the desired data using the aforementioned function.
ipv4_list = get_ip_addresses('/path/to/input/file')
# Open your output file in 'append' mode.
with open('/path/to/output/file', 'a') as outfile:
    # Ensure that the list of valid IPv4 Addresses is not empty.
    if ipv4_list['valid_ips']:
        for ip_address in ipv4_list['valid_ips']:
            # Print to console
            print(ip_address)
            # Write to output file, one address per line.
            outfile.write(ip_address + '\n')
re.findall() always returns a list. It is never None.