Python 2.7: How do I extract strings in Python and write them as multiple lines in a text file?

Tags: python-2.7, ip-address

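Python newbie here.

I am trying to get the most active IP addresses from a log.txt file and print them to another text file. My first step is to get all the IP addresses. The second is to sort them by how often they occur. But I am stuck on the first step, which is: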

with open('./log_input/log.txt', 'r+') as f:
    # loop over the lines in the text file
    for line in f:
        # split line at whitespace
        cols = line.split()

        # get last column
        byte_size = cols[-1]

        # get the first column [0]
        ip_addresses = cols[0]

        # remove brackets
        byte_size = byte_size.strip('[]')

        # write the byte size in the resource file
        resource_file = open('./log_output/resources.txt', 'a')
        resource_file.write(byte_size + '\n')
        resource_file.truncate()
        # write the ip addresses in the host file
        host_file = open('./log_output/hosts.txt', 'a')
        host_file.seek(0)
        host_file.write(ip_addresses + '\n')
        host_file.truncate()

    resource_file.close()
    host_file.close()
The problem is that in the new hosts.txt file it re-prints the IP addresses instead of overwriting them. I also tried:

    resource_file = open('./log_output/resources.txt', 'w')
    host_file = open('./log_output/hosts.txt', 'w')
as well as 'w+' and so on, but 'w' and 'w+' leave only one IP address in the hosts file.

Can anyone point me in the right direction?

Sample input file


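collections.Counter is a handy counting tool. Feed it an iterable of text strings and it builds a dict that maps each string to the number of times it appears. Counting IP addresses is then easy: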

>>> import collections
>>> with open('log.txt') as fp:
...     counter = collections.Counter(line.split(' ', 1)[0].lower() for line in fp)
... 
>>> counter
Counter({'isdn6-34.dnai.com': 2, 'ix-ftw-tx1-24.ix.netcom.com': 1, 'www-c2.proxy.aol.com': 1})
>>> counter.most_common(1)
[('isdn6-34.dnai.com', 2)]
>>>
>>>
>>> with open('most_common.txt', 'w') as fp:
...     fp.write(counter.most_common(1)[0][0])
... 
17
>>> open('most_common.txt').read()
'isdn6-34.dnai.com'
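
In case a script is easier to reuse than the interactive session, here is a minimal sketch of the same Counter idea as two small functions. The names most_active_hosts and write_hosts and the 'host,count' output format are my own assumptions, not something from the question, and the sample lines are modeled on the ones the asker posted in the comments.

```python
import collections

# Sample lines modeled on the log excerpt from the comments (illustrative only).
SAMPLE_LINES = [
    'www-c2.proxy.aol.com - - [01/Jul/1995:00:03:52 -0400] "GET /a.html HTTP/1.0" 200 1659\n',
    'isdn6-34.dnai.com - - [01/Jul/1995:00:03:52 -0400] "GET /b.gif HTTP/1.0" 200 2537\n',
    'isdn6-34.dnai.com - - [01/Jul/1995:00:03:52 -0400] "GET /c.gif HTTP/1.0" 200 3635\n',
    'ix-ftw-tx1-24.ix.netcom.com - - [01/Jul/1995:00:03:52 -0400] "GET /d.gif HTTP/1.0" 200 40310\n',
]


def most_active_hosts(lines, n=1):
    """Return the n most frequent hosts as (host, count) pairs."""
    # The host is the first whitespace-separated field; blank lines are skipped.
    hosts = (line.split(None, 1)[0].lower() for line in lines if line.strip())
    return collections.Counter(hosts).most_common(n)


def write_hosts(pairs, path):
    """Write one 'host,count' line per pair, replacing any old content."""
    with open(path, 'w') as out:
        for host, count in pairs:
            out.write('%s,%d\n' % (host, count))


print(most_active_hosts(SAMPLE_LINES))  # [('isdn6-34.dnai.com', 2)]
```

With a real log you would pass the open file object instead of SAMPLE_LINES, for example most_active_hosts(fp, n=10) inside a with open('log.txt') as fp: block, and hand the result to write_hosts.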

Thanks for the help and suggestions. This solved my problem:

with open('./log_input/log.txt', 'r') as f:

    # accumulate the output in strings, then write each file once
    new_ip_addresses = ""
    new_byte_sizes = ""
    new_time_stamp = ""
    resource_file = open('./log_output/resources.txt', 'w')
    host_file = open('./log_output/hosts.txt', 'w')
    hours_file = open('./log_output/hours.txt', 'w')

    # loop over the lines in the text file
    for line in f:
        # print re.findall("\[(.*?)\]", line)  # ['Hi all', 'this is', 'an example']

        # split line at single spaces
        cols = line.split(' ')

        # get the time stamp times
        # print(cols[4])

        # get byte size from the last column (it keeps the trailing newline)
        byte_size = cols[-1]
        new_byte_sizes += byte_size

        # get ip/host from the first column
        ip_addresses = cols[0]
        new_ip_addresses += ip_addresses + '\n'

        # remove brackets
        byte_size = byte_size.strip('[]')

    # write the byte sizes in the resource file
    print(new_byte_sizes)
    resource_file.write(new_byte_sizes)
    resource_file.close()

    # write the ip addresses in the host file
    print(new_ip_addresses)
    host_file.write(new_ip_addresses)
    host_file.close()

    # nothing is written to the hours file yet; close it so it is not left open
    hours_file.close()
Basically, I solved it by appending each value, plus a newline, to a variable inside the for loop:

new_ip_addresses += ip_addresses + '\n'
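
A variation on the same accumulate-then-write-once idea, in case it helps: collect the values in lists instead of growing a string, and write each file in a single pass. This is only a sketch; split_log and write_lines are hypothetical helper names, and it assumes the host is the first column and the byte size the last, as in the code above.

```python
def split_log(lines):
    """Collect hosts and byte sizes from log lines into two lists."""
    hosts, sizes = [], []
    for line in lines:
        cols = line.split()  # split() with no argument also drops the newline
        if not cols:
            continue  # skip blank lines
        hosts.append(cols[0])
        sizes.append(cols[-1])
    return hosts, sizes


def write_lines(path, items):
    """Write items to path, one per line, replacing any old content."""
    with open(path, 'w') as out:
        out.writelines(item + '\n' for item in items)
```

Usage would look like hosts, sizes = split_log(f) followed by write_lines('./log_output/hosts.txt', hosts) and write_lines('./log_output/resources.txt', sizes). Keeping the hosts in a list also makes the later counting step straightforward.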

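Comments:

I would first suggest opening the resource file only once: resource_file = open('./log_output/resources.txt', 'a') should come before the for loop starts. The same goes for host_file.

Can you post some sample lines from the input file so we can test?

"It re-prints the IP addresses instead of overwriting"... I don't know what that means. What do you want to write to that file? All addresses with duplicates, or all addresses without duplicates?

One problem is that you write to and truncate the file but never close it. So the next host_file = open('./log_output/hosts.txt', 'a') opens a stale version of the file, and when host_file is reassigned, the data from the previous loop iteration is flushed to the file. Close the file after use or put it in a with clause.

www-c2.proxy.aol.com - - [01/Jul/1995:00:03:52 -0400] "GET /history/skylab/skylab-1.html HTTP/1.0" 200 1659
isdn6-34.dnai.com - - [01/Jul/1995:00:03:52 -0400] "GET /images/kscmap-tiny.gif HTTP/1.0" 200 2537
isdn6-34.dnai.com - - [01/Jul/1995:00:03 -0400] "GET /images/ksclogosmall.gif HTTP/1.0" 200 3635
ix-ftw-tx1-24.ix.netcom.com - - [01/Jul/1995:00:03:52 -0400] "GET /shutter/countdown/count.gif HTTP/1.0" 200 40310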
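
To make the "open once, use a with clause" advice from the comments concrete, here is a rough sketch. The function name split_to_files is hypothetical, and it assumes every non-blank line has the host in the first column and the byte size in the last, as in the sample lines above.

```python
def split_to_files(log_path, hosts_path, sizes_path):
    """Copy the first and last column of each log line into two files."""
    # Each output file is opened once, before the loop. Mode 'w' empties it
    # a single time, and the with statement closes all three files on exit.
    with open(log_path) as log, \
         open(hosts_path, 'w') as hosts, \
         open(sizes_path, 'w') as sizes:
        for line in log:
            cols = line.split()
            if not cols:
                continue  # skip blank lines
            hosts.write(cols[0] + '\n')
            sizes.write(cols[-1] + '\n')
```

Calling split_to_files('./log_input/log.txt', './log_output/hosts.txt', './log_output/resources.txt') would then rewrite both output files from scratch on every run. Opening with 'w' inside the loop is what left only one address in the file, because each iteration emptied it again.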