Python 2.7: How do I extract strings in Python and write them to a text file as multiple lines?

python-2.7

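Python newbie here.

I am trying to get the most active IP address from a log.txt file and write it to another text file. My first step is to collect all the IP addresses. The second is to sort them by how often they appear. But I am stuck on the first step, which is: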

with open('./log_input/log.txt', 'r+') as f:
    # loop over the lines in the text file
    for line in f:
        # split line at whitespace
        cols = line.split()

        # get last column
        byte_size = cols[-1]

        # get the first column [0]
        ip_addresses = cols[0]

        # remove brackets
        byte_size = byte_size.strip('[]')

        # write the byte size in the resource file
        resource_file = open('./log_output/resources.txt', 'a')
        resource_file.write(byte_size + '\n')
        resource_file.truncate()
        # write the ip addresses in the host file
        host_file = open('./log_output/hosts.txt', 'a')
        host_file.seek(0)
        host_file.write(ip_addresses + '\n')
        host_file.truncate()

    resource_file.close()
    host_file.close()

The problem is that in the new hosts.txt file it writes the IP addresses again instead of overwriting them. I have also tried:

    resource_file = open('./log_output/resources.txt', 'w')
    host_file = open('./log_output/hosts.txt', 'w')

and 'w+' and so on, but 'w' or 'w+' leaves only a single IP address in the hosts file.

Can anyone point me in the right direction?

Sample input file

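collections.Counter is a handy counting tool. Feed it a sequence of text strings and it builds a dict that maps each string to the number of times it appeared. Counting the IP addresses is now easy: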

>>> import collections
>>> with open('log.txt') as fp:
...     counter = collections.Counter(line.split(' ', 1)[0].lower() for line in fp)
... 
>>> counter
Counter({'isdn6-34.dnai.com': 2, 'ix-ftw-tx1-24.ix.netcom.com': 1, 'www-c2.proxy.aol.com': 1})
>>> counter.most_common(1)
[('isdn6-34.dnai.com', 2)]
>>>
>>>
>>> with open('most_common.txt', 'w') as fp:
...     fp.write(counter.most_common(1)[0][0])
... 
17
>>> open('most_common.txt').read()
'isdn6-34.dnai.com'
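
Since the question asks for several lines in the output file, the same Counter idea can be extended to write the top few hosts, one per line. This is only a sketch: the helper names top_hosts and write_hosts, the 'host count' output layout, and the sample lines (modelled on the log layout in this thread, with spaces between columns assumed) are illustrative and not part of the original answer.

```python
import collections
import os
import tempfile


def top_hosts(lines, n=3):
    """Return the n most frequent first columns as (host, count) pairs."""
    counter = collections.Counter(
        line.split(None, 1)[0].lower()
        for line in lines
        if line.strip()  # skip blank lines, which have no first column
    )
    return counter.most_common(n)


def write_hosts(pairs, path):
    """Write one 'host count' entry per line; 'w' truncates the file once."""
    with open(path, 'w') as out:
        for host, count in pairs:
            out.write('%s %d\n' % (host, count))


# Made-up sample lines in the same layout as the log in the question.
sample = [
    'www-c2.proxy.aol.com - - [01/Jul/1995:00:03:52 -0400] "GET /a.html HTTP/1.0" 200 1659\n',
    'isdn6-34.dnai.com - - [01/Jul/1995:00:03:52 -0400] "GET /b.gif HTTP/1.0" 200 2537\n',
    'isdn6-34.dnai.com - - [01/Jul/1995:00:03:52 -0400] "GET /c.gif HTTP/1.0" 200 3635\n',
]

ranked = top_hosts(sample, n=2)

# Write to a temporary directory so the sketch runs anywhere.
out_path = os.path.join(tempfile.mkdtemp(), 'most_common.txt')
write_hosts(ranked, out_path)
```

With a real log, passing the open file object as lines should work the same way, because Counter only needs an iterable of strings.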

Thanks for the help and suggestions. This solved my problem:

with open('./log_input/log.txt', 'r+') as f:
    # text accumulated while looping over the lines in the log file
    new_ip_addresses = ""
    new_byte_sizes = ""
    new_time_stamp = ""  # timestamps are not extracted yet
    resource_file = open('./log_output/resources.txt', 'w')
    host_file = open('./log_output/hosts.txt', 'w')
    hours_file = open('./log_output/hours.txt', 'w')

    for line in f:
        # split line at whitespace
        cols = line.split()
        if not cols:
            continue  # skip blank lines

        # get the byte size from the last column and remove brackets
        byte_size = cols[-1].strip('[]')
        new_byte_sizes += byte_size + '\n'

        # get ip/host from the first column
        ip_addresses = cols[0]
        new_ip_addresses += ip_addresses + '\n'

    # write the byte sizes in the resource file
    print(new_byte_sizes)
    resource_file.write(new_byte_sizes)
    resource_file.close()

    # write the ip addresses in the host file (once; writing again
    # after close() would raise ValueError)
    print(new_ip_addresses)
    host_file.write(new_ip_addresses)
    host_file.close()

    hours_file.close()

Basically, I solved the problem by accumulating the values in a variable inside the for loop and adding a newline each time:

new_ip_addresses += ip_addresses + '\n'
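
Repeated += on a string works here. A common alternative is to collect the values in a list and join them once at the end. The sketch below assumes whitespace-separated columns; the function name first_and_last_columns and the sample lines are made up for illustration.

```python
def first_and_last_columns(lines):
    """Return (hosts, byte_sizes) lists taken from each non-blank line."""
    hosts = []
    byte_sizes = []
    for line in lines:
        cols = line.split()
        if not cols:
            continue  # a blank line has no columns to read
        hosts.append(cols[0])        # first column: ip/host
        byte_sizes.append(cols[-1])  # last column: byte size
    return hosts, byte_sizes


# Made-up sample lines in the same layout as the log in the question.
sample = [
    'host-a.example.com - - [t] "GET /x HTTP/1.0" 200 1659\n',
    '\n',
    'host-b.example.com - - [t] "GET /y HTTP/1.0" 200 2537\n',
]

hosts, byte_sizes = first_and_last_columns(sample)

# Join once; every entry ends up on its own line.
hosts_text = '\n'.join(hosts) + '\n'
```

The resulting hosts_text can then be passed to a single write() call on a file opened with 'w'.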

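I would first suggest opening the resource file only once: resource_file = open('./log_output/resources.txt', 'a') should be opened before the for loop starts. The same goes for host_file.

Can you post a few sample lines of the input file so we can test?

"It writes the IP addresses again instead of overwriting them"... I don't know what that means. What do you want written in that file? All addresses with duplicates, or all addresses without duplicates?

One problem is that you write to and truncate the file but never close it. So the next host_file = open('./log_output/hosts.txt', 'a') opens a stale version of the file, and then, when host_file is reassigned, the data from the previous loop iteration is flushed to the file. Close the file after use or put it in a with clause.

www-c2.proxy.aol.com - - [01/Jul/1995:00:03:52 -0400] "GET /history/skylab/skylab-1.html HTTP/1.0" 200 1659
isdn6-34.dnai.com - - [01/Jul/1995:00:03:52 -0400] "GET /images/kscmap-tiny.gif HTTP/1.0" 200 2537
isdn6-34.dnai.com - - [01/Jul/1995:00:03 -0400] "GET /images/ksclogosmall.gif HTTP/1.0" 200 3635
ix-ftw-tx1-24.ix.netcom.com - - [01/Jul/1995:00:03:52 -0400] "GET /shutter/countdown/count.gif HTTP/1.0" 200 40310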
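
The advice in the comments above (open each output file once, and let a with clause close it) might look roughly like the sketch below. It is one possible arrangement, not the commenters' own code: the function name split_log, the temporary paths, and the two log lines are made up for illustration.

```python
import os
import tempfile


def split_log(log_path, hosts_path, sizes_path):
    """Copy the first and last column of each log line to two files."""
    # Each output file is opened exactly once, so 'w' truncates it a
    # single time; the with clause closes all three files on exit.
    with open(log_path) as log, \
            open(hosts_path, 'w') as hosts, \
            open(sizes_path, 'w') as sizes:
        for line in log:
            cols = line.split()
            if not cols:
                continue  # skip blank lines
            hosts.write(cols[0] + '\n')
            sizes.write(cols[-1] + '\n')


# Build a tiny made-up log in a temporary directory so the sketch runs anywhere.
tmp_dir = tempfile.mkdtemp()
log_path = os.path.join(tmp_dir, 'log.txt')
hosts_path = os.path.join(tmp_dir, 'hosts.txt')
sizes_path = os.path.join(tmp_dir, 'resources.txt')

with open(log_path, 'w') as fp:
    fp.write('host-a.example.com - - [t] "GET /x HTTP/1.0" 200 1659\n')
    fp.write('host-b.example.com - - [t] "GET /y HTTP/1.0" 200 2537\n')

split_log(log_path, hosts_path, sizes_path)
```

Opening with 'w' inside the loop is what left only one address in the file: every iteration truncated what the previous one had written.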