Python: read the first four lines from readlines()


How can I read only the first four lines from readlines()? My script receives the following on STDIN from a proxy:

GET http://www.yum.com/ HTTP/1.1
Host: www.yum.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:9.0.1) Gecko/20100101 Firefox/9.0.1
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-gb,en;q=0.5
Accept-Encoding: gzip, deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Proxy-Connection: keep-alive

I read it with sys.stdin.readlines() and log it to a file, but I only want to log the GET and User-Agent lines to the file.

while True:
    line = sys.stdin.readlines()
    for l in line:
        log = open('/tmp/redirect.log', 'a')
        log.write(l)
        log.close()

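Assuming your input always starts with the four lines you want, this should work:

import sys

log = open('/tmp/redirect.log', 'a')
for l in sys.stdin.readlines()[:4]:
    log.write(l)
log.close()

Otherwise, you need to parse the input, possibly with a regex (there is another answer covering that).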
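readlines() has to read the whole input before the slice is taken, which may matter if the proxy sends a lot of data or keeps the stream open. Below is a minimal sketch of an alternative, assuming itertools.islice is acceptable here. The names first_lines and sample are hypothetical, and io.StringIO stands in for sys.stdin so the snippet runs on its own.

```python
import io
from itertools import islice


def first_lines(stream, count=4):
    """Return at most `count` lines from `stream`, leaving the rest unread."""
    return list(islice(stream, count))


# Stand-in for sys.stdin (assumption: the real input looks like the request above).
sample = io.StringIO(
    "GET http://www.yum.com/ HTTP/1.1\n"
    "Host: www.yum.com\n"
    "User-Agent: Mozilla/5.0\n"
    "Accept: text/html\n"
    "Accept-Language: en-gb,en;q=0.5\n"
)
head = first_lines(sample)
```

In the script itself you would pass sys.stdin instead of sample and write the result with log.writelines(head).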


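You can check the content of each line before writing it to the log:

import sys

while True:
    lines = sys.stdin.readlines()
    for line in lines:
        if line.startswith('GET') or line.startswith('User-Agent:'):
            log = open('/tmp/redirect.log', 'a')
            log.write(line)  # write the matching line
            log.close()

For more complex checks, you can also use regular expressions.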
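As a rough illustration of the regular-expression route, here is a small sketch. The pattern WANTED and the helper wanted_lines are hypothetical names chosen for this example, and the sample list stands in for the lines read from STDIN.

```python
import re

# Hypothetical pattern: the GET request line or the User-Agent header.
WANTED = re.compile(r"^(GET\s|User-Agent:)")


def wanted_lines(lines):
    """Keep only the lines that match WANTED at their start."""
    return [line for line in lines if WANTED.match(line)]


# Stand-in for the lines read from sys.stdin.
sample = [
    "GET http://www.yum.com/ HTTP/1.1\n",
    "Host: www.yum.com\n",
    "User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:9.0.1) Gecko/20100101 Firefox/9.0.1\n",
    "Accept-Language: en-gb,en;q=0.5\n",
]
kept = wanted_lines(sample)
```

A compiled pattern only pays off once the checks go beyond fixed prefixes; for two literal prefixes, startswith is simpler.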


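Using with together with open() ensures the log file is closed properly. You can iterate over sys.stdin like any other file-like object in Python, which is faster because it does not need to build a list first.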

with open('/tmp/redirect.log', 'a') as log:
    while True: #If you need to continuously check for more.
        for line in sys.stdin:
            if line.startswith(("GET", "User-Agent")):
                log.write(line)

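The following approach is more efficient because it does not run the same checks again and again: it only checks for the lines that are still needed. That is probably unnecessary for this case, but it can pay off when you have more items to check for and much more input to sort through. It also means you keep track of which parts you already have and do not read beyond what you need, which can be valuable if reading is an expensive operation.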

import sys

with open('/tmp/redirect.log', 'a') as log:
    while True:  # If you need to continuously check for more.
        needed = {"GET", "User-Agent"}
        for line in sys.stdin:
            for item in needed:
                if line.startswith(item):
                    log.write(line)
                    # Remove only the prefix that matched; safe because we break right away.
                    needed.remove(item)
                    break
            if not needed:  # The set is empty, we have found all the lines we need.
                break

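Sets are unordered, but we can assume the lines arrive in order, so they are logged in order.

This kind of setup may also be needed for more complex line checks (for example, with regular expressions). In your case, however, the first example is concise and should work well.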
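The "stop once everything has been found" idea can be sketched as a small function that works on any iterable of lines. The names collect_needed and sample are hypothetical, and io.StringIO stands in for sys.stdin so the snippet can be run without a proxy.

```python
import io


def collect_needed(stream, prefixes):
    """Return the first line for each prefix, stopping once all are found."""
    needed = set(prefixes)
    found = []
    for line in stream:
        # Iterate over a copy so the set can be changed safely.
        for prefix in list(needed):
            if line.startswith(prefix):
                found.append(line)
                needed.remove(prefix)
                break
        if not needed:  # Nothing left to look for, so stop reading.
            break
    return found


# Stand-in for sys.stdin.
sample = io.StringIO(
    "GET http://www.yum.com/ HTTP/1.1\n"
    "Host: www.yum.com\n"
    "User-Agent: Mozilla/5.0\n"
    "Accept: text/html\n"
)
found = collect_needed(sample, ("GET", "User-Agent"))
```

Because the loop stops as soon as both prefixes have been seen, the remaining lines of the stream are left unread.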


>>> lines
['GET http://www.yum.com/ HTTP/1.1',
 'Host: www.yum.com',
 'User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:9.0.1) Gecko/20100101 Firefox/9.0.1',
 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
 'Accept-Language: en-gb,en;q=0.5',
 'Accept-Encoding: gzip, deflate',
 'Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7',
 'Proxy-Connection: keep-alive']
>>> patterns = ["GET", "User-Agent"]
>>> for line in lines:
...     for pattern in patterns:
...         if line.startswith(pattern):
...             with open("/tmp/redirect.log", "a") as f:
...                 f.write(line)
...             break


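with should be used inside the if statement; otherwise, if the list of lines is long, the file handle stays open for a long time. break is used because each line matches only one pattern, so once a line has matched a pattern there is no need to check the other patterns in the list.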


Use 4 spaces per indentation level. Why do you open, write to, and close the file for every line read? That seems silly.

I'm just testing the output, so I check it by logging to a file.

Why open and close the log for every line appended?

I just copied his code; the answer is readlines()[:4]. I agree it is not needed at all. Editing.

It gave me this error: if line.startswith('GET'): AttributeError: 'list' object has no attribute 'startswith'

@kridigitx, that was because of a problem with the variable names. I have updated my answer with better names.

Thanks jcollado... I also used part of your solution... cheers

Hi lattyware, this is a nice solution... but I have another function that operates on this line, and it looks like it doesn't break out of the loop properly.

@Lattyware Good to know that startswith also accepts a tuple of strings.