A Python function that handles multi-line and/or single-line fields in a file


python file twitter


Given a file, how should I implement a function that can read both single-line and multi-line fields? For example:

TimC
Tim Cxe
USA
http://www.TimTimTim.com
TimTim facebook!
ENDBIO
Charles
Dwight
END
Mcdon
Mcdonald 
Africa
      # website in here is empty, but we still need to consider it
      # bio in here is empty, but we need to include this in the dict
      # bio can be multiple lines
ENDBIO
Moon
King
END
etc

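I was wondering whether anyone could do this using only beginner-level Python keywords (for example, without yield, break, or continue).

In my own version I actually defined 4 functions; 3 of the 4 are helper functions.

I want a function that returns: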

dict = {'TimC': {'name': 'Tim Cxe', 'location': 'USA', 'Web': 'http://www.TimTimTim.com', 'bio': 'TimTim facebook!', 'follows': ['Charles', 'Dwight']},
        'Mcdon': {'name': 'Mcdonald', 'location': 'Africa', 'Web': '', 'bio': '', 'follows': ['Moon', 'King']}}

Iterate over the file, collecting the various pieces of data, and yield them when you reach the appropriate sentinel.

line_meanings = ("name", "location", "web")
result = {}

def readClean(iterable, sentinel=None):
    # Yield stripped lines until a line equal to the sentinel is reached.
    for line in iterable:
        line = line.strip()
        if line == sentinel:
            break
        yield line

# yourfile is an already-open file object. Iterating it directly (instead
# of mixing readline() with iteration) keeps every read on one iterator.
for line in yourfile:
    line = line.strip()
    if not line:
        continue  # skip blank lines between records
    user = result[line] = {}
    # zip() stops after the three fixed fields and leaves the rest unread
    user.update(zip(line_meanings, readClean(yourfile)))
    user['bio'] = list(readClean(yourfile, 'ENDBIO'))
    user['follows'] = set(readClean(yourfile, 'END'))

print(result)

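The code does not account for the fact that the bio can span more than one line before ENDBIO.

@NoSKIO, correct. The question is unclear, but the OP indicates that the bio should be a string, not a list. I chose not to guess. Once the question is updated to show what should happen with a multi-line bio, I can update my answer.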

{'Mcdon': {'bio': [''],
           'follows': set(['King', 'Moon']),
           'location': 'Africa',
           'name': 'Mcdonald',
           'web': ''},
 'TimC': {'bio': ['TimTim facebook!'],
          'follows': set(['Charles', 'Dwight']),
          'location': 'USA',
          'name': 'Tim Cxe',
          'web': 'http://www.TimTimTim.com'}}
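The output above stores the bio as a list and the followed users as a set, while the question asks for a single bio string and a list. As a minimal sketch (the helper name normalize_user is hypothetical, and joining bio lines with a newline is an assumption, since the question does not say how a multi-line bio should be combined), one record could be converted like this:

```python
def normalize_user(user):
    """Return a copy of one record with 'bio' as a string and 'follows' as a list."""
    normalized = dict(user)
    # Assumption: a multi-line bio is joined with newlines.
    normalized['bio'] = '\n'.join(user['bio'])
    # A set has no order, so sort to get a predictable list.
    normalized['follows'] = sorted(user['follows'])
    return normalized


record = {'name': 'Tim Cxe', 'bio': ['TimTim facebook!'],
          'follows': {'Dwight', 'Charles'}}
print(normalize_user(record))
```

Note that sorting does not restore the order in which the names appeared in the file; if that order matters, collecting 'follows' into a list in the first place would be the simpler choice.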

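import pprint
import sys

def bio_gen(it, sentinel="END"):
    def read_line():
        # Drop anything after '#', then strip surrounding whitespace.
        return next(it).partition("#")[0].strip()

    while True:
        try:
            key = read_line()
        except StopIteration:
            return                  # end of input; since Python 3.7 a StopIteration
                                    # escaping a generator becomes a RuntimeError
        ret = {
            'name': read_line(),
            'location': read_line(),
            'website': read_line(),
            'bio': read_line(),     # assumes the bio is exactly one line
            'follows': []}
        next(it)                    # skip the ENDBIO line
        while True:
            line = read_line()
            if line == sentinel:
                yield key, ret
                break
            ret['follows'].append(line)

all_bios = dict(bio_gen(sys.stdin))
pprint.pprint(all_bios)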

{'Mcdon': {'bio': '',
           'follows': ['Moon', 'King'],
           'location': 'Africa',
           'name': 'Mcdonald',
           'website': ''},
 'TimC': {'bio': 'TimTim facebook!',
          'follows': ['Charles', 'Dwight'],
          'location': 'USA',
          'name': 'Tim Cxe',
          'website': 'http://www.TimTimTim.com'}}
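Both answers rely on yield and break, which the question asked to avoid, and neither produces a single string for a multi-line bio. The following is a rough sketch of a version that uses only while loops and index arithmetic. The function names read_until and parse_profiles are hypothetical, and it assumes that the file has already been read into a list of lines, that an empty website or bio is represented by a blank line, that every record is complete, and that bio lines are joined with newlines.

```python
def read_until(lines, start, sentinel):
    """Collect stripped lines from index start up to the sentinel line.

    Returns the collected lines and the index just past the sentinel.
    """
    collected = []
    i = start
    while i < len(lines) and lines[i].strip() != sentinel:
        collected.append(lines[i].strip())
        i = i + 1
    return collected, i + 1


def parse_profiles(lines):
    """Build the nested dictionary from a list of lines."""
    result = {}
    i = 0
    # A record needs at least a username, name, location and website line.
    while i + 3 < len(lines):
        username = lines[i].strip()
        user = {
            'name': lines[i + 1].strip(),
            'location': lines[i + 2].strip(),
            'web': lines[i + 3].strip(),
        }
        # The bio may span zero, one, or many lines before ENDBIO.
        bio_lines, i = read_until(lines, i + 4, 'ENDBIO')
        user['bio'] = '\n'.join(bio_lines)
        # Everything up to END is a followed username.
        user['follows'], i = read_until(lines, i, 'END')
        result[username] = user
    return result
```

It could be called as parse_profiles(open(path).readlines()), where path is whatever file holds the data. The keys use the lower-case names from the first answer rather than the 'Web' spelling in the question.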