Python: How to read lines from a file that has two line breaks between entries


The file I am trying to read is formatted as follows, with two '\n' characters between each line:

Great tool for healing your life--if you are ready to change your beliefs!<br /><a href="http


Bought this book for a friend. I read it years ago and it is one of those books you keep forever. Love it!


I read this book many years ago and have heard Louise Hay speak a couple of times.  It is a valuable read...
The result I get looks like this:

0  I was very inspired by Louise's Hay approach t...
1  \n You Can Heal Your Life by 
2  \n I had an older version
3  \n I love Louise Hay and
4  \n I thought the book was exellent
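Because the file uses two newlines (\n) as the separator, a \n ends up attached to the start of each line. Is there another way to handle this so that I get output like the following: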

0  I was very inspired by Louise's Hay approach t...
1  You Can Heal Your Life by 
2  I had an older version
3  I love Louise Hay and
4  I thought the book was exellent
Try the .strip() method. It removes any unwanted whitespace characters from the beginning and end of a string.

You can use it like this:

for r in open_review.split('\n\n'):
    documents.append(r.strip())
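
For reference, here is a minimal self-contained sketch of the same idea. It assumes the whole file content is already held in one string (as `open_review` is in the snippet above); the `split_reviews` helper and the sample text are hypothetical names made up for illustration. It also skips chunks that are empty after stripping, which can occur when the text ends with trailing newlines.

```python
def split_reviews(text):
    """Split text on blank-line separators and strip each piece."""
    documents = []
    for chunk in text.split('\n\n'):
        cleaned = chunk.strip()  # removes the leftover '\n' at either end
        if cleaned:              # skip pieces that were only whitespace
            documents.append(cleaned)
    return documents


# Hypothetical sample: entries separated by blank lines.
sample = "First review.\n\n\nSecond review.\n\n\nThird review.\n"
print(split_reviews(sample))
# ['First review.', 'Second review.', 'Third review.']
```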
Use readlines() and then strip() to clean up each line.
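
This suggestion comes without code, so the following is only a sketch of what it might look like. The `read_clean_lines` helper, the UTF-8 encoding, and the temporary demo file are assumptions, not part of the original answer.

```python
import os
import tempfile


def read_clean_lines(path):
    """Read every line with readlines(), then strip and drop blank ones."""
    with open(path, encoding="utf-8") as f:
        raw_lines = f.readlines()  # loads the whole file into a list
    return [line.strip() for line in raw_lines if line.strip()]


# Build a small throwaway file so the example runs on its own.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("First review.\n\n\nSecond review.\n\n\nThird review.\n")
    demo_path = tmp.name

clean = read_clean_lines(demo_path)
os.remove(demo_path)
print(clean)
# ['First review.', 'Second review.', 'Third review.']
```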

This appends every non-empty line:

filename = "..."
lines = []
with open(filename) as f:
    for line in f:
        line = line.strip()
        if line:
            lines.append(line)

>>> lines
['Great tool for healing your life--if you are ready to change your beliefs!<br /><a href="http',
 'Bought this book for a friend. I read it years ago and it is one of those books you keep forever. Love it!',
 'I read this book many years ago and have heard Louise Hay speak a couple of times.  It is a valuable read...']

import pandas as pd
lines = pd.DataFrame(lines, columns=['my_text'])
>>> lines
                                             my_text
0  Great tool for healing your life--if you are r...
1  Bought this book for a friend. I read it years...
2  I read this book many years ago and have heard...
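This answer, although longer, is preferable because its for loop iterates over the file one line at a time instead of reading the whole file first. Calling readlines() reads the entire file into memory, so iterating line by line is the better approach when you do not know how large the file is.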
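
To illustrate the point about memory, here is a small sketch of the line-by-line approach written as a generator, so that only one line is held at a time. The `iter_nonblank_lines` name is hypothetical, and `io.StringIO` stands in for a real file object purely so the example runs without touching disk.

```python
import io


def iter_nonblank_lines(lines):
    """Yield stripped, non-empty lines one at a time."""
    for line in lines:
        line = line.strip()
        if line:
            yield line


# Any iterable of lines works, including an open file object.
fake_file = io.StringIO("First review.\n\n\nSecond review.\n")
for text in iter_nonblank_lines(fake_file):
    print(text)
```

With a real file, the same function can be called as `iter_nonblank_lines(f)` inside a `with open(...) as f:` block.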