Extract multiple patterns from a text file and save them to a pandas DataFrame [python]

python regex pandas

My text file looks like this:

Description: Text 1 follows <br/> blah blah blah Cause: Cause Text 1 
follows here <br/>Description: Text 2 follows <br/> blah blah 
blah Cause: Cause Text 2 follows here<br/>Description: Text 3 follows <br/> 
blah blah blah Description: Text 4 follows <br/> blah blah 
blah Cause: Cause Text 4 follows<br/>

What I have done so far:

re.findall(r'Description:(.*?)<br/>',textfile)
re.findall(r'Cause:(.*?)<br/>',textfile)

However, this does not let me match each description with its cause when I try to build a larger DataFrame.
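To make the mismatch concrete, here is a small sketch that runs the two patterns above on the sample. It assumes the sample is held as a single-line string named `textfile`, as in the snippets above. Entry 3 has no `Cause:`, so the two result lists come out with different lengths and cannot be lined up by position.

```python
import re

# Sample from the question, assumed here to be one line with no embedded newlines.
textfile = (
    "Description: Text 1 follows <br/> blah blah blah Cause: Cause Text 1 follows here <br/>"
    "Description: Text 2 follows <br/> blah blah blah Cause: Cause Text 2 follows here<br/>"
    "Description: Text 3 follows <br/> blah blah blah "
    "Description: Text 4 follows <br/> blah blah blah Cause: Cause Text 4 follows<br/>"
)

descriptions = re.findall(r"Description:(.*?)<br/>", textfile)
causes = re.findall(r"Cause:(.*?)<br/>", textfile)

# Entry 3 has a description but no cause, so the lists differ in length.
print(len(descriptions), len(causes))
```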

Thanks for any input or guidance. I am very new to Python.

This is what I came up with:

r"Description:(.*?)<br/>(?:(?!Cause)(?!Description).)*(?:Cause:(.*?)<br/>)?"
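For readers new to regex, the same pattern can be written with `re.VERBOSE` so that each part carries a comment. This is only an annotated sketch of the pattern above. The name `PAIR` and the short `sample` string are made up for illustration.

```python
import re

# Same pattern as above, spread over several lines so each part can be commented.
PAIR = re.compile(
    r"""
    Description:(.*?)<br/>            # group 1: description text up to the next <br/>
    (?:(?!Cause)(?!Description).)*    # skip filler, stopping before "Cause" or "Description"
    (?:Cause:(.*?)<br/>)?             # group 2: optional cause text, empty when absent
    """,
    re.VERBOSE,
)

# Hypothetical two-entry sample: the second entry has no cause.
sample = "Description: A <br/> x Cause: B <br/>Description: C <br/> y "
pairs = PAIR.findall(sample)
print(pairs)
```

Because the cause group is optional, every description yields one tuple, and `findall` fills the second slot with an empty string when no cause follows.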

Try this, where textfile holds the contents of the file as a string:

import re
import pandas
data = re.findall(r"Description:(.*?)<br/>(?:(?!Cause)(?!Description).)*(?:Cause:(.*?)<br/>)?", textfile)
df = pandas.DataFrame(data, columns=["Description", "Cause"])
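One caveat worth sketching: if the file really contains line breaks inside entries, as the sample is displayed in the question, `.` will not cross them by default and some causes would be missed. The variant below is a sketch under that assumption. It adds `re.DOTALL`, trims the captured text, and uses `None` for a missing cause. The helper `clean` and the name `rows` are illustrative, not part of the original answer.

```python
import re

# Assumption: the file wraps lines inside entries, as the sample is displayed.
textfile = (
    "Description: Text 1 follows <br/> blah blah blah Cause: Cause Text 1 \n"
    "follows here <br/>Description: Text 2 follows <br/> blah blah \n"
    "blah Cause: Cause Text 2 follows here<br/>Description: Text 3 follows <br/> \n"
    "blah blah blah Description: Text 4 follows <br/> blah blah \n"
    "blah Cause: Cause Text 4 follows<br/>"
)

PATTERN = re.compile(
    r"Description:(.*?)<br/>(?:(?!Cause)(?!Description).)*(?:Cause:(.*?)<br/>)?",
    re.DOTALL,  # let "." match newlines too
)

def clean(text):
    """Collapse runs of whitespace; return None for a missing or empty field."""
    if not text:
        return None
    return " ".join(text.split()) or None

# One dict per description; the cause is None when the entry has none.
rows = [
    {"Description": clean(m.group(1)), "Cause": clean(m.group(2))}
    for m in PATTERN.finditer(textfile)
]
print(rows)
```

Passing `rows` to `pandas.DataFrame(rows)` should then give a `Description` and a `Cause` column, with `None` where no cause was found.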