Python regex: how to ignore the table of contents when grabbing text between two chapters?
I want to grab the contents of Chapter 1, but my pattern keeps picking up the table of contents. I have tried Chapter 1((.*\n)*)Chapter 2, but it starts matching at the table of contents. If I delete the table of contents by hand, it works fine.
Full text:
1. Chapter 1
2. Chapter 2
3. Chapter 3
Chapter 1
Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum\
Chapter 2
Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum
If you are using a regex flavor that supports lookbehinds, you can use a negative lookbehind for \d\. (a digit, a literal dot, and a space) before Chapter 1 to avoid matching the Chapter 1 entry in the table of contents. By putting Chapter 1 in a positive lookbehind and Chapter 2 in a positive lookahead, you make the desired text the entire match:
(?<=(?<!\d\. )Chapter 1).*(?=Chapter 2)
Note that you need the s flag (re.DOTALL in Python) so that . matches newlines.
What flavor of regex? JS, PHP, Python, etc.?
@Nick it's Python
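Since the asker is on Python, here is a minimal sketch of the lookbehind approach from the answer using the standard re module. The sample text is a shortened, hypothetical stand-in for the full text in the question, and the dot after \d is escaped so that it matches only a literal period.

```python
import re

# Shortened stand-in for the question's sample text (chapter bodies are hypothetical).
text = (
    "1. Chapter 1\n"
    "2. Chapter 2\n"
    "3. Chapter 3\n"
    "Chapter 1\n"
    "Body of the first chapter.\n"
    "Chapter 2\n"
    "Body of the second chapter.\n"
)

# (?<!\d\. )  rejects a "Chapter 1" preceded by "<digit>. " (a contents entry)
# (?<=...)    keeps the "Chapter 1" heading itself out of the match
# (?=...)     stops before "Chapter 2" without consuming it
# re.DOTALL   lets . match newlines so the match can span several lines
pattern = re.compile(r"(?<=(?<!\d\. )Chapter 1).*(?=Chapter 2)", re.DOTALL)

match = pattern.search(text)
chapter_one = match.group(0).strip() if match else None
print(chapter_one)
```

If the string Chapter 2 can appear again later in the document, the greedy .* will run to its last occurrence. Switching to the lazy .*? stops at the first one instead.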