Python program to grab abbreviations and their definitions: trouble when the definition is all lowercase

python regex text

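I have a program that grabs abbreviations (that is, it looks for words enclosed in parentheses) and then, based on the number of characters in the abbreviation, goes back that many words to build the definition. So far it works when all of the preceding words start with a capital letter, or when most of them do. In the latter case it skips lowercase words such as "in" and moves on to the next word. My problem is when the corresponding words are all lowercase.

Current output:

All Awesome Dudes (AAD)
Initiative on Methods, Measurement, and Pain Assessment in Clinical Trials (IMMPACT)
Trials (IMMPACT). Some patient perfer the usual care (UC)

Desired output:

All Awesome Dudes (AAD)
Initiative on Methods, Measurement, and Pain Assessment in Clinical Trials (IMMPACT)
usual care (UC)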

The flag, which is set from the case of the last word before the parenthesis, is there for rare cases such as

All are Awesome Dudes (AAD)

import re

# Attempt from the question: walk backwards and count capitalised words only.
s = """Too many people, but not All Awesome Dudes (AAD) only care about the
Initiative on Methods, Measurement, and Pain Assessment in Clinical
Trials (IMMPACT). Some patient perfer the usual care (UC) approach of
doing nothing"""
allabbre = []

for match in re.finditer(r"\((.*?)\)", s):
    start_index = match.start()
    abbr = match.group(1)
    size = len(abbr)
    words = s[:start_index].split()
    count = 0
    # Stop once as many capitalised words as abbreviation letters have been seen.
    for k, i in enumerate(words[::-1]):
        if i[0].isupper():
            count += 1
        if count == size:
            break
    words = words[-k-1:]
    definition = " ".join(words)
    abbr_keywords = definition + " " + "(" + abbr + ")"
    pattern = '[A-Z]'

    # Only keep parenthesised text that contains an uppercase letter.
    if re.search(pattern, abbr):
        if abbr_keywords not in allabbre:
            allabbre.append(abbr_keywords)
        print(abbr_keywords)

import re

# Answer: match the first letter of each word against the abbreviation, right to left.
s = """Too many people, but not All Awesome Dudes (AAD) only care about the
Initiative on Methods, Measurement, and Pain Assessment in Clinical
Trials (IMMPACT). Some patient perfer the usual care (UC) approach of
doing nothing
"""
allabbre = []

for match in re.finditer(r"\((.*?)\)", s):
    start_index = match.start()
    abbr = match.group(1)
    size = len(abbr)
    words = s[:start_index].split()
    if not words:
        continue  # nothing precedes the parenthesis, so there is no definition
    count = size - 1  # index of the abbreviation letter still to be matched
    # Compare case-sensitively only when the last word is capitalised.
    flag = words[-1][0].isupper()
    for k, i in enumerate(words[::-1]):
        first_letter = i[0] if flag else i[0].upper()
        if first_letter == abbr[count]:
            count -= 1
        if count == -1:
            break
    words = words[-k-1:]
    definition = " ".join(words)
    abbr_keywords = definition + " " + "(" + abbr + ")"
    pattern = '[A-Z]'

    # Only keep parenthesised text that contains an uppercase letter.
    if re.search(pattern, abbr):
        if abbr_keywords not in allabbre:
            allabbre.append(abbr_keywords)
        print(abbr_keywords)
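
To make the effect of the flag easier to check, the same right-to-left matching can be wrapped in a function and run against both the sample text and the rare "All are Awesome Dudes (AAD)" case. This is only a sketch: the name find_abbreviations is hypothetical, and skipping an abbreviation whose letters cannot all be matched is an assumption added here, not something the snippets above do.

```python
import re


def find_abbreviations(text):
    """Return 'definition (ABBR)' strings found in text, without duplicates.

    Hypothetical helper: same right-to-left letter matching as the snippet
    above, wrapped in a function for testing.
    """
    results = []
    for match in re.finditer(r"\((.*?)\)", text):
        abbr = match.group(1)
        words = text[:match.start()].split()
        # Ignore parentheses with nothing before them or with no uppercase letter.
        if not words or not re.search(r"[A-Z]", abbr):
            continue
        # Capitalised last word: compare case-sensitively, so lowercase
        # filler words such as "are" or "in" are stepped over.
        case_sensitive = words[-1][0].isupper()
        remaining = len(abbr) - 1  # index of the next abbreviation letter to match
        taken = 0
        for taken, word in enumerate(reversed(words), start=1):
            first = word[0] if case_sensitive else word[0].upper()
            if first == abbr[remaining]:
                remaining -= 1
            if remaining < 0:
                break
        else:
            # Assumption: if the words run out before every letter is
            # matched, skip the abbreviation instead of returning everything.
            continue
        entry = " ".join(words[-taken:]) + " (" + abbr + ")"
        if entry not in results:
            results.append(entry)
    return results


print(find_abbreviations("Not everyone agrees that All are Awesome Dudes (AAD)"))
print(find_abbreviations("Some patient perfer the usual care (UC) approach"))
```

With a capitalised last word ("Dudes") the comparison is case-sensitive, so "are" is stepped over and the definition still starts at "All". With a lowercase last word ("care") every first letter is upper-cased before comparison, which is what lets "usual care" match "UC".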