Python: Efficiently getting IDs from PubMed


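I am currently looking for direct links between citations on PubMed/MEDLINE and clinical trial registries. Specifically, given a PMID, I want to find all the IDs it cites from any clinical trial registry. (For example, the record below has PMID 29738561 and cites NCT02340000.)

At the moment I only search for links to ClinicalTrials.gov (ID format: NCT followed by 8 digits, e.g. NCT01435343), using the following regular expression: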

attributes = {
    'mdTitle': 'High-dose versus standard-dose amoxicillin/clavulanate for clinically-diagnosed acute bacterial sinusitis: A randomized clinical trial.',
    # The abstract is one string, split into adjacent literals per section for readability
    'mdAbstract': (
        'BACKGROUND: The recommended treatment for acute bacterial sinusitis in adults, amoxicillin with clavulanate, provides only modest benefit. '
        'OBJECTIVE: To see if a higher dose of amoxicillin will lead to more rapid improvement. '
        'DESIGN, SETTING, AND PARTICIPANTS: Double-blind randomized trial in which, from November 2014 through February 2017, we enrolled 315 adult outpatients diagnosed with acute sinusitis in accordance with Infectious Disease Society of America guidelines. '
        'INTERVENTIONS: Standard-dose (SD) immediate-release (IR) amoxicillin/clavulanate 875 /125 mg (n = 159) vs. high-dose (HD) (n = 156). The original HD formulation, 2000 mg of extended-release (ER) amoxicillin with 125 mg of IR clavulanate twice a day, became unavailable half way through the study. The IRB then approved a revised protocol after patient 180 to provide 1750 mg of IR amoxicillin twice a day in the HD formulation and to compare Time Period 1 (ER) with Time Period 2 (IR). '
        'MAIN MEASURE: The primary outcome was the percentage in each group reporting a major improvement-defined as a global assessment of sinusitis symptoms as "a lot better" or "no symptoms"-after 3 days of treatment. '
        'KEY RESULTS: Major improvement after 3 days was reported during Period 1 by 38.8% of ER HD versus 37.9% of SD patients (P = 0.91) and during Period 2 by 52.4% of IR HD versus 34.4% of SD patients, an effect size of 18% (95% CI 0.75 to 35%, P = 0.04). No significant differences in efficacy were seen at Day 10. The major side effect, severe diarrhea at Day 3, was reported during Period 1 by 7.4% of HD and 5.7% of SD patients (P = 0.66) and during Period 2 by 15.8% of HD and 4.8% of SD patients (P = 0.048). '
        'CONCLUSIONS: Adults with clinically diagnosed acute bacterial sinusitis were more likely to improve rapidly when treated with IR HD than with SD but not when treated with ER HD. They were also more likely to suffer severe diarrhea. Further study is needed to confirm these findings. '
        'TRIAL REGISTRATION: ClinicalTrials.gov Identifier: NCT02340000.'
    ),
    'mdMesh': '',
    'mdPMID': '29738561',
    'mdPublicationType': ['Journal Article'],
    'mdAuthor': ['Matho A', 'Mulqueen M', 'Tanino M', 'Quidort A', 'Cheung J', 'Pollard J', 'Rodriguez J', 'Swamy S', 'Tayler B', 'Garrison G', 'Ata A', 'Sorum P'],
    'mdDataPublished': '2018',
    'mdPMC': '',
    'mdSI': ['ClinicalTrials.gov/NCT02340000'],
    'mdAID': ['10.1371/journal.pone.0196734 [doi]', 'PONE-D-17-43190 [pii]'],
    'mdDOI': ['10.1371/journal.pone.0196734 [doi]', 'PONE-D-17-43190 [pii]'],
    'mdSO': 'PLoS One. 2018 May 8;13(5):e0196734. doi: 10.1371/journal.pone.0196734. eCollection 2018.',
    'mdLanguage': ['English'],
}

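import re

# Flatten the whole record into one "key='value', ..." string
dictString = ', '.join("{!s}={!r}".format(key, val) for (key, val) in attributes.items())
# Check each whitespace-separated token for a ClinicalTrials.gov ID at its start
for each in dictString.split(' '):
    if re.match(r'(NCT)\d{8}', each):
        # Drop the trailing period, quote, and comma left over from the flattening
        print(each.strip('.\','))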
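However, PubMed/MEDLINE also contains links to other clinical trial registries, and I would like to get those IDs as well. How can I do this more efficiently than by writing 40 more regular expressions?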


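Note: to clarify, I need to identify each ID together with the body that issued it (i.e. ClinicalTrials.gov for NCT01435343, or the Australian New Zealand Clinical Trials Registry for ACTRN12616000470493).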

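I haven't looked at a lot of records to see whether the same pattern holds, but if the text "Trial registration" always appears inside an HTML tag, you could parse the actual HTML document for the tag containing that term and then take the text of the paragraph that follows it. An HTML parsing library makes this relatively straightforward.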
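As a rough illustration of that idea, here is a minimal sketch using only Python's standard-library `html.parser`. The page structure is an assumption: the sample markup, the choice of `<h4>` for the label and `<p>` for the following paragraph, and the names `TrialRegistrationParser` and `find_registration_text` are all hypothetical, so adjust them to whatever the real pages use.

```python
from html.parser import HTMLParser


class TrialRegistrationParser(HTMLParser):
    """Collect the text of the first <p> after an <h4> containing the label.

    The <h4>/<p> layout is an assumption, not PubMed's confirmed markup.
    """

    def __init__(self, label="trial registration"):
        super().__init__()
        self.label = label.lower()
        self.current_tag = None   # tag whose text we are currently inside
        self.label_seen = False   # True once the label heading was found
        self.capturing = False    # True while inside the paragraph we want
        self.captured = []

    def handle_starttag(self, tag, attrs):
        self.current_tag = tag
        if tag == "p" and self.label_seen:
            self.capturing = True

    def handle_endtag(self, tag):
        if tag == "p" and self.capturing:
            # Stop after the first paragraph that follows the label
            self.capturing = False
            self.label_seen = False
        self.current_tag = None

    def handle_data(self, data):
        if self.capturing:
            self.captured.append(data)
        elif self.current_tag == "h4" and self.label in data.lower():
            self.label_seen = True


def find_registration_text(html):
    """Return the paragraph text following the label heading, or ''."""
    parser = TrialRegistrationParser()
    parser.feed(html)
    return "".join(parser.captured).strip()


# Hypothetical markup, made up for this example
sample_html = """
<div class="abstract">
  <h4>Conclusions</h4>
  <p>Further study is needed to confirm these findings.</p>
  <h4>Trial registration</h4>
  <p>ClinicalTrials.gov Identifier: NCT02340000.</p>
</div>
"""

print(find_registration_text(sample_html))
```

The returned text still has to be split into registry name and ID afterwards; this only isolates the paragraph that carries them.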

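But again, you have only shown one example, so I don't know whether it always follows this pattern. From what I can see, the IDs appear to be semicolon-separated, so they should be easy to split.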
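To make the splitting concrete: the record in the question already carries an `mdSI` entry of the form `ClinicalTrials.gov/NCT02340000`, which pairs the registry with the ID. The sketch below assumes every entry follows that `Registry/ID` shape and that multiple entries are separated by semicolons; neither has been checked against more records. The function name `split_registry_ids` is made up for this example, and the `ANZCTR` prefix in the second call is hypothetical.

```python
def split_registry_ids(si_text):
    """Turn 'Registry/ID; Registry/ID' text into (registry, trial_id) pairs."""
    pairs = []
    for entry in si_text.split(";"):
        # Split on the first slash only, so an ID that itself
        # contains slashes would stay in one piece
        registry, sep, trial_id = entry.strip().partition("/")
        if sep and registry and trial_id:
            pairs.append((registry, trial_id))
        # Entries without the assumed 'Registry/ID' shape are skipped
    return pairs


# Value taken from the mdSI field of the record in the question
print(split_registry_ids("ClinicalTrials.gov/NCT02340000"))

# Two entries; the 'ANZCTR' registry label is a hypothetical example
print(split_registry_ids("ClinicalTrials.gov/NCT02340000; ANZCTR/ACTRN12616000470493"))
```

If this shape holds, the registry name comes for free with each ID and no per-registry regular expression is needed.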