
Notepad++: How to extract the text between two strings multiple times in a sentence


I need to extract the text that lies between two strings:

</cons> and <cons
The output I want is:

interacts with 
of 
on 
and contributes to
on 
in 
using 
which recognize different 
triggering
delivers signals capable of activating the
which is required for 
or 
could all trigger activation of the 
and
I tried the regular expression

 .*<\/cons>(.*?)<cons.*  and replaced it with $1
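
To see why this pattern yields only one result per line, here is a minimal Python sketch. The sample string is made up for illustration and is not the asker's data. The leading `.*` is greedy, so it swallows everything up to the last `</cons>` that is still followed by `<cons`, and only that final gap survives the replacement.

```python
import re

# Hypothetical one-line sample, shortened from the kind of markup in the question.
line = 'The <cons lex="a">A</cons> interacts with <cons lex="b">B</cons> of <cons lex="c">C</cons>.'

# The pattern from the question; Python writes the back-reference as \1 instead of $1.
greedy_result = re.sub(r'.*<\/cons>(.*?)<cons.*', r'\1', line)

# Only the last gap is left; " interacts with " has been consumed by the greedy prefix.
print(repr(greedy_result))
```

Making the prefix lazy (`.*?`) and dropping the trailing `.*` lets a Replace All visit every gap instead of just the last one.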
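  • In Notepad++, go to Search --> Replace
  • Select "Regular expression" as the search mode
  • In the "Find what" field enter the regular expression "<[^>]+>", in the "Replace with" field enter a space, and click Replace All
  • This replaces every XML tag with a space (you can also put a line break in the "Replace with" field)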

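    It will leave you with the string:

    The CD4 coreceptor interacts with non-polymorphic regions of major histocompatibility complex class II molecules on antigen-presenting cells and contributes to T cell activation. We have investigated the effect of CD4 triggering on T cell activating signals in a lymphoma model using monoclonal antibodies (mAb) which recognize different CD4 epitopes. We demonstrate that CD4 triggering delivers signals capable of activating the NF-AT transcription factor which is required for interleukin-2 gene expression. Whereas different anti-CD4 mAb or HIV-1 gp120 could all trigger activation of the protein tyrosine kinases p56lck and p59fyn and phosphorylation of the Shc adaptor protein, which mediates signals to Ras, they differed significantly in their ability to activate NF-AT. Lack of full activation of NF-AT could be correlated to a dramatically reduced capacity to induce calcium flux and could be complemented with a calcium ionophore. The results identify functionally distinct epitopes on the CD4 coreceptor involved in activation of the Ras/protein kinase C and calcium pathways.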
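
For comparison, the same tag-stripping replacement can be sketched in Python. This assumes the intended pattern is `<[^>]+>` and uses a shortened sample sentence. It replaces with an empty string rather than a space so that no double spaces are left behind. Note that this keeps the text inside the `cons` elements as well, so it produces the plain sentence rather than only the gaps between `</cons>` and `<cons`.

```python
import re

# Shortened sample of the annotated sentence from the question.
xml_text = ('The <cons lex="CD4_coreceptor" sem="G#protein_molecule">CD4 coreceptor</cons> '
            'interacts with <cons lex="T_cell_activation" sem="G#other_name">'
            'T cell activation</cons>.')

# <[^>]+> matches one tag: "<", then one or more characters that are not ">", then ">".
plain_text = re.sub(r'<[^>]+>', '', xml_text)

print(plain_text)
```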


    I hope that helps.

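    Parsing XML with regular expressions is difficult. It is better to use an XML parser. The Python 3 SAX content handler below tracks when a cons end tag has been parsed (self.state=1), whether it is immediately followed by text content (self.state=2), and whether that text is then immediately followed by a cons start element. If so, it prints the content: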

    import xml.sax
    
    data = b'''\
    <abstract>
    <sentence>The <cons lex="CD4_coreceptor" sem="G#protein_molecule">CD4 coreceptor</cons> interacts with <cons lex="non-polymorphic_region" sem="G#protein_domain_or_region">non-polymorphic regions</cons> of <cons lex="major_histocompatibility_complex_class_II_molecule" sem="G#protein_family_or_group">major histocompatibility complex class II molecules</cons> on <cons lex="antigen-presenting_cell" sem="G#cell_type">antigen-presenting cells</cons> and contributes to <cons lex="T_cell_activation" sem="G#other_name">T cell activation</cons>.</sentence>
    <sentence>We have investigated the effect of <cons lex="CD4_triggering" sem="G#other_name"><cons lex="CD4" sem="G#protein_molecule">CD4</cons> triggering</cons> on <cons lex="T_cell_activating_signal" sem="G#other_name">T cell activating signals</cons> in a <cons lex="lymphoma_model" sem="G#other_name">lymphoma model</cons> using <cons lex="monoclonal_antibody" sem="G#protein_family_or_group">monoclonal antibodies</cons> (<cons lex="mAb" sem="G#protein_domain_or_region">mAb</cons>) which recognize different <cons lex="CD4_epitope" sem="G#protein_family_or_group">CD4 epitopes</cons>.</sentence>
    <sentence>We demonstrate that <cons lex="CD4_triggering" sem="G#other_name"><cons lex="CD4" sem="G#protein_molecule">CD4</cons> triggering</cons> delivers signals capable of activating the <cons lex="NF-AT_transcription_factor" sem="G#protein_molecule">NF-AT transcription factor</cons> which is required for <cons lex="interleukin-2_gene_expression" sem="G#other_name"><cons lex="interleukin-2" sem="G#protein_molecule">interleukin-2</cons> gene expression</cons>.</sentence>
    <sentence>Whereas different <cons lex="anti-CD4_mAb" sem="G#protein_family_or_group">anti-CD4 mAb</cons> or <cons lex="HIV-1_gp120" sem="G#protein_molecule"><cons lex="HIV-1" sem="G#virus">HIV-1</cons> gp120</cons> could all trigger activation of the <cons lex="protein_tyrosine_kinase" sem="G#protein_family_or_group">protein tyrosine kinases</cons> <cons lex="p56lck" sem="G#protein_molecule">p56lck</cons> and <cons lex="p59fyn" sem="G#protein_molecule">p59fyn</cons> and phosphorylation of the <cons lex="Shc_adaptor_protein" sem="G#protein_molecule">Shc adaptor protein</cons>, which mediates signals to <cons lex="Ras" sem="G#protein_family_or_group">Ras</cons>, they differed significantly in their ability to activate <cons lex="NF-AT" sem="G#protein_molecule">NF-AT</cons>.</sentence>
    <sentence>Lack of full activation of <cons lex="NF-AT" sem="G#protein_molecule">NF-AT</cons> could be correlated to a dramatically reduced capacity to induce <cons lex="calcium_flux" sem="G#other_name"><cons lex="calcium" sem="G#atom">calcium</cons> flux</cons> and could be complemented with a <cons lex="calcium_ionophore" sem="G#other_organic_compound">calcium ionophore</cons>.</sentence>
    <sentence>The results identify functionally distinct <cons lex="epitope" sem="G#protein_family_or_group">epitopes</cons> on the <cons lex="CD4_coreceptor" sem="G#protein_molecule">CD4 coreceptor</cons> involved in activation of the <cons lex="Ras/protein_kinase_C_and_calcium_pathway" sem="G#other_name"><cons lex="Ras/protein_kinase_C" sem="G#protein_molecule"><cons lex="Ras/protein_kinase_C_pathway" sem="G#other_name"><cons lex="Ras" sem="G#protein_molecule">Ras</cons><cons lex="protein_kinase_C" sem="G#protein_molecule">/protein kinase C</cons></cons></cons> and <cons lex="calcium_pathway" sem="G#other_name">calcium pathways</cons></cons>.</sentence>
     </abstract>'''
    
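    class Handler(xml.sax.ContentHandler):
        # state: 0 = idle, 1 = just saw </cons>, 2 = text seen right after </cons>
    
        def __init__(self):
            # Explicit base-class call so the code runs on both Python 2 and 3.
            xml.sax.ContentHandler.__init__(self)
            self.state = 0
            self.content = ''
    
        def characters(self, content):
            if self.state == 1:
                # First text chunk after </cons>: remember it.
                self.content = content
                self.state = 2
            elif self.state == 2:
                # The parser may deliver one text run in several chunks.
                self.content += content
            else:
                self.state = 0
    
        def startElement(self, name, attr):
            # Text directly between </cons> and <cons ...>: print it.
            if name == 'cons' and self.state == 2:
                print(self.content)
            self.state = 0
    
        def endElement(self, name):
            if name == 'cons':
                self.state = 1
            else:
                self.state = 0
    
    xml.sax.parseString(data, Handler())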
    
    Output:

     interacts with 
     of 
     on 
     and contributes to 
     on 
     in a 
     using 
     (
    ) which recognize different 
     delivers signals capable of activating the 
     which is required for 
     or 
     could all trigger activation of the 
     
     and 
     and phosphorylation of the 
    , which mediates signals to 
    , they differed significantly in their ability to activate 
     could be correlated to a dramatically reduced capacity to induce 
     and could be complemented with a 
     on the 
     involved in activation of the 
     and 
    
    Below is the best I could do with a regular expression in Notepad++. After the last replacement, it handles everything except the trailing text:

     interacts with  of  on  and contributes to  on  in a  using  () which recognize different  delivers signals capable of activating the  which is required for  or  could all trigger activation of the   and  and phosphorylation of the , which mediates signals to , they differed significantly in their ability to activate  could be correlated to a dramatically reduced capacity to induce  and could be complemented with a  on the  involved in activation of the  and  lex="calcium_pathway" sem="G#other_name">calcium pathways</cons></cons>.</sentence>
     </abstract>
    
    
    There is a simple way to extract the data in Notepad++, as mentioned above:

    search .*?</cons>([^<]*?)<cons
    replace \1\r\n
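
The same idea can be sketched in Python with `re.findall`. This is an illustration rather than a translation of the Notepad++ replacement: it uses a lookahead `(?=<cons)` instead of consuming `<cons`, and the sample string is a shortened version of the first sentence of the data.

```python
import re

# Shortened sample based on the first sentence of the abstract.
sample = ('<sentence>The <cons lex="CD4_coreceptor">CD4 coreceptor</cons> interacts with '
          '<cons lex="non-polymorphic_region">non-polymorphic regions</cons> of '
          '<cons lex="antigen-presenting_cell">antigen-presenting cells</cons>.</sentence>')

# Capture text that follows </cons>, contains no "<", and is directly followed by <cons.
gaps = re.findall(r'</cons>([^<]*)(?=<cons)', sample)

for gap in gaps:
    print(gap)
```

Because `[^<]*` can also match nothing, two adjacent elements such as `</cons><cons ...>` produce an empty string, which you may want to filter out.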
    

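    What programming language do you want to use?

    If the problem above can be solved with Notepad++, fine; otherwise I would choose Python.

    Can you first try some Python code and show your attempt?

    I find it hard to solve the problem above in Python in such a raw form. I have to extract the text between </cons> and <cons in Notepad++.

    That will modify the actual file. Do you want to extract the data and copy it somewhere else?

    My regular expression above works fine; its only drawback is that it extracts the text between </cons> and <cons only once.

    Yes, I have to extract this data, and I will use it further to solve my other problem.

    I ran into an error: TypeError: super() takes at least 1 argument (0 given)

    I changed the code so that it works with both Python 2 and Python 3. Tested with Python 2.7.9 and 3.3.5.

    Wow! It works really well. One more thing: if I want to do this with file handling, that is, read the whole file and store the result in a file, what changes do I have to make to the code above?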
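
Regarding the last question about file handling, one possible adaptation is sketched below. It is not from the original answer: the class name `GapHandler`, the function `extract_to_file`, and the file names are assumptions chosen for illustration. The handler collects the gaps in a list instead of printing them, `xml.sax.parse` reads the input file, and the results are written one per line.

```python
import os
import tempfile
import xml.sax


class GapHandler(xml.sax.ContentHandler):
    """Collect text that sits directly between </cons> and the next <cons ...>."""

    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.state = 0      # 0 = idle, 1 = just saw </cons>, 2 = text after </cons>
        self.content = ''
        self.gaps = []

    def characters(self, content):
        if self.state == 1:
            self.content = content
            self.state = 2
        elif self.state == 2:
            self.content += content   # text may arrive in several chunks
        else:
            self.state = 0

    def startElement(self, name, attr):
        if name == 'cons' and self.state == 2:
            self.gaps.append(self.content)
        self.state = 0

    def endElement(self, name):
        self.state = 1 if name == 'cons' else 0


def extract_to_file(in_path, out_path):
    """Parse the XML file at in_path and write one gap per line to out_path."""
    handler = GapHandler()
    xml.sax.parse(in_path, handler)
    with open(out_path, 'w') as out:
        for gap in handler.gaps:
            out.write(gap + '\n')
    return handler.gaps


# Demo with a throwaway input file; replace both paths with your own files.
workdir = tempfile.mkdtemp()
in_path = os.path.join(workdir, 'input.xml')
out_path = os.path.join(workdir, 'output.txt')
with open(in_path, 'w') as f:
    f.write('<abstract><sentence>The <cons lex="CD4_coreceptor">CD4 coreceptor</cons>'
            ' interacts with <cons lex="T_cell_activation">T cell activation</cons>.'
            '</sentence></abstract>')

print(extract_to_file(in_path, out_path))
```

Since this reads the input and writes to a separate output file, the original XML file is left unmodified.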
    