Perl: Accessing abstracts from PubMed using Bio::DB::EUtilities


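I am using Bio::DB::EUtilities to query the PubMed database with a given PMID (PubMed ID).

Is there a way to access objects (e.g., the abstract) directly, rather than writing the response to a file and parsing it with XML::Twig or similar?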

If you are looking for an object method like $factory->get_abstract, it does not exist. Using esummary will tell you whether an entry has an abstract. For example:

#!/usr/bin/env perl

use 5.010;
use strict;
use warnings;
use Bio::DB::EUtilities;

my @ids = (23298400);
my $factory = Bio::DB::EUtilities->new(-eutil   => 'esummary',
                                       -email   => 'mymail@foo.bar',
                                       -db      => 'pubmed',
                                       -retmode => 'xml',
                                       -id      => \@ids);

while (my $doc = $factory->next_DocSum) {
    while (my $item = $doc->next_Item('flattened')) {
        if ($item->get_name eq 'HasAbstract') {
            printf("%-20s: %s\n",$item->get_name,$item->get_content) if $item->get_content;
        }
    }
}

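This just prints

HasAbstract         : 1

If you want to get the abstract itself, there are a few options. One is to use efetch to return XML; instead of writing the response to a file, you can store the content with my $xml = $factory->get_Response->content and then look for the "Abstract" node in it: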

#!/usr/bin/env perl

use 5.010;
use utf8;
use strict;
use warnings;
use Bio::DB::EUtilities;
use XML::LibXML;

my @ids = (23298400);
my $factory = Bio::DB::EUtilities->new(-eutil   => 'efetch',
                                       -email   => 'mymail@foo.bar',
                                       -db      => 'pubmed',
                                       -retmode => 'xml',
                                       -id      => \@ids);

my $xml = $factory->get_Response->content;

my $xml_parser = XML::LibXML->new();
my $dom = $xml_parser->parse_string($xml);
my $root = $dom->documentElement();

# Print the text of every child element (e.g. AbstractText) of each Abstract node.
binmode STDOUT, ":utf8";
for my $child ($root->findnodes('//Abstract/*')) {
    say $child->textContent();
}

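This code prints the abstract (it is the same answer I provided previously; I include it here for completeness). Another option is to just use curl in a Bash script, or to use curl in a Perl script and form the query yourself. If you look at the E-utilities guide, you will see that you can set retmode to "text" and rettype to "abstract". In addition, the "Examples" section has a few examples showing how to form a query from a PMID to get only the abstract text.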
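As a rough sketch of forming the query yourself, the script below builds an EFetch URL with rettype=abstract and retmode=text and downloads it. It uses the core HTTP::Tiny module in place of curl, which is a substitution on my part; the helper names efetch_abstract_url and fetch_abstracts are hypothetical, and HTTPS support in HTTP::Tiny assumes IO::Socket::SSL is installed.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use HTTP::Tiny;    # core module since Perl 5.14

# Base address of the NCBI EFetch service.
my $EFETCH = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi';

# Hypothetical helper: build an EFetch URL that asks for plain-text abstracts.
sub efetch_abstract_url {
    my @pmids = @_;
    my %param = (
        db      => 'pubmed',
        id      => join(',', @pmids),    # several PMIDs are comma-separated
        rettype => 'abstract',
        retmode => 'text',
    );
    # Sort the keys so the URL is reproducible from run to run.
    return $EFETCH . '?' . join('&', map { "$_=$param{$_}" } sort keys %param);
}

# Hypothetical helper: download the abstracts and return the raw response body.
sub fetch_abstracts {
    my $url = efetch_abstract_url(@_);
    my $res = HTTP::Tiny->new->get($url);
    die "EFetch request failed: $res->{status} $res->{reason}\n"
        unless $res->{success};
    return $res->{content};
}

# Only contact NCBI when PMIDs are given on the command line.
print fetch_abstracts(@ARGV) if @ARGV;
```

Saved under a name of your choosing, it would be run with one or more PMIDs as arguments, for example 23298400. The response is plain text, so no XML parsing is needed.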

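The BioPerl approach gives you access to more information, but you may have to do some parsing yourself (or read the API). Alternatively, if the abstract is all you are interested in, you can fetch just that, but this approach is more limited because you get only the abstract and none of the other information associated with the publication.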
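To illustrate the kind of parsing the XML route involves, here is a minimal sketch that pulls the PMID, journal, title and abstract out of EFetch-style XML with XML::LibXML. The helper summarize_articles is hypothetical, and the sample record is made up and heavily abbreviated; in practice you would pass in the string returned by $factory->get_Response->content.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: extract a few fields from each PubmedArticle element.
sub summarize_articles {
    my ($xml) = @_;
    my $dom = XML::LibXML->load_xml(string => $xml);
    my @records;
    for my $article ($dom->findnodes('//PubmedArticle')) {
        push @records, {
            pmid    => $article->findvalue('./MedlineCitation/PMID'),
            title   => $article->findvalue('.//ArticleTitle'),
            journal => $article->findvalue('.//Journal/Title'),
            # Structured abstracts have several AbstractText sections.
            abstract => join("\n", map { $_->textContent }
                                   $article->findnodes('.//Abstract/AbstractText')),
        };
    }
    return @records;
}

# Made-up, abbreviated record used only to show the element layout.
my $sample = <<'XML';
<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>00000000</PMID>
      <Article>
        <Journal><Title>Example Journal</Title></Journal>
        <ArticleTitle>An example title</ArticleTitle>
        <Abstract>
          <AbstractText Label="BACKGROUND">First part.</AbstractText>
          <AbstractText Label="RESULTS">Second part.</AbstractText>
        </Abstract>
      </Article>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>
XML

for my $rec (summarize_articles($sample)) {
    print "$rec->{pmid}\t$rec->{journal}\t$rec->{title}\n$rec->{abstract}\n";
}
```

Scoping the XPath expressions to each PubmedArticle keeps the fields of different publications apart when several PMIDs are fetched in one request.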
