SPARQL Query Formation


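I have RDF data and want to write a SPARQL query that returns the records matching a specific organism name.

For context, I use RDF4J to generate the RDF records from the available JSON-LD data. I am having trouble retrieving the records that match any particular PropertyValue set. For example: all records whose organism is Equus caballus, or all records whose submission identifier is GSB-7331.

Any help is much appreciated.

The data records are as follows: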

@prefix schema: <http://schema.org/> .
@prefix obo: <http://purl.obolibrary.org/obo/> .
@prefix ebi-bsd: <https://www.ebi.ac.uk/biosamples/> .
@prefix biosamples: <http://identifiers.org/biosample/> .

biosamples:SAMEA104496657 a schema:DataRecord ;
schema:dateCreated "0002-10-15T00:00:00Z"^^schema:Date ;
schema:dateModified "2019-07-23T18:33:14.867Z"^^schema:Date ;
schema:identifier "SAMEA104496657" ;
schema:isPartOf ebi-bsd:samples ;
schema:mainEntity _:b0 .

ebi-bsd:samples a schema:Dataset .

_:b0 a schema:Sample , obo:OBI_0000747 ;
schema:additionalProperty _:b1 , _:b2 , _:b3 , _:b4 ;
schema:description "Blood samples N123" ;
schema:identifier "SAMEA104496657" ;
schema:name "N123" ;
schema:sameAs biosamples:SAMEA104496657 .

_:b1 a schema:PropertyValue ;
schema:name "organism" ;
schema:value "Equus caballus" ;
schema:valueReference obo:NCBITaxon_9796 .

obo:NCBITaxon_9796 a schema:DefinedTerm .

_:b2 a schema:PropertyValue ;
schema:name "submission description" ;
schema:value "ELOAD_294_samples" .

_:b3 a schema:PropertyValue ;
schema:name "submission identifier" ;
schema:value "GSB-7331" .

_:b4 a schema:PropertyValue ;
schema:name "submission title" ;
schema:value "ELOAD_294" .
@prefix schema: <http://schema.org/> .
@prefix obo: <http://purl.obolibrary.org/obo/> .
@prefix ebi-bsd: <https://www.ebi.ac.uk/biosamples/> .
@prefix biosamples: <http://identifiers.org/biosample/> .

biosamples:SAMEA104625758 a schema:DataRecord ;
schema:dateCreated "0014-06-07T00:00:00Z"^^schema:Date ;
schema:dateModified "2019-08-06T17:46:01.812Z"^^schema:Date ;
schema:identifier "SAMEA104625758" ;
schema:isPartOf ebi-bsd:samples ;
schema:mainEntity _:b0 .

ebi-bsd:samples a schema:Dataset .

_:b0 a schema:Sample , obo:OBI_0000747 ;
schema:additionalProperty _:b1 , _:b2 , _:b3 ;
schema:description "Colorectal Cancer Tumor Sequenced Samaple" ;
schema:identifier "SAMEA104625758" ;
schema:name "P-0009062-T01-IM5" ;
schema:sameAs biosamples:SAMEA104625758 ;
schema:subjectOf "http://www.ebi.ac.uk/ena/data/view/SAMEA104625758" .

_:b1 a schema:PropertyValue ;
schema:name "common name" ;
schema:value "Human" ;
schema:valueReference obo:NCBITaxon_9606 .

obo:NCBITaxon_9606 a schema:DefinedTerm .

_:b2 a schema:PropertyValue ;
schema:name "organism" ;
schema:value "Homo sapiens" ;
schema:valueReference obo:NCBITaxon_9606 .

_:b3 a schema:PropertyValue ;
schema:name "scientific name" ;
schema:value "Homo sapiens" ;
schema:valueReference obo:NCBITaxon_9606 .
The code I use to generate the RDF Turtle data is shown below. I download the sample data in JSON-LD from the following location -

import org.apache.commons.io.FileUtils;
import org.eclipse.rdf4j.model.Statement;
import org.eclipse.rdf4j.rio.RDFFormat;
import org.eclipse.rdf4j.rio.RDFHandlerException;
import org.eclipse.rdf4j.rio.RDFParser;
import org.eclipse.rdf4j.rio.Rio;
import org.eclipse.rdf4j.rio.helpers.StatementCollector;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.InputStream;
import java.io.StringWriter;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Collection;
import java.util.Scanner;
import java.util.concurrent.Callable;

public class BioSchemasRdfGenerator implements Callable<Void> {
    private Logger log = LoggerFactory.getLogger(getClass());
    private static File file;
    private static long sampleCount = 0;
    private final URL url;

    public static void setFilePath(String filePath) {
        file = new File(filePath);
    }

    BioSchemasRdfGenerator(final URL url) {
        log.info("HANDLING " + url.toString() + " and the current sample count is: " + ++sampleCount);

        this.url = url;
    }

    @Override
    public Void call() throws Exception {
        requestHTTPAndHandle(this.url);

        return null;
    }

    private static void requestHTTPAndHandle(final URL url) throws Exception {
        final HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        int response;

        try {
            conn.setRequestMethod("GET");
            conn.connect();
            response = conn.getResponseCode();

            if (response == 200) {
                handleSuccessResponses(url);
            }
        } catch (final Exception e) {
            throw new RuntimeException(e);
        } finally {
            conn.disconnect();
        }
    }

    private static void handleSuccessResponses(final URL url) {
        try (Scanner sc = new Scanner(url.openStream())) {
            final StringBuilder sb = new StringBuilder();

            while (sc.hasNext()) {
                sb.append(sc.nextLine());
            }

            try (InputStream in = new ByteArrayInputStream(sb.toString().getBytes(StandardCharsets.UTF_8))) {
                String dataAsRdf = readRdfToString(in);

                write(dataAsRdf);
            } catch (final Exception e) {
                throw new RuntimeException(e);
            }
        } catch (final Exception e) {
            throw new RuntimeException(e);
        }
    }

    private static void write(final String sampleData) throws Exception {
        // Append to the output file as UTF-8; the overload without a charset is deprecated.
        FileUtils.writeStringToFile(file, sampleData, StandardCharsets.UTF_8, true);
    }

    /**
     * @param in a rdf input stream
     * @return a string representation
     */
    private static String readRdfToString(final InputStream in) {
        return graphToString(readRdfToGraph(in));
    }

    /**
     * @param inputStream an Input stream containing rdf data
     * @return a Graph representing the rdf in the input stream
     */
    private static Collection<Statement> readRdfToGraph(final InputStream inputStream) {
        try {
            final RDFParser rdfParser = Rio.createParser(RDFFormat.JSONLD);
            final StatementCollector collector = new StatementCollector();

            rdfParser.setRDFHandler(collector);
            rdfParser.parse(inputStream, "");

            return collector.getStatements();
        } catch (final Exception e) {
            throw new RuntimeException(e);
        }
    }

    /**
     * Transforms a graph to a string.
     *
     * @param myGraph a sesame rdf graph
     * @return a rdf string
     */
    private static String graphToString(final Collection<Statement> myGraph) {
        final StringWriter out = new StringWriter();
        final TurtleWriterCustom turtleWriterCustom = new TurtleWriterCustom(out);

        return modifyIdentifier(writeRdfInTurtleFormat(myGraph, out, turtleWriterCustom));
    }

    private static String modifyIdentifier(String rdfString) {
        if (rdfString != null)
            rdfString = rdfString.replaceAll("biosample:", "");

        return rdfString;
    }

    private static String writeRdfInTurtleFormat(Collection<Statement> myGraph, StringWriter out, TurtleWriterCustom writer) {
        try {
            writer.startRDF();
            handleNamespaces(writer);

            for (Statement st : myGraph) {
                writer.handleStatement(st);
                // The line below is commented out to keep the RDF output short.
                // writer.writeValue(st.getObject(), true);
            }

            writer.endRDF();
        } catch (final RDFHandlerException e) {
            throw new RuntimeException(e);
        }

        return out.getBuffer().toString();
    }

    private static void handleNamespaces(final TurtleWriterCustom writer) {
        writer.handleNamespace("schema", "http://schema.org/");
        writer.handleNamespace("obo", "http://purl.obolibrary.org/obo/");
        writer.handleNamespace("ebi-bsd", "https://www.ebi.ac.uk/biosamples/");
        writer.handleNamespace("biosamples", "http://identifiers.org/biosample/");
    }
}
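
For the record shape shown in the Turtle above (a schema:DataRecord pointing through schema:mainEntity at a sample, which carries schema:additionalProperty nodes with a schema:name and a schema:value), a query along the following lines may be a starting point. This is only a sketch under those assumptions: the class name PropertyValueQuery and its two methods are hypothetical helpers introduced here for illustration, and the match is exact and case-sensitive on both literals.

```java
import java.util.Objects;

/** Hypothetical helper: builds a SPARQL SELECT for records whose sample has a given PropertyValue. */
final class PropertyValueQuery {

    private PropertyValueQuery() {
    }

    /** Escapes backslashes, double quotes and line breaks so the text is safe inside a SPARQL string literal. */
    static String escapeLiteral(String text) {
        Objects.requireNonNull(text, "text");
        return text.replace("\\", "\\\\")
                   .replace("\"", "\\\"")
                   .replace("\n", "\\n")
                   .replace("\r", "\\r");
    }

    /**
     * Returns a query that walks DataRecord -> mainEntity -> additionalProperty
     * and keeps the records where one property node has both the given name and the given value.
     */
    static String build(String propertyName, String propertyValue) {
        return String.join("\n",
            "PREFIX schema: <http://schema.org/>",
            "SELECT DISTINCT ?record ?sample WHERE {",
            "  ?record a schema:DataRecord ;",
            "          schema:mainEntity ?sample .",
            "  ?sample schema:additionalProperty ?property .",
            "  ?property schema:name \"" + escapeLiteral(propertyName) + "\" ;",
            "            schema:value \"" + escapeLiteral(propertyValue) + "\" .",
            "}");
    }

    public static void main(String[] args) {
        // The two lookups mentioned in the question.
        System.out.println(build("organism", "Equus caballus"));
        System.out.println();
        System.out.println(build("submission identifier", "GSB-7331"));
    }
}
```

The generated string can be passed to whatever executes the query, for example RDF4J's RepositoryConnection.prepareTupleQuery. Because the name and value patterns share the same ?property variable, a record matches only when both literals sit on one PropertyValue node. One thing worth checking in the data itself: both records reuse the blank node labels _:b0, _:b1 and so on, and within a single Turtle document the same label denotes the same node, so loading the appended file as one document would likely merge the two samples and make such a query return both records.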