
RSQLite parameterized query with a vector as a parameter


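I am new to SQL and its syntax, and I cannot figure out how to pass multiple values (e.g. a vector or a list) to a single parameter of a parameterized query in R using RSQLite.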

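I have a two-table database (myTCGA) built from RNASeq data. The first table (tcga_P) contains the expression values (FPKM) of some genes across samples, while the second (tcgaMeta) contains the metadata for those samples.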

#tcga_P
      FPKM           Sample                 Tissue GeneName
5550 0.0633 TCGA-AB-2803-03A Acute_Myeloid_Leukemia  PLEKHN1
5551 0.2390 TCGA-AB-2805-03A Acute_Myeloid_Leukemia  PLEKHN1
5552 0.0253 TCGA-AB-2806-03A Acute_Myeloid_Leukemia  PLEKHN1
5553 0.0385 TCGA-AB-2807-03A Acute_Myeloid_Leukemia  PLEKHN1
5554 0.0326 TCGA-AB-2808-03A Acute_Myeloid_Leukemia  PLEKHN1
5555 0.2836 TCGA-AB-2810-03A Acute_Myeloid_Leukemia  PLEKHN1

# tcgaMeta (only few columns)
SampleIndex         SampleID        SubjectID Tumor.Type         Sample.Type
  1           0 TCGA-01-0628-11A TCGA-01-0628         OV Solid Tissue Normal
  2           1 TCGA-01-0630-11A TCGA-01-0630         OV Solid Tissue Normal
  3           2 TCGA-01-0631-11A TCGA-01-0631         OV Solid Tissue Normal
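I want to extract expression values from tcga_P only for samples that belong to a specific group (for example, all lung samples). To do so, I wrote the following query: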

library(DBI)
library(RSQLite)
library(data.table)

myGene <- "PLEKHN1"
myTissue <- "lung"
myCancer <- "Lung Adenocarcinoma"
selectedSamples <- dbGetQuery(myTCGA, 
     "SELECT A.*
     FROM tcga_P A 
     WHERE A.GeneName = $gene AND
           A.Sample in (SELECT B.SampleID FROM tcgaMeta B 
                WHERE B.Tissue = $tissue AND 
                      B.`Disease.TCGA.` = $cancer )
     ", params = list(gene = myGene, tissue = myTissue, cancer = myCancer))
 # from long to wide
 selectedSamplesWide <- dcast(selectedSamples,GeneName~Sample, value.var = "FPKM",fun.aggregate = sum)
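I could loop (sapply) over the genes in a vector, running the query for one gene at a time and binding the results together, but I would like to do everything within the SQL call.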

So far I have tried:

WHERE A.GeneName IN ($gene)
WHERE A.GeneName IN (SELECT C.GeneName FROM $gene C)
I also tried converting myGene to a data.frame and treating the genes as a column. Needless to say, nothing worked.


What am I missing? How should I pass the argument to param=list()?

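In SQL, a WHERE clause with the equality operator, =, expects a single value on each side, so it cannot compare a column against two values at once. A WHERE clause with IN, however, accepts multiple values: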

WHERE A.GeneName IN ('PLEKHN1', 'PSMD12', ...)
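As a quick check of how IN behaves with literal values, here is a minimal sketch against an in-memory SQLite database. The toy rows (including the TP53 entry and all FPKM values) are made up for illustration and are not from the real myTCGA database.

```r
library(DBI)
library(RSQLite)

# Hypothetical toy table standing in for tcga_P (values are invented)
con <- dbConnect(SQLite(), ":memory:")
dbWriteTable(con, "tcga_P", data.frame(
  FPKM     = c(0.0633, 0.2390, 1.5000),
  GeneName = c("PLEKHN1", "PSMD12", "TP53"),
  stringsAsFactors = FALSE
))

# IN keeps every row whose GeneName equals any of the listed values
res <- dbGetQuery(con, "SELECT * FROM tcga_P
                        WHERE GeneName IN ('PLEKHN1', 'PSMD12')")
dbDisconnect(con)
```

Hard-coding the values like this works but is not parameterized, which is what the dynamic placeholders are for.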

For an open-ended number of values, consider building the prepared statement dynamically with paste + collapse, and binding the parameter values with setNames and as.list:

myGene <- c("PLEKHN1", "PSMD12")
myTissue <- "lung"
myCancer <- "Lung Adenocarcinoma"

myPlaceHolders <- paste0("$gene", seq_along(myGene))

sql <- paste0("SELECT A.*
               FROM tcga_P A 
               WHERE A.GeneName IN (", paste(myPlaceHolders, collapse=", "), ") 
                 AND A.Sample in (SELECT B.SampleID 
                                  FROM tcgaMeta B 
                                  WHERE B.Tissue = $tissue 
                                    AND B.`Disease.TCGA.` = $cancer)
             ")

myGeneParams <- as.list(setNames(myGene, gsub("\\$", "", myPlaceHolders)))
# One list holding all named parameters: gene1, gene2, ..., tissue, cancer
myParamList <- c(myGeneParams, tissue=myTissue, cancer=myCancer)

selectedSamples <- dbGetQuery(myTCGA, sql, params = myParamList)
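To try this approach end to end without the real database, the following is a minimal sketch using an in-memory SQLite database. The table and column names mirror the question, but the rows are invented, the `Disease.TCGA.` filter is left out for brevity, and the helper name query_genes is hypothetical.

```r
library(DBI)
library(RSQLite)

# Toy in-memory database; all rows are made up for illustration
con <- dbConnect(SQLite(), ":memory:")
dbWriteTable(con, "tcga_P", data.frame(
  FPKM     = c(0.1, 0.2, 0.3, 0.4),
  Sample   = c("S1", "S2", "S1", "S3"),
  GeneName = c("PLEKHN1", "PLEKHN1", "PSMD12", "TP53"),
  stringsAsFactors = FALSE
))
dbWriteTable(con, "tcgaMeta", data.frame(
  SampleID = c("S1", "S2", "S3"),
  Tissue   = c("lung", "breast", "lung"),
  stringsAsFactors = FALSE
))

# Hypothetical helper: one named placeholder per gene, bound by name
query_genes <- function(con, genes, tissue) {
  placeholders <- paste0("$gene", seq_along(genes))
  sql <- paste0(
    "SELECT A.* FROM tcga_P A ",
    "WHERE A.GeneName IN (", paste(placeholders, collapse = ", "), ") ",
    "AND A.Sample IN (SELECT B.SampleID FROM tcgaMeta B ",
    "WHERE B.Tissue = $tissue)"
  )
  # Parameter names are the placeholders without the leading "$"
  gene_params <- as.list(setNames(genes, gsub("\\$", "", placeholders)))
  dbGetQuery(con, sql, params = c(gene_params, list(tissue = tissue)))
}

res <- query_genes(con, c("PLEKHN1", "PSMD12"), "lung")
dbDisconnect(con)
```

Because the placeholder list is generated from the length of the gene vector, the same helper should handle one gene or many without changing the SQL by hand.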

Thanks for the solution and the clear explanation. I had to escape the gsub pattern, gsub("\\$", "", myPlaceHolders), but other than that it worked.

Glad to hear it, and happy to help!
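On the escaping issue raised in the comment: in a regular expression an unescaped $ is an end-of-string anchor, so gsub("$", "", x) leaves x unchanged. One way to sidestep the regex, sketched below, is to build the bare parameter names first and derive the placeholders from them; the variable names paramNames, inClause and sameNames are illustrative only.

```r
myGene <- c("PLEKHN1", "PSMD12")

# Build the bare parameter names first, then derive the "$" placeholders
paramNames     <- paste0("gene", seq_along(myGene))
myPlaceHolders <- paste0("$", paramNames)

# Named list to bind, and the text that goes inside IN (...)
myGeneParams <- as.list(setNames(myGene, paramNames))
inClause     <- paste(myPlaceHolders, collapse = ", ")

# Alternative: strip the literal "$" with fixed = TRUE, no escaping needed
sameNames <- sub("$", "", myPlaceHolders, fixed = TRUE)
```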