Python: modifying an R object with rpy2

I am trying to use the DESeq2 R/Bioconductor package from Python via rpy2.

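Actually, I solved my problem while writing the question (the do_slot method gives access to the attributes of R objects), but I thought the example might be useful to others, so here is how I do it in R and how I translated it to Python: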

In R, I can create a "DESeqDataSet" from two data frames as follows:

counts_data <- read.table("long/path/to/file",
                           header=TRUE, row.names="gene")
head(counts_data)
##       WT_RT_1 WT_RT_2 prg1_RT_1 prg1_RT_2
## aap-1     406     311        41        95
## aat-1       5       8         2         0
## aat-2       1       1         0         0
## aat-3      13      12         0         1
## aat-4       6       6         2         3
## aat-5       3       1         1         0

col_data <- DataFrame(lib = c("WT", "WT", "prg1", "prg1"),
                      treat = c("RT", "RT", "RT", "RT"),
                      rep = c("1", "2", "1", "2"), 
                      row.names = colnames(counts_data))
head(col_data)
## DataFrame with 4 rows and 3 columns
##                   lib       treat         rep
##           <character> <character> <character>
## WT_RT_1            WT          RT           1
## WT_RT_2            WT          RT           2
## prg1_RT_1        prg1          RT           1
## prg1_RT_2        prg1          RT           2
dds <- DESeqDataSetFromMatrix(countData = counts_data,
                              colData = col_data,
                              design = ~ lib)
## Warning message:
## In DESeqDataSet(se, design = design, ignoreRank) :
## some variables in design formula are characters, converting to factors

dds
## class: DESeqDataSet 
## dim: 18541 4 
## metadata(1): version
## assays(1): counts
## rownames(18541): aap-1 aat-1 ... WBGene00255550 WBGene00255553
## rowData names(0):
## colnames(4): WT_RT_1 WT_RT_2 prg1_RT_1 prg1_RT_2
## colData names(3): lib treat rep
Then I can run the analysis:

dds <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing

res <- results(dds)
In IPython (where I actually ran the preceding commands), I can use do_slot to look inside the object and try to work out which factors need to be releveled:

In [229]: tuple(dds.do_slot("colData").slotnames())
Out[229]: ('rownames', 'nrows', 'listData', 'elementType', 'elementMetadata', 'metadata')

In [230]: dds.do_slot("colData").do_slot("listData")
Out[230]: 
R object with classes: ('list',) mapped to:
<ListVector - Python:0x7f2ae2590a08 / R:0x108fcdd0>
[FactorVector, FactorVector, FactorVector]
  lib: <class 'rpy2.robjects.vectors.FactorVector'>
  R object with classes: ('factor',) mapped to:
<FactorVector - Python:0x7f2ae20f1c08 / R:0x136a3920>
[       2,        2,        1,        1]
  rep: <class 'rpy2.robjects.vectors.FactorVector'>
  R object with classes: ('factor',) mapped to:
<FactorVector - Python:0x7f2a9600c948 / R:0x136a30f0>
[       1,        1,        2,        2]
  treat: <class 'rpy2.robjects.vectors.FactorVector'>
  R object with classes: ('factor',) mapped to:
<FactorVector - Python:0x7f2a9600ccc8 / R:0x136a3588>
[       1,        1,        1,        1]
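
The integer vectors printed above are R factor codes: 1-based positions into each factor's list of levels. The sketch below, in plain Python with no rpy2 involved, decodes such codes and mimics what R's relevel() does to them. The level order ("prg1" before "WT") is an assumption based on R's usual alphabetical ordering of levels; the output above does not show it. decode_factor and relevel are illustrative helper names, not rpy2 functions.

```python
def decode_factor(codes, levels):
    """Map 1-based R factor codes back to their level labels."""
    return [levels[code - 1] for code in codes]


def relevel(codes, levels, ref):
    """Mimic R's relevel(): move `ref` to the front and renumber the codes."""
    if ref not in levels:
        raise ValueError(f"{ref!r} is not one of the levels {levels!r}")
    new_levels = [ref] + [lvl for lvl in levels if lvl != ref]
    labels = decode_factor(codes, levels)
    new_codes = [new_levels.index(label) + 1 for label in labels]
    return new_codes, new_levels


# Assumed level order for `lib` (not shown in the session output).
lib_levels = ["prg1", "WT"]
lib_codes = [2, 2, 1, 1]  # the integer codes printed for `lib`

decoded = decode_factor(lib_codes, lib_levels)
new_codes, new_levels = relevel(lib_codes, lib_levels, ref="WT")
print(decoded)
print(new_codes, new_levels)
```

Under that assumption, "prg1" is the first level and would be treated as the reference, which is why the control ("WT") may need to be moved to the front before running the analysis.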
Then I run the analysis part:

In [233]: dds = deseq2.DESeq(dds)
/home/bli/.local/lib/python3.6/site-packages/rpy2/rinterface/__init__.py:186: RRuntimeWarning: estimating size factors

  warnings.warn(x, RRuntimeWarning)
/home/bli/.local/lib/python3.6/site-packages/rpy2/rinterface/__init__.py:186: RRuntimeWarning: estimating dispersions

  warnings.warn(x, RRuntimeWarning)
/home/bli/.local/lib/python3.6/site-packages/rpy2/rinterface/__init__.py:186: RRuntimeWarning: gene-wise dispersion estimates

  warnings.warn(x, RRuntimeWarning)
/home/bli/.local/lib/python3.6/site-packages/rpy2/rinterface/__init__.py:186: RRuntimeWarning: mean-dispersion relationship

  warnings.warn(x, RRuntimeWarning)
/home/bli/.local/lib/python3.6/site-packages/rpy2/rinterface/__init__.py:186: RRuntimeWarning: final dispersion estimates

  warnings.warn(x, RRuntimeWarning)
/home/bli/.local/lib/python3.6/site-packages/rpy2/rinterface/__init__.py:186: RRuntimeWarning: fitting model and testing

  warnings.warn(x, RRuntimeWarning)

In [234]: res = pandas2ri.ri2py(as_df(deseq2.results(dds)))

In [235]: res.index.names = ["gene"]
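
The session above shows R's progress messages arriving in Python as RRuntimeWarning warnings. If that output is unwanted, one option is Python's standard warnings filters. The following is a minimal sketch that runs without R: StandInRWarning and noisy_step are hypothetical stand-ins for rpy2's warning class and for a call such as deseq2.DESeq(dds). In real use you would pass the warning class from your rpy2 installation, whose import path depends on the rpy2 version.

```python
import warnings
from contextlib import contextmanager


class StandInRWarning(RuntimeWarning):
    """Hypothetical stand-in for rpy2's RRuntimeWarning (runs without R)."""


@contextmanager
def silenced(category):
    """Temporarily ignore warnings of `category`; other warnings still show."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=category)
        yield


def noisy_step():
    """Placeholder for a call that emits R progress text as a warning."""
    warnings.warn("estimating size factors", StandInRWarning)
    return "done"


# Record whatever gets through, to check that the filter behaves as intended.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    with silenced(StandInRWarning):
        result = noisy_step()  # message suppressed
    noisy_step()               # message recorded

print(result, len(caught))
```

The filter is restored when the with block ends, so genuine warnings raised elsewhere are not hidden.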

Now, checking the results for a test gene:

In [236]: res.loc["his-10"]
Out[236]: 
baseMean          5.865464e+02
log2FoldChange    3.136174e+00
lfcSE             2.956132e-01
stat              1.060904e+01
pvalue            2.705026e-26
padj              8.787850e-25
Name: his-10, dtype: float64

Python returns the same results as R.
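
A comparison like this is easier to trust when done with a numeric tolerance instead of by eye. Below is a minimal sketch using only the standard library. The py_row values are the ones printed above for his-10; r_row is a placeholder copy (the R-side numbers are not shown in this post), and rows_match is a hypothetical helper name.

```python
import math


def rows_match(py_row, r_row, rel_tol=1e-6):
    """Return True if both rows have the same fields and close values."""
    if py_row.keys() != r_row.keys():
        return False
    return all(
        math.isclose(py_row[key], r_row[key], rel_tol=rel_tol)
        for key in py_row
    )


# Values printed by res.loc["his-10"] in the Python session.
py_row = {
    "baseMean": 5.865464e+02,
    "log2FoldChange": 3.136174e+00,
    "lfcSE": 2.956132e-01,
    "stat": 1.060904e+01,
    "pvalue": 2.705026e-26,
    "padj": 8.787850e-25,
}

# Placeholder: substitute the numbers R prints for the same gene.
r_row = dict(py_row)

# A deliberately altered copy, to show what a mismatch looks like.
altered = dict(py_row, log2FoldChange=3.2)

print(rows_match(py_row, r_row))
print(rows_match(py_row, altered))
```

A relative tolerance suits this data because the p-values are tiny, so an absolute threshold would treat any two of them as equal.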

I found a code example in the rpy2 documentation that helped me solve my problem.

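The attributes of an R object can be accessed with the do_slot method, which takes the attribute name as its argument. See the question for the complete solution.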

Edit:

There is also a do_slot_assign method, which can be used to change the design formula:

>>> dds.do_slot("design").r_repr()
'~lib'
>>> dds.do_slot_assign("design", Formula("~ treat"))
>>> dds.do_slot("design").r_repr()
'~treat'
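
Because do_slot_assign changes the object in place, it can be handy to restore the previous value after trying an alternative design. The sketch below shows one way to do that with a context manager. It runs without R: FakeS4 is a hypothetical stand-in that only copies the do_slot and do_slot_assign method names used above, plain strings stand in for R Formula objects, and temporary_slot is an illustrative helper, not part of rpy2.

```python
from contextlib import contextmanager


class FakeS4:
    """Hypothetical stand-in with the do_slot / do_slot_assign method
    names used in this post, so the sketch runs without R."""

    def __init__(self, **slots):
        self._slots = dict(slots)

    def do_slot(self, name):
        return self._slots[name]

    def do_slot_assign(self, name, value):
        self._slots[name] = value


@contextmanager
def temporary_slot(obj, name, value):
    """Set slot `name` to `value`, then restore the previous value on exit."""
    previous = obj.do_slot(name)
    obj.do_slot_assign(name, value)
    try:
        yield obj
    finally:
        obj.do_slot_assign(name, previous)


dds = FakeS4(design="~lib")  # strings stand in for R Formula objects
with temporary_slot(dds, "design", "~treat"):
    inside = dds.do_slot("design")
after = dds.do_slot("design")
print(inside, after)
```

The try/finally ensures the original slot value is put back even if the code inside the with block raises.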
Comment: the rpy2 documentation has since moved.