
R: How do I use two conditions in the duplicated function?


I have a data.frame with 10 columns that contains this data:

ID  | sequence | modification| ... | nºproject
DAT | atggggg  | NULL        | ... | project 
DAT | atggggg  | 7.UN        | ... | project 
DAT | actgat   | NULL        | ... | project 
DAT | atgtagtt | NULL        | ... | project 
DAT | ttttaaat | 8.UN        | ... | project 
DAT | tatatccc | NULL        | ... | project 
DAT | atagattg | 9.AT        | ... | project 
DAT | atatagag | NULL        | ... | project 
DAT | gggatgac | NULL        | ... | project 
I have been using this code to find the duplicates:

library(data.table)
data_table <- data.table(new_data_frame_PEP$sequence, new_data_frame_PEP$modifications)
colnames(data_table) <- c("sequence","modifications")

data_duplicate <- data_table[sequence %in% data_table[duplicated(data_table$sequence),]$sequence]

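Is there a way to use two conditions in the duplicated function, so that it takes both the "sequence" column and the "modification" column into account?

If new_data_frame_PEP is a data frame and you want to retrieve the rows that have duplicates in sequence, you can use: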

res <- new_data_frame_PEP[duplicated(new_data_frame_PEP$sequence) |
                          duplicated(new_data_frame_PEP$sequence, fromLast=TRUE),]
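
The reason for combining the two calls with | is that duplicated() on its own flags only the second and later occurrences of a value, while fromLast=TRUE flags every occurrence except the last. A minimal sketch on a made-up character vector (the vector x is an illustration, not part of the posted data) shows how the union of the two picks up every member of a repeated group:

```r
# Toy vector used only for illustration; "a" appears three times.
x <- c("a", "b", "a", "c", "a")

# Scanning forward: the first "a" is not flagged, the later ones are.
first_pass <- duplicated(x)

# Scanning backward: the last "a" is not flagged, the earlier ones are.
last_pass <- duplicated(x, fromLast = TRUE)

# The union marks every element whose value occurs more than once.
either <- first_pass | last_pass

x[either]
```

Using only one of the two passes would drop one row from each group of duplicates.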
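To illustrate, we create a dataset that is the one you posted, except that we include only the ID, sequence, modification, and n_project columns. We also duplicated the first row, so that there actually are duplicates in both sequence and modification: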

new_data_frame_PEP <- structure(list(ID = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L), .Label = "DAT ", class = "factor"), sequence = structure(c(4L, 
4L, 4L, 1L, 5L, 8L, 7L, 2L, 3L, 6L), .Label = c(" actgat   ", 
" atagattg ", " atatagag ", " atggggg  ", " atgtagtt ", " gggatgac ", 
" tatatccc ", " ttttaaat "), class = "factor"), modification = structure(c(4L, 
4L, 1L, 4L, 4L, 2L, 4L, 3L, 4L, 4L), .Label = c(" 7.UN  ", " 8.UN  ", 
" 9.AT  ", " NULL  "), class = "factor"), n_project = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = " project ", class = "factor")), .Names = c("ID", 
"sequence", "modification", "n_project"), class = "data.frame", row.names = c(NA, 
-10L))
##    ID   sequence modification n_project
##1  DAT   atggggg         NULL    project 
##2  DAT   atggggg         NULL    project 
##3  DAT   atggggg         7.UN    project 
##4  DAT   actgat          NULL    project 
##5  DAT   atgtagtt        NULL    project 
##6  DAT   ttttaaat        8.UN    project 
##7  DAT   tatatccc        NULL    project 
##8  DAT   atagattg        9.AT    project 
##9  DAT   atatagag        NULL    project 
##10 DAT   gggatgac        NULL    project 
To use both sequence and modification, build a data frame seq.mod that holds only those two columns and apply duplicated to it:

seq.mod <- subset(new_data_frame_PEP, select=c("sequence","modification"))
data_duplicate <- new_data_frame_PEP[duplicated(seq.mod) | duplicated(seq.mod, fromLast=TRUE),]
##   ID   sequence modification n_project
##1 DAT   atggggg         NULL    project 
##2 DAT   atggggg         NULL    project
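
If the same check is needed for different sets of columns, the pattern above can be wrapped in a small helper. This is only a sketch: the function name all_duplicates and the toy data frame are hypothetical and not part of the original answer, and it assumes the key columns are compared exactly as stored (no trimming of whitespace).

```r
# Hypothetical helper: return every row of df whose combination of values
# in the columns named by cols occurs more than once.
all_duplicates <- function(df, cols) {
  keys <- df[, cols, drop = FALSE]
  is_dup <- duplicated(keys) | duplicated(keys, fromLast = TRUE)
  df[is_dup, , drop = FALSE]
}

# Toy data for illustration only, not the asker's real table.
toy <- data.frame(
  sequence     = c("atggggg", "atggggg", "atggggg", "actgat"),
  modification = c("NULL", "NULL", "7.UN", "NULL"),
  stringsAsFactors = FALSE
)

# Two conditions: rows must match on both sequence and modification.
both <- all_duplicates(toy, c("sequence", "modification"))

# One condition: rows only need to match on sequence.
seq_only <- all_duplicates(toy, "sequence")
```

With two columns, only the first two rows of toy are returned; with sequence alone, the third row is included as well, because its modification is no longer part of the comparison.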