R: Selecting a subset of a dataset based on one column


I have a dataset with two columns:

                           text  created
    1                   cant do it with cards either 1/2/2014
    2                   cant do it with cards either 2/2/2014
    3                            Coming back home AK 2/2/2014
    4                            Coming back home AK 5/2/2014
    5                                 gotta try PNNL 1/2/2014
    6 Me and my Tart would love to flyLoveisintheAir 5/2/2014
    7 Me and my Tart would love to flyLoveisintheAir 6/2/2014
How can I get a subset of the dataset that keeps one row for each unique string in the first column, like this?

                           text  created
    1                   cant do it with cards either 1/2/2014
    3                            Coming back home AK 2/2/2014
    5                                 gotta try PNNL 1/2/2014
    6 Me and my Tart would love to flyLoveisintheAir 5/2/2014


structure(list(text = structure(c(1L, 1L, 2L, 2L, 3L, 4L, 4L), .Label = c("cant do it with cards either", 
"Coming back home AK", "gotta try PNNL", "Me and my Tart would love to flyLoveisintheAir"
), class = "factor"), created = structure(c(1L, 2L, 2L, 3L, 1L, 
3L, 4L), .Label = c("1/2/2014", "2/2/2014", "5/2/2014", "6/2/2014"
), class = "factor")), .Names = c("text", "created"), class = "data.frame", row.names =  c(NA, 
-7L))

There are several ways to do this:

# base R: keep the first row for each distinct value of text
tab[!duplicated(tab$text), ]
# with dplyr
library(dplyr)
filter(tab, !duplicated(text))

hth

Try using duplicated(). Assume df is your data.frame:
> df[!duplicated(df$text), ]
                                            text  created
1                   cant do it with cards either 1/2/2014
3                            Coming back home AK 2/2/2014
5                                 gotta try PNNL 1/2/2014
6 Me and my Tart would love to flyLoveisintheAir 5/2/2014
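
A self-contained sketch of the `duplicated()` approach used in both answers may help. It rebuilds the question's sample data with `data.frame()` and character columns (an assumption made for brevity; the original uses factors), and the names `dup`, `first_rows`, and `last_rows` are illustrative only.

```r
# Sample data from the question, rebuilt with data.frame() for brevity
tab <- data.frame(
  text = c("cant do it with cards either", "cant do it with cards either",
           "Coming back home AK", "Coming back home AK",
           "gotta try PNNL",
           "Me and my Tart would love to flyLoveisintheAir",
           "Me and my Tart would love to flyLoveisintheAir"),
  created = c("1/2/2014", "2/2/2014", "2/2/2014", "5/2/2014",
              "1/2/2014", "5/2/2014", "6/2/2014"),
  stringsAsFactors = FALSE
)

# duplicated() flags every repeat of a value after its first occurrence
dup <- duplicated(tab$text)

# Keep the first row for each distinct text (rows 1, 3, 5, 6)
first_rows <- tab[!dup, ]

# fromLast = TRUE scans from the end, so the last row per text is kept instead
last_rows <- tab[!duplicated(tab$text, fromLast = TRUE), ]
```

Because `duplicated()` looks only at the `text` column here, the `created` value that survives is whichever one sits on the retained row.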

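Hi, I got this message: "reached getOption("max.print") -- omitted 7688 rows", and when I check the dataset there are still duplicated strings in the text column.
@user3456230, did you assign the output to an object? The message you got only means that R will not print all of the rows to the console.
No, I did not assign the output. I ran Data_edited_txt[!duplicated(Data_edited_txt$text),]. What should I do now?
Assigning means newData
I tried both methods, but the number of rows is the same as in the original file.
Just save the output in another object: tab2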
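
The point made in the comments can be sketched as follows. The replies are cut off after `newData` and `tab2`, so this assumes they went on to assign the subset to a new name. The small `Data_edited_txt` below is a hypothetical stand-in, since the commenter's real data is not shown.

```r
# Hypothetical stand-in for the commenter's Data_edited_txt
Data_edited_txt <- data.frame(
  text = rep(c("a", "b", "c"), times = c(3, 1, 2)),
  created = "1/2/2014",
  stringsAsFactors = FALSE
)

# Subsetting returns a new data frame; the original is left unchanged,
# so the result has to be assigned to a name to be kept
newData <- Data_edited_txt[!duplicated(Data_edited_txt$text), ]

nrow(Data_edited_txt)        # row count before
nrow(newData)                # row count after
anyDuplicated(newData$text)  # 0 means no repeated text values remain
head(newData)                # inspect a few rows instead of printing everything
```

Checking `nrow()` and `anyDuplicated()` on the assigned object avoids relying on console output, which R truncates once `getOption("max.print")` is reached.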