How to delete rows from a data.frame without losing attributes


r statistics


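First of all: I have searched on this issue for hours, so if the answer turns out to be trivial, please forgive me.

What I want to do is delete one row (no. 101) from a data.frame. It contains test data and should not appear in my analyses. My problem is: whenever I take a subset of the data.frame, the attributes (especially the comments) are lost.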

str(x)
# x has comments for each variable
x <- x[1:100,]
str(x)
# now x has lost all comments
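
A minimal sketch of the behaviour described above, using base R only: `[` on a plain vector keeps names, dim and dimnames but drops other attributes, so a comment set on a column disappears as soon as rows are selected. The names `v` and `d` are made up for illustration and do not come from the original post.

```r
# Hypothetical toy objects; `v` and `d` are illustration-only names.
v <- structure(1:5, comment = "a note")
comment(v)        # the comment is present on the full vector
comment(v[1:3])   # subsetting with `[` drops it

d <- data.frame(v = 1:5)
comment(d$v) <- "a note"
comment(d[1:3, , drop = FALSE]$v)  # row selection on the data.frame drops it too
```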

If I understand correctly, you have some data in a data.frame, and the columns of the data.frame have comments associated with them. Perhaps something like the following:

set.seed(1)

mydf<-data.frame(aa=rpois(100,4),bb=sample(LETTERS[1:5],
  100,replace=TRUE),
  stringsAsFactors=TRUE) # bb must be a factor; this was the default before R 4.0.0

comment(mydf$aa)<-"Don't drop me!"
comment(mydf$bb)<-"Me either!"

> str(mydf)
'data.frame':   100 obs. of  2 variables:
 $ aa: atomic  3 3 4 7 2 7 7 5 5 1 ...
  ..- attr(*, "comment")= chr "Don't drop me!"
 $ bb: Factor w/ 5 levels "A","B","C","D",..: 4 2 2 5 4 2 1 3 5 3 ...
  ..- attr(*, "comment")= chr "Me either!"

When you subset it, the comments are dropped:

> str(mydf[1:2,]) # comment dropped.
'data.frame':   2 obs. of  2 variables:
 $ aa: num  3 3
 $ bb: Factor w/ 5 levels "A","B","C","D",..: 4 2

To keep the comments, define the function `[.avector` as you did above (from the documentation; the definition is repeated in the "all-in" solution further down), then add the appropriate class attribute to each column of the data.frame (EDIT: to keep the factor levels of bb, add "factor" to the class of bb):

mydf$aa<-structure(mydf$aa, class="avector")
mydf$bb<-structure(mydf$bb, class=c("avector","factor"))

> str(mydf[1:2,])
'data.frame':   2 obs. of  2 variables:
 $ aa:Class 'avector'  atomic [1:2] 3 3
  .. ..- attr(*, "comment")= chr "Don't drop me!"
 $ bb: Factor w/ 5 levels "A","B","C","D",..: 4 2
  ..- attr(*, "comment")= chr "Me either!"

EDIT: If many columns in your data.frame have attributes you want to keep, you can use lapply (edited to include the original column class):

comment(mydf)<-"I'm a data.frame" # assumed here; implied by the output below

mydf2 <- data.frame( lapply( mydf, function(x) {
  structure( x, class = c("avector", class(x) ) )
} ) )

comment(mydf2)<-comment(mydf)

> str(mydf2[1:2,])
'data.frame':   2 obs. of  2 variables:
 $ aa:Classes 'avector', 'numeric'  atomic [1:2] 3 3
  .. ..- attr(*, "comment")= chr "Don't drop me!"
 $ bb: Factor w/ 5 levels "A","B","C","D",..: 4 2
  ..- attr(*, "comment")= chr "Me either!"
 - attr(*, "comment")= chr "I'm a data.frame"

For those looking for an "all-in" solution based on BenBarnes's explanation, here it is. (Please upvote BenBarnes's answer if this works for you.)

# Define the avector-subselection method (from the manual)
as.data.frame.avector <- as.data.frame.vector
`[.avector` <- function(x,i,...) {
  r <- NextMethod("[")
  mostattributes(r) <- attributes(x)
  r
}

# Assign each column in the data.frame the (additional) class avector
# (df stands for your own data.frame)
# Note that this will "lose" the data.frame's attributes, therefore write to a copy
df2 <- data.frame(
  lapply(df, function(x) {
    structure( x, class = c("avector", class(x) ) )
  } )
)

# Finally copy the attribute for the original data.frame if necessary
mostattributes(df2) <- attributes(df)

# Now subselects work without losing attributes :)
df2 <- df2[1:100,]
str(df2)

This is solved by the sticky package. (Full disclosure: I am the package author.) Apply sticky() to your vectors and their attributes are preserved through subset operations. For example:

> library(sticky)
> df <- data.frame( 
+   sticky   = sticky( structure(1:5, comment="sticky attribute") ),
+   nonstick = structure( letters[1:5], comment="non-sticky attribute" )
+ )
> 
> comment(df[1:3, "nonstick"])
NULL
> comment(df[1:3, "sticky"])
[1] "sticky attribute"

This works for any attribute, not just comments.

For details, see the sticky package.

I spent hours trying to figure out how to keep attribute data (specifically variable labels) when subsetting a data frame (dropping columns). The answer was so simple I could hardly believe it: just use the spss.get function from the Hmisc package, and the variable labels are kept no matter how you subset.

You could also set the row to NA: x[101,]

Hi BenBarnes! Thanks for your answer. Given your explanation and the code example, the function from the manual finally makes sense to me! It seems I have to learn a bit about classes in R.

I am trying to use this approach. However, this operation transformColumn

Sorry for the confusion, I think it does work. Hmm..., to some extent; I probably broke something somewhere in my code. I will report back when I have figured it out.

Actually, the error is still there, but it is a different one: Error in storage.mode(unlist(data["Registration Time"]))

@AleksandrBlekh, your comments include code that is not mentioned in the OP or in the answer. More information, including a minimal reproducible example, would therefore get you the best help. Please consider posting a new question.

I have tried this solution, but one problem I ran into is that every subsequent run of the code adds the avector class to the objects again, so I end up with multiple redundant avector class attributes. Also, the i parameter in the selector function definition is unused and could be removed, IMHO.

I use this code in a read/import script and then save the data set, so the code runs only once per data frame.

I see. I have already solved most of the problems mentioned above. But thanks for your answer anyway.

The sticky package provides a similar/alternative implementation. For an example, see my answer elsewhere under this question.

Good to know there is such a package. Do you really have to run sticky() on every variable to make its attributes sticky? No offense, but @BenBarnes's solution also preserves the attributes, and it handles the whole data.frame in one step (which is what I usually need).

I would be happy to add this to the sticky package. See:

@ctbrown, it looks like you resolved that issue. Could you update the solution above to reflect this?
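
One of the comments above notes that re-running the class assignment stacks up redundant avector entries. A minimal sketch of a guard against that, assuming the `[.avector` method from the manual is defined as shown; `add_avector` and `dat` are hypothetical names used only for illustration and are not part of any package:

```r
# `[.avector` and as.data.frame.avector as given in the manual
as.data.frame.avector <- as.data.frame.vector
`[.avector` <- function(x, i, ...) {
  r <- NextMethod("[")
  mostattributes(r) <- attributes(x)
  r
}

# Hypothetical helper: prepend "avector" only if the column does not have it yet
add_avector <- function(x) {
  if (!inherits(x, "avector")) class(x) <- c("avector", class(x))
  x
}

# Hypothetical toy data
dat <- data.frame(aa = c(3, 3, 4, 7))
comment(dat$aa) <- "Don't drop me!"

# dat[] <- ... replaces the columns in place and keeps the data.frame's own attributes
dat[] <- lapply(dat, add_avector)
dat[] <- lapply(dat, add_avector)  # a second run leaves the classes unchanged

class(dat$aa)
comment(dat[1:2, ]$aa)
```

Because the helper checks inherits() first, it can sit in a script that is run repeatedly on the same object. Assigning with `dat[] <-` rather than rebuilding via data.frame() is a design choice of this sketch, not something the original answers prescribe.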