Frequency tables by group with weighted data in R
I would like to compute two kinds of frequency tables by group using weighted data. You can generate reproducible data with the following code:
Data <- data.frame(
  country = sample(c("France", "USA", "UK"), 100, replace = TRUE),
  migrant = sample(c("Native", "Foreign-born"), 100, replace = TRUE),
  gender  = sample(c("men", "women"), 100, replace = TRUE),
  wgt     = sample(100),
  year    = sample(2006:2007)
)
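In my real dataset I have 10 years, so applying this code takes a lot of time. Does anyone know a faster way?
I would also like to compute, by country and year, the share of men and women within each migrant status. I am looking for something like: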
Var1          Var2    Var3   y2006  y2007
Foreign born  France  men       52     55
Foreign born  France  women     48     45
Native        France  men       51     52
Native        France  women     49     48
Foreign born  UK      men       60     65
Foreign born  UK      women     40     35
Native        UK      men       48     50
Native        UK      women     52     50
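Does anyone know how I can get these results?

You can do it this way: turn the code you have already written into a function; use lapply to iterate that function over all the years in the data; then use Reduce and merge to collapse the resulting list into a single data frame. Like this: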
# load the packages once, outside the function
library(dplyr)      # filter(), %>%
library(questionr)  # wtd.table(), cprop()

# let's make your code into a function called 'tallyho'
tallyho <- function(yr, data) {
  # keep only the rows for the requested year
  DF <- filter(data, year == yr)
  # weighted two-way table of migrant by country, as column percentages
  result <- with(DF, as.data.frame(cprop(wtd.table(migrant, country, weights = wgt), total = FALSE)))
  # rename the last column by year (use the argument yr, not the column year)
  names(result)[ncol(result)] <- sprintf("y%s", yr)
  result
}

# now iterate that function over all years in your original data set, then
# use Reduce and merge to collapse the resulting list into a data frame
NewData <- lapply(unique(Data$year), function(x) tallyho(x, Data)) %>%
  Reduce(function(...) merge(..., all = TRUE), .)
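To make the Reduce and merge step easier to follow, here is a minimal sketch on three tiny hand-made data frames. The objects r2006, r2007, r2008 and pieces are hypothetical and their values are made up for illustration; they only mimic the shape of what tallyho returns (key columns Var1 and Var2 plus one column per year).

```r
# Three hypothetical per-year results that share the key columns Var1 and Var2
# (illustrative values, not real output).
r2006 <- data.frame(Var1 = c("Foreign-born", "Native"), Var2 = "UK", y2006 = c(60, 40))
r2007 <- data.frame(Var1 = c("Foreign-born", "Native"), Var2 = "UK", y2007 = c(65, 35))
r2008 <- data.frame(Var1 = c("Foreign-born", "Native"), Var2 = "UK", y2008 = c(62, 38))
pieces <- list(r2006, r2007, r2008)

# Reduce applies merge pairwise: merge(merge(r2006, r2007), r2008).
# merge joins on the columns the two inputs have in common (Var1 and Var2);
# all = TRUE keeps rows that are missing from one side, filled with NA.
combined <- Reduce(function(x, y) merge(x, y, all = TRUE), pieces)
print(combined)
```

The result has one row per Var1/Var2 combination and one column per year, which is the layout NewData ends up with.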
TIL about Reduce()
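Thanks a lot for the answer, @ulfeld, but I ran into some trouble. When I run the code, I get exactly the same results for 2006 and 2007, which is not correct. Do you know how I can fix that? And do you know how I could add the information about gender?

Sorry, try the edited version I just posted. I think I gave the function argument the same name as the column, which confused things. Unfortunately, I don't think gender can be added with this approach, because wtd.table only allows two-way cross-tabulations. I don't know enough about what those weights do to suggest an alternative.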
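One possible alternative for the gender breakdown, offered as a sketch and not as the answerer's method: base R's xtabs accepts any number of factors and, with a variable on the left-hand side of the formula, sums that variable within each cell. The sketch assumes wgt can be treated as a simple frequency weight; the function name weighted_gender_share is hypothetical. Because year is one of the table dimensions, all years are handled in a single call with no loop.

```r
# Hypothetical helper: weighted share of each gender within every
# migrant x country x year group, returned with one column per year.
weighted_gender_share <- function(data) {
  # Sum the weights in every migrant x country x gender x year cell.
  tab <- xtabs(wgt ~ migrant + country + gender + year, data = data)
  # Percentages within each migrant/country/year group (margins 1, 2 and 4),
  # so the genders in a group add up to 100. Empty groups give NaN.
  pct <- round(100 * prop.table(tab, margin = c(1, 2, 4)), 1)
  # Long format: one row per cell, with the percentage in column 'pct'.
  long <- as.data.frame(pct, responseName = "pct", stringsAsFactors = FALSE)
  # Wide format: one row per migrant/country/gender, one column per year.
  wide <- reshape(long, idvar = c("migrant", "country", "gender"),
                  timevar = "year", v.names = "pct",
                  direction = "wide", sep = "_")
  names(wide) <- sub("^pct_", "y", names(wide))
  rownames(wide) <- NULL
  wide
}

# Example data in the same shape as in the question.
Data <- data.frame(
  country = sample(c("France", "USA", "UK"), 100, replace = TRUE),
  migrant = sample(c("Native", "Foreign-born"), 100, replace = TRUE),
  gender  = sample(c("men", "women"), 100, replace = TRUE),
  wgt     = sample(100),
  year    = sample(2006:2007)
)
print(weighted_gender_share(Data))
```

Changing the margin argument changes what the percentages are relative to; for example, margin = c(2, 4) would give shares within each country and year instead.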