R: proportion table by group

My data.frame is structured as follows:

location               gender        15.19     20.30      31.40      41.64      65.
New York                Female          2         41         13         19        1
New York                  Male          1         23         15         17        2
San Francisco           Female          1         27         14         14        3
San Francisco             Male          4         24         14         10        1
Mexico City             Female          1         40         26         11        3
Mexico City               Male          4         23         35          8        3
Paris                   Female          2         12         10          6        0
Paris                     Male          1         20         13         11        1
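…and I need to convert it into a table of proportions, where each cell is expressed as a share of the two rows for its city. Here is one solution, but is there a simpler way to handle several columns at once (transforming them in place rather than generating new columns)?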

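Edit: the correct output gives each cell's share of all the cells for its city, so that all cells sharing the location "New York" add up to 1, and all cells sharing the location "San Francisco" add up to 1. For example: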

 location             gender        15.19     20.30      31.40      41.64        65.
 New York             Female          .01       .31        .1         .14        .01
 New York               Male          .01       .17       .11         .13        .02
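One way to get the output described above in base R, as a minimal sketch: sum the age-group counts in each row, total those sums per location with ave(), and divide every row by its city total. The names age_cols, city_totals and props are introduced here for illustration only, and the data frame is rebuilt from the counts shown in the question.

```r
# Rebuild the example data from the question.
df <- data.frame(
  location = rep(c("New York", "San Francisco", "Mexico City", "Paris"), each = 2),
  gender   = rep(c("Female", "Male"), times = 4),
  X15.19   = c(2L, 1L, 1L, 4L, 1L, 4L, 2L, 1L),
  X20.30   = c(41L, 23L, 27L, 24L, 40L, 23L, 12L, 20L),
  X31.40   = c(13L, 15L, 14L, 14L, 26L, 35L, 10L, 13L),
  X41.64   = c(19L, 17L, 14L, 10L, 11L, 8L, 6L, 11L),
  X65.     = c(1L, 2L, 3L, 1L, 3L, 3L, 0L, 1L),
  stringsAsFactors = FALSE
)

# Hypothetical helper name: every column except the two identifier columns.
age_cols <- setdiff(names(df), c("location", "gender"))

# Total count per city, repeated on every row that belongs to that city.
city_totals <- ave(rowSums(df[age_cols]), df$location, FUN = sum)

# Divide each row by its city total; location and gender stay untouched.
props <- df
props[age_cols] <- df[age_cols] / city_totals

# Rounded view for comparison with the example table.
print(cbind(props[c("location", "gender")], round(props[age_cols], 2)))
```

Because the division overwrites the existing age-group columns, no extra columns are created and the gender column does not have to be bound back on afterwards.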

The gender column can then be bound to a1:

do.call(cbind, list(gender = df$gender, a1))
Data:

dput(df)
structure(list(location = c("New York", "New York", "San Francisco", 
"San Francisco", "Mexico City", "Mexico City", "Paris", "Paris"
), gender = c("Female", "Male", "Female", "Male", "Female", "Male", 
"Female", "Male"), X15.19 = c(2L, 1L, 1L, 4L, 1L, 4L, 2L, 1L), 
    X20.30 = c(41L, 23L, 27L, 24L, 40L, 23L, 12L, 20L), X31.40 = c(13L, 
    15L, 14L, 14L, 26L, 35L, 10L, 13L), X41.64 = c(19L, 17L, 
    14L, 10L, 11L, 8L, 6L, 11L), X65. = c(1L, 2L, 3L, 1L, 3L, 
    3L, 0L, 1L)), .Names = c("location", "gender", "X15.19", 
"X20.30", "X31.40", "X41.64", "X65."), row.names = c(NA, -8L), class = c("data.table", 
"data.frame"), .internal.selfref = <pointer: 0x0000000000200788>)

Could you give an example of the desired output? You could use this instead:
setDT(df)[, (selected_cols) := prop.table(.SD), by = location, .SDcols = selected_cols]
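The one-liner above leaves selected_cols undefined. A fuller sketch follows, assuming selected_cols is meant to be the five age-group columns and that the data.table package is installed. The as.numeric() step is an addition that is not in the original suggestion: the counts in the dput are integers, and an integer column cannot hold fractional proportions when it is updated by group. The table is rebuilt here under the name dt so the block runs on its own.

```r
library(data.table)

# Rebuild the example data from the question as a data.table.
dt <- data.table(
  location = rep(c("New York", "San Francisco", "Mexico City", "Paris"), each = 2),
  gender   = rep(c("Female", "Male"), times = 4),
  X15.19   = c(2L, 1L, 1L, 4L, 1L, 4L, 2L, 1L),
  X20.30   = c(41L, 23L, 27L, 24L, 40L, 23L, 12L, 20L),
  X31.40   = c(13L, 15L, 14L, 14L, 26L, 35L, 10L, 13L),
  X41.64   = c(19L, 17L, 14L, 10L, 11L, 8L, 6L, 11L),
  X65.     = c(1L, 2L, 3L, 1L, 3L, 3L, 0L, 1L)
)

# Assumption: selected_cols is every column except the two identifiers.
selected_cols <- setdiff(names(dt), c("location", "gender"))

# Convert the integer counts to double so the columns can store fractions.
dt[, (selected_cols) := lapply(.SD, as.numeric), .SDcols = selected_cols]

# Within each location, divide every cell by the sum of all selected cells.
dt[, (selected_cols) := prop.table(.SD), by = location, .SDcols = selected_cols]

print(dt)
```

Since := updates the columns by reference, gender stays in place and there is nothing to bind back afterwards.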