R: Looping over subsets of a data frame based on two conditions

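I have the following problem: I need to go through every subset of a data frame (the subsets are defined by the value of one variable) and create a new entry in another variable based on two conditions.

The data frame (dt3) looks like this: it has 4 variables (birth year, surname -Name-, role in the household -role- and household -hh-). The whole set is split, or subset, by the variable hh, which groups all individuals living in the same household. For example, in the sample below the first 4 rows belong to household "1". Under the variable role, only the head of household is recorded. The remaining roles are empty and have to be derived, which is what I am trying to do. My first step is to assign the role "children". I would like to do this by looping over the whole data set and over each subset (each value of hh). If the person in a row has the same surname as the head of household and was born at least 15 years after the head, that person is inferred to be "children".

The original data frame is: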

birth_year       Name           role        hh
1877        Snijders    Head of household   1
1885        Marteen     NA                  1
1897        Snijders    NA                  1
1892        Zelstra     NA                  1
1878        Kuipers     Head of household   2
1870        Marteen     NA                  2
1897        Wals        NA                  2
1900        Venstra     NA                  2
1900        Lippe       Head of household   3
1905        Flachs      NA                  3
1920        Lippe       NA                  3
1922        Lippe       NA                  3
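So I need to go through the whole set and each hh subset and apply the following two conditions:
A. the person's Name == the head's Name, and
B. the person's birth year is 15 or more years after the head's.

If both hold, the person is "children".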

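So far I have tried a few things. With the head of household placed in the first row of each household, I did the following:

(a) Nested loops, where I tried to run through the data set and then through each hh. For each hh, I applied the conditions by comparing the Name and birth year of each row with those of the first row of the hh, i.e. the head.

(b) The same idea, but using a list. I first split dt3 by the hh variable: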

dt3 <- split(dt3, f = dt3$hh)
Neither of the two approaches worked. The result I expect is the following:

birth_year       Name           role        hh
1877        Snijders    Head of household   1
1885        Marteen     NA                  1
1897        Snijders    children            1
1892        Zelstra     NA                  1
1878        Kuipers     Head of household   2
1870        Marteen     NA                  2
1897        Wals        NA                  2
1900        Venstra     NA                  2
1900        Lippe       Head of household   3
1905        Flachs      NA                  3
1920        Lippe       children            3
1922        Lippe       children            3
Any hints are welcome.

Thanks in advance.

You can first extract all "HeadOfHousehold" rows, merge them into your dt3, and then compare the names and birth years:

dt3 <- read.table(header=T, text="birth_year      Name           role        hh
1877        Snijders    HeadOfHousehold    1
1885        Marteen     NA                  1
1897        Snijders    NA                  1
1892        Zelstra     NA                  1
1878        Kuipers     HeadOfHousehold   2
1870        Marteen     NA                  2
1897        Wals        NA                  2
1900        Venstra     NA                  2
1900        Lippe       HeadOfHousehold   3
1905        Flachs      NA                  3
1920        Lippe       NA                  3
1922        Lippe       NA                  3", as.is = T)


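# Extract the head of each household: a = head's birth year, b = head's name
tt <- with(dt3[!is.na(dt3$role) & dt3$role == "HeadOfHousehold", ],
           data.frame(a = birth_year, b = Name, hh))
# Attach the head's values to every member of the same household
me <- merge(dt3, tt, by = "hh", all.x = TRUE)
# Same name as the head and born at least 15 years later
me$role[me$Name == me$b & me$birth_year > me$a + 14] <- "children"
me[names(dt3)]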

   birth_year     Name            role hh
1        1877 Snijders HeadOfHousehold  1
2        1885  Marteen            <NA>  1
3        1897 Snijders        children  1
4        1892  Zelstra            <NA>  1
5        1878  Kuipers HeadOfHousehold  2
6        1870  Marteen            <NA>  2
7        1897     Wals            <NA>  2
8        1900  Venstra            <NA>  2
9        1900    Lippe HeadOfHousehold  3
10       1905   Flachs            <NA>  3
11       1920    Lippe        children  3
12       1922    Lippe        children  3

The following may be faster:

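You can first order by hh and by role != "HeadOfHousehold", which puts the head of household in the first row of each household (something you have already done, though perhaps in a different way). Then use ave per hh to test whether the name equals the head's name and whether the birth year is more than 14 years later: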

dt3 <- read.table(header=T, text="birth_year      Name           role        hh
1877        Snijders    HeadOfHousehold    1
1885        Marteen     NA                  1
1897        Snijders    NA                  1
1892        Zelstra     NA                  1
1878        Kuipers     HeadOfHousehold   2
1870        Marteen     NA                  2
1897        Wals        NA                  2
1900        Venstra     NA                  2
1900        Lippe       HeadOfHousehold   3
1905        Flachs      NA                  3
1920        Lippe       NA                  3
1922        Lippe       NA                  3", as.is = T)

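# Sort so that the head of household is the first row within each hh
dt3 <- dt3[with(dt3, order(hh, role != "HeadOfHousehold")), ]
# Per hh, compare each row with the first row (the head); ave() returns the
# name test as character, hence as.logical()
dt3$role[with(dt3, as.logical(ave(Name, hh, FUN = function(x) x == x[1])) &
                   ave(birth_year, hh, FUN = function(x) x > (x[1] + 14)))] <- "children"
dt3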

   birth_year     Name            role hh
1        1877 Snijders HeadOfHousehold  1
2        1885  Marteen            <NA>  1
3        1897 Snijders        children  1
4        1892  Zelstra            <NA>  1
5        1878  Kuipers HeadOfHousehold  2
6        1870  Marteen            <NA>  2
7        1897     Wals            <NA>  2
8        1900  Venstra            <NA>  2
9        1900    Lippe HeadOfHousehold  3
10       1905   Flachs            <NA>  3
11       1920    Lippe        children  3
12       1922    Lippe        children  3
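
The ave call above packs both tests into one expression. A more verbose sketch of the same idea, assuming the head is already sorted to the first row of each hh, first copies the head's name and birth year onto every row and only then compares. The names head_name, head_year and is_child are hypothetical and not part of the original answer.

```r
dt3 <- read.table(header = TRUE, as.is = TRUE, text = "birth_year Name role hh
1877 Snijders HeadOfHousehold 1
1885 Marteen  NA              1
1897 Snijders NA              1
1892 Zelstra  NA              1
1878 Kuipers  HeadOfHousehold 2
1870 Marteen  NA              2
1897 Wals     NA              2
1900 Venstra  NA              2
1900 Lippe    HeadOfHousehold 3
1905 Flachs   NA              3
1920 Lippe    NA              3
1922 Lippe    NA              3")

# Put the head of household first within each hh (rows with NA role sort last)
dt3 <- dt3[with(dt3, order(hh, role != "HeadOfHousehold")), ]

# Copy the first row's value (the head's) onto every row of the same hh
head_name <- ave(dt3$Name, dt3$hh, FUN = function(x) x[1])
head_year <- ave(dt3$birth_year, dt3$hh, FUN = function(x) x[1])

# Same surname as the head and born at least 15 years later
is_child <- dt3$Name == head_name & dt3$birth_year >= head_year + 15
dt3$role[which(is_child)] <- "children"
dt3
```

Because FUN returns values of the same type as its input here, no as.logical() conversion is needed.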

You can also simply use a for loop, like:

dt3 <- read.table(header=T, text="birth_year      Name           role        hh
1877        Snijders    HeadOfHousehold    1
1885        Marteen     NA                  1
1897        Snijders    NA                  1
1892        Zelstra     NA                  1
1878        Kuipers     HeadOfHousehold   2
1870        Marteen     NA                  2
1897        Wals        NA                  2
1900        Venstra     NA                  2
1900        Lippe       HeadOfHousehold   3
1905        Flachs      NA                  3
1920        Lippe       NA                  3
1922        Lippe       NA                  3", as.is = T)

# Sort so that the head of household is the first row within each hh
dt3 <- dt3[with(dt3, order(hh, role != "HeadOfHousehold")), ]

for (i in seq_len(nrow(dt3))) {
    if (!is.na(dt3$role[i]) && dt3$role[i] == "HeadOfHousehold") {
        # Remember the current head's household, name and birth year
        hh <- dt3$hh[i]
        Name <- dt3$Name[i]
        birth_year <- dt3$birth_year[i]
    } else if (hh == dt3$hh[i] && Name == dt3$Name[i] && dt3$birth_year[i] > birth_year + 14) {
        dt3$role[i] <- "children"
    }
}

dt3

   birth_year     Name            role hh
1        1877 Snijders HeadOfHousehold  1
2        1885  Marteen            <NA>  1
3        1897 Snijders        children  1
4        1892  Zelstra            <NA>  1
5        1878  Kuipers HeadOfHousehold  2
6        1870  Marteen            <NA>  2
7        1897     Wals            <NA>  2
8        1900  Venstra            <NA>  2
9        1900    Lippe HeadOfHousehold  3
10       1905   Flachs            <NA>  3
11       1920    Lippe        children  3
12       1922    Lippe        children  3
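
The list-based attempt from the question (splitting dt3 by hh) can probably be completed as well. The following is only a sketch: label_children is a hypothetical helper, and the split result is stored in by_hh instead of overwriting dt3, so the original data frame stays available.

```r
dt3 <- read.table(header = TRUE, as.is = TRUE, text = "birth_year Name role hh
1877 Snijders HeadOfHousehold 1
1885 Marteen  NA              1
1897 Snijders NA              1
1892 Zelstra  NA              1
1878 Kuipers  HeadOfHousehold 2
1870 Marteen  NA              2
1897 Wals     NA              2
1900 Venstra  NA              2
1900 Lippe    HeadOfHousehold 3
1905 Flachs   NA              3
1920 Lippe    NA              3
1922 Lippe    NA              3")

# Hypothetical helper: label the children within one household's rows
label_children <- function(d) {
  head_row <- which(d$role == "HeadOfHousehold")[1]  # first head found, NA if none
  if (is.na(head_row)) return(d)                     # no head recorded: leave unchanged
  is_child <- is.na(d$role) &
    d$Name == d$Name[head_row] &
    d$birth_year >= d$birth_year[head_row] + 15
  d$role[which(is_child)] <- "children"
  d
}

by_hh  <- split(dt3, dt3$hh)                          # one data frame per household
result <- do.call(rbind, lapply(by_hh, label_children))
rownames(result) <- NULL                              # drop the "1.1", "1.2", ... row names
result
```

This version does not rely on the head being the first row of its household, but the rows come back grouped by hh, which matches the original order only when the input is already sorted by hh.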

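Thank you very much for your time, @user10488504. The only thing to note is that my table has many rows (dt3: 113000; tt: 12400), so this merge takes a long time.

Thank you very much for your generosity, @user10488504, and for the second approach. What can actually be done is to merge by hh only, since I am only interested in the two conditions within each household; that way the merge is instant.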
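
If the merge is the slow step, one option is to skip it and look up each person's head of household with match() on hh. This is a sketch that has not been benchmarked on a table of that size; heads, idx and is_child are hypothetical names, and it assumes at most one head per household (match() uses the first one it finds).

```r
dt3 <- read.table(header = TRUE, as.is = TRUE, text = "birth_year Name role hh
1877 Snijders HeadOfHousehold 1
1885 Marteen  NA              1
1897 Snijders NA              1
1892 Zelstra  NA              1
1878 Kuipers  HeadOfHousehold 2
1870 Marteen  NA              2
1897 Wals     NA              2
1900 Venstra  NA              2
1900 Lippe    HeadOfHousehold 3
1905 Flachs   NA              3
1920 Lippe    NA              3
1922 Lippe    NA              3")

# One row per head of household
heads <- dt3[which(dt3$role == "HeadOfHousehold"), ]

# For every person, the position of their household's head in `heads`
idx <- match(dt3$hh, heads$hh)

# Same surname as the head and born at least 15 years later
is_child <- is.na(dt3$role) &
  dt3$Name == heads$Name[idx] &
  dt3$birth_year >= heads$birth_year[idx] + 15
dt3$role[which(is_child)] <- "children"
dt3
```

The rows of dt3 keep their original order, and households without a recorded head are left untouched because which() drops the resulting NA comparisons.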