R: How to define a new column based on 2 grouping columns and 1 value column


I have 3 columns:

household   persons   activity
  1       1        home
  1       1         shopping
  1       1        home
  1       1         eating
  1       1         work
  1       1        shopping
  1       1         home
  1       2         home
  1       2          shopping
  1       2         home
  2       1         home
  2       1         eating
  2       1         home
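The first column is the household index and the second is the person within that household. Each person's activities start at home. For each person in each household, I want to define a column loop that starts at 1 and changes to loop + 1 for the activities that follow a home or work activity. For example, in the data below the third row is home, so loop = 2 from the fourth row on; the fifth row is work, so loop = 3 after work.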

Output

household   persons   activity      loop
  1       1        home               1
  1       1         shopping          1 
  1       1        home               1
  1       1         eating            2
  1       1         work              2
  1       1        shopping           3
  1       1         home              3
  1       2         home              1 
  1       2          shopping         1
  1       2         home              1
  2       1         home              1
  2       1         eating            1
  2       1         home              1

Here is one idea. We can use the rleid (data.table), fill (tidyr), and lead (dplyr) functions to create loop.

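library(dplyr)
library(tidyr)       # fill()
library(data.table)  # rleid()

dat2 <- dat %>%
  # keep only home/work; every other activity becomes NA
  mutate(activity2 = replace(activity, !activity %in% c("home", "work"), NA)) %>%
  group_by(household, persons) %>%
  # carry the last home/work forward within each person
  fill(activity2) %>%
  # number the runs of activity2 and take the value from the next row
  mutate(loop = lead(rleid(activity2))) %>%
  # the last row of each person has no next row; reuse the previous value
  fill(loop) %>%
  ungroup() %>%
  select(-activity2)
dat2
# A tibble: 13 x 4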
#    household persons activity  loop
#        <int>   <int> <chr>    <int>
#  1         1       1 home         1
#  2         1       1 shopping     1
#  3         1       1 home         1
#  4         1       1 eating       2
#  5         1       1 work         2
#  6         1       1 shopping     3
#  7         1       1 home         3
#  8         1       2 home         1
#  9         1       2 shopping     1
# 10         1       2 home         1
# 11         2       1 home         1
# 12         2       1 eating       1
# 13         2       1 home         1
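To see why the pipeline produces these values, it can help to keep the intermediate columns for a single person. The following is only a sketch: the column names anchor and run are illustrative and not part of the original answer, and the data is household 1, person 1 copied from the question.

```r
library(dplyr)
library(tidyr)
library(data.table)  # only used for rleid()

# One person's activities (household 1, person 1 from the question)
one <- data.frame(
  activity = c("home", "shopping", "home", "eating",
               "work", "shopping", "home"),
  stringsAsFactors = FALSE
)

steps <- one %>%
  # keep only home/work; everything else becomes NA
  mutate(anchor = replace(activity, !activity %in% c("home", "work"), NA)) %>%
  # carry the last home/work forward over the NA rows
  fill(anchor) %>%
  # number the runs of equal anchors, then read the next row's run number
  mutate(run = rleid(anchor),
         loop = lead(run)) %>%
  # the final row has no next row, so reuse the previous value
  fill(loop)

steps
```

Because rleid() numbers runs of equal values, a sequence such as home, eating, home with no work in between stays in a single run, so the counter would not advance there. On the question's data this does not matter, but it is worth checking against your own rule.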
Data

dat <- read.table(text = "household   persons   activity
  1       1        home
  1       1         shopping
  1       1        home
  1       1         eating
  1       1         work
  1       1        shopping
  1       1         home
  1       2         home
  1       2          shopping
  1       2         home
  2       1         home
  2       1         eating
  2       1         home",
                  stringsAsFactors = FALSE, header = TRUE)

Another option, assuming the first activity is always home or work:

DT[, loop := shift(cumsum(activity %chin% c('home','work')), fill=1L), 
    .(household, persons)]
Output:

    household persons activity loop
 1:         1       1     home    1
 2:         1       1 shopping    1
 3:         1       1     home    1
 4:         1       1   eating    2
 5:         1       1     work    2
 6:         1       1 shopping    3
 7:         1       1     home    3
 8:         1       2     home    1
 9:         1       2 shopping    1
10:         1       2     home    1
11:         2       1     home    1
12:         2       1   eating    1
13:         2       1     home    1
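The definition of DT did not survive on this page, so here is a self-contained sketch of the same idea. It assumes DT is simply the question's data as a data.table, and it splits the one-liner into two steps; the helper column hits is illustrative and not part of the original answer.

```r
library(data.table)

# Assumption: DT holds the question's data; the original definition is not shown
DT <- data.table(
  household = rep(c(1L, 2L), c(10L, 3L)),
  persons   = rep(c(1L, 2L, 1L), c(7L, 3L, 3L)),
  activity  = c("home", "shopping", "home", "eating", "work", "shopping", "home",
                "home", "shopping", "home",
                "home", "eating", "home")
)

# Running count of home/work rows within each person
DT[, hits := cumsum(activity %chin% c("home", "work")), by = .(household, persons)]

# Shift the count down one row so the increment lands on the row AFTER home/work;
# the first row of each person has nothing before it and is filled with 1
DT[, loop := shift(hits, fill = 1L), by = .(household, persons)]

DT[, hits := NULL]  # drop the helper column
DT
```

The shift is what makes the counter change on the row following a home or work activity rather than on that row itself, and fill = 1L encodes the assumption that every person starts in loop 1.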
Data:

library(data.table)

You said "change when the activity is home or work." Row six is shopping, not home or work, so by your rule it does not need to change.

It changes at work and then does not change again until the next work or home, or until that person's rows end. In row five we have work; after work we have loop + 1 until the next home or work.

I see that I made a mistake in my code. In the replace function, eating should be work. I have updated my answer.

Now I understand what you mean. Please see my update.