Label rows with unique combinations of two columns in an R data.frame

r dataframe

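I have a data.frame with two columns (date and station), and I want to add a new column that labels each unique combination of these two values.

What I have: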

df
       date station  
1    april     GF3   
2 december     GF1    
3    april     GF2   
4    april     GF3     
5 december     GF1

What I want:

df2
       date station   Label
1    april     GF3      1
2 december     GF1      2 
3    april     GF2      3
4    april     GF3      1  
5 december     GF1      2

Thanks

Paste the values together, then use match() against unique() to create a group number for each combination:

vals <- paste(df$date, df$station)
df$label <- match(vals, unique(vals))

#      date station label
#1    april     GF3     1
#2 december     GF1     2
#3    april     GF2     3
#4    april     GF3     1
#5 december     GF1     2
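
The same idea can be wrapped up so it works for any set of columns. The following is a minimal sketch, not part of the original answer: the helper name label_combinations and the "\r" separator are assumptions chosen for illustration. A separator that is unlikely to occur in the data avoids two different combinations pasting to the same string (for example "a b" + "c" versus "a" + "b c" with the default space).

```r
# Rebuild the example data from the question
df <- data.frame(
  date    = c("april", "december", "april", "april", "december"),
  station = c("GF3", "GF1", "GF2", "GF3", "GF1"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: number each unique combination of `cols`
# in order of first appearance.
label_combinations <- function(data, cols, sep = "\r") {
  # Paste the chosen columns row-wise into one key per row;
  # unname() keeps column names from being read as paste() arguments
  key <- do.call(paste, c(unname(as.list(data[cols])), sep = sep))
  # Position of each key among the unique keys is its group number
  match(key, unique(key))
}

df$label <- label_combinations(df, c("date", "station"))
df$label
# [1] 1 2 3 1 2
```

Because match() looks each key up in unique(key), the numbering follows the order in which combinations first appear, which is what the question asks for.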

A dplyr approach using a left join:

library(dplyr)   # %>%, left_join(), distinct()
library(tibble)  # tribble(), rowid_to_column()

d <- tribble(~date, ~station,
             "april","GF3",
             "december","GF1",    
             "april","GF2",   
             "april","GF3", 
             "december","GF1")

d %>% left_join(
  d %>% distinct(date, station) %>% 
    rowid_to_column(),
  by = c("station", "date")
)

The result is:

  date     station rowid
  <chr>    <chr>   <int>
1 april    GF3         1
2 december GF1         2
3 april    GF2         3
4 april    GF3         1
5 december GF1         2

dplyr's dense_rank() also works:

df %>% mutate(Label = dense_rank(paste(date, station)))

      date station Label
1    april     GF3     2
2 december     GF1     3
3    april     GF2     1
4    april     GF3     2
5 december     GF1     3

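However, the labels then follow the alphabetical order of the pasted values rather than the order in which each combination first appears.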
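
To make the difference in ordering concrete, here is a small base R sketch (an illustration added here, not taken from the answers). It assumes the example data from the question and uses factor levels to reproduce both numberings: default levels are sorted, which mirrors the ranking behaviour, while levels = unique(key) keeps first-appearance order.

```r
# Example data from the question
df <- data.frame(
  date    = c("april", "december", "april", "april", "december"),
  station = c("GF3", "GF1", "GF2", "GF3", "GF1"),
  stringsAsFactors = FALSE
)

key <- paste(df$date, df$station)

# Default factor levels are sorted, so "april GF2" becomes group 1
alphabetical <- as.integer(factor(key))
alphabetical
# [1] 2 3 1 2 3

# Fixing the levels to unique(key) numbers groups by first appearance
first_seen <- as.integer(factor(key, levels = unique(key)))
first_seen
# [1] 1 2 3 1 2
```

If the numbering only needs to distinguish groups, either version is fine; if the labels must match the output requested in the question, the first-appearance version is the one to use.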

Please make your data easier for others to access. See