R: differentiate IDs based on another variable
In this example, when rows share the same value in column Id but differ in the variable Art, I need to tell them apart by appending a letter to the Id. Like this:
Id<-c("RoLu1976", "RoLu1976", "AlBlKyFy1989", "ThSa1996", "AlBlKyFy1989","ThSa1996")
Art<-c("Econometric Policy Evaluation", "Policy Right", "Rules", "Expectations", "Nonneutrality of money","Expectations")
Yr<-c(1976, 1976, 1989, 1996, 1989, 1996)
df<-data.frame(Id,Art,Yr)
In this case, some rows share the same Id (for example RoLu1976) but have different values in the Art column.
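To make the condition concrete, here is a small base R sketch that counts the distinct Art values per Id on the example data. The name `n_art` is only an illustrative helper, not something from the question.

```r
Id  <- c("RoLu1976", "RoLu1976", "AlBlKyFy1989", "ThSa1996", "AlBlKyFy1989", "ThSa1996")
Art <- c("Econometric Policy Evaluation", "Policy Right", "Rules", "Expectations",
         "Nonneutrality of money", "Expectations")
Yr  <- c(1976, 1976, 1989, 1996, 1989, 1996)
df  <- data.frame(Id, Art, Yr, stringsAsFactors = FALSE)

# number of distinct Art values for each Id (n_art is a hypothetical helper name)
n_art <- tapply(df$Art, df$Id, function(a) length(unique(a)))

# Ids with more than one distinct Art are the ones that need a letter
needs_letter <- names(n_art)[n_art > 1]
```

Only the Ids in `needs_letter` should be changed; ThSa1996 has a single Art value and should stay as it is.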
Using the dplyr package:
This could be shortened, but I hope it is very clear to read this way.
# A tibble: 6 x 3
  Id            Art                            Yr
  <chr>         <fctr>                         <dbl>
1 RoLu1976a     Econometric Policy Evaluation  1976
2 RoLu1976b     Policy Right                   1976
3 AlBlKyFy1989a Rules                          1989
4 ThSa1996      Expectations                   1996
5 AlBlKyFy1989b Nonneutrality of money         1989
6 ThSa1996      Expectations                   1996
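The dplyr code that produced this output is not preserved in the thread. The following is a minimal sketch of one way to get the same Ids, not necessarily what the answer used. The helper columns `art_rank` and `new_id` are names made up for illustration, and the data frame is built with `stringsAsFactors = FALSE`, so Art is a character column rather than a factor.

```r
library(dplyr)

Id  <- c("RoLu1976", "RoLu1976", "AlBlKyFy1989", "ThSa1996", "AlBlKyFy1989", "ThSa1996")
Art <- c("Econometric Policy Evaluation", "Policy Right", "Rules", "Expectations",
         "Nonneutrality of money", "Expectations")
Yr  <- c(1976, 1976, 1989, 1996, 1989, 1996)
df  <- data.frame(Id, Art, Yr, stringsAsFactors = FALSE)

res <- df %>%
  group_by(Id) %>%
  mutate(
    # position of each Art among the distinct Art values of this Id
    art_rank = match(Art, unique(Art)),
    # append a letter only when this Id has more than one distinct Art
    new_id = if (n_distinct(Art) > 1) paste0(Id, letters[art_rank]) else Id
  ) %>%
  ungroup() %>%
  mutate(Id = new_id) %>%
  select(Id, Art, Yr)
```

Rows with the same Id and the same Art get the same letter, and an Id with only one distinct Art is left untouched. The sketch assumes no Id has more than 26 distinct Art values, since it indexes into `letters`.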
Using a for loop:
df$Id <- as.character(df$Id)
# loop through Ids
for (id in unique(df$Id)) {
  sub <- unique(df[df$Id == id, ])
  # check if this Id needs to be manipulated
  if (nrow(sub) > 1) {
    # assign unique Ids
    for (j in seq_len(nrow(sub))) {
      sub[j, 1] <- paste0(sub[j, 1], letters[j])
    }
    # replace old Ids with new Ids
    df[df$Id == id, ] <- sub
  }
}
Comments:
The Ids RoLu1976 and AlBlKyFy1989 have different values in the Art column. They should get different letters, not the same one. The Id ThSa1996 has the same value in the Art column. It should not get a letter.
Maybe I don't understand what you want. Shouldn't both ThSa1996 rows get the same letter, since they both have the same Art value?
Because ThSa1996 has the same value in Art, it should not be marked with a letter. It should stay the same, i.e. ThSa1996. That is what I'm looking for.
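A data.table solution:
library(data.table)
setDT(df)  # convert df to a data.table by reference
# number the rows of each Id from 1 to the count of distinct Art values
df[, tmp := seq(uniqueN(Art)), by = Id]
# an (Id, Art) pair that occurs more than once gets no letter; otherwise use the letter at position tmp
df[, addition := ifelse(.N > 1, "", letters[tmp]), by = .(Id, Art)]
df[, Id := paste0(Id, addition)]
# drop the helper columns
df[, c("tmp", "addition") := NULL]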
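One caveat, as far as I can tell: the data.table code above relies on the shape of the example data. An Id that occurs in only one row would have `.N == 1` in its (Id, Art) group and so would receive the letter "a". The sketch below is a hedged variant that decides per Id instead. The column name `suffix` and the extra row `Solo2000` are hypothetical and only added to show the single-row case.

```r
library(data.table)

# Example data from the question, plus one hypothetical Id ("Solo2000") that occurs only once
dt <- data.table(
  Id  = c("RoLu1976", "RoLu1976", "AlBlKyFy1989", "ThSa1996", "AlBlKyFy1989", "ThSa1996",
          "Solo2000"),
  Art = c("Econometric Policy Evaluation", "Policy Right", "Rules", "Expectations",
          "Nonneutrality of money", "Expectations", "Single paper"),
  Yr  = c(1976, 1976, 1989, 1996, 1989, 1996, 2000)
)

# Per Id: the letter is the position of Art among that Id's distinct Art values,
# or an empty string when the Id has only one distinct Art
dt[, suffix := if (uniqueN(Art) > 1) letters[match(Art, unique(Art))] else "", by = Id]
dt[, Id := paste0(Id, suffix)]
# drop the helper column
dt[, suffix := NULL]
```

With this variant ThSa1996 and the hypothetical Solo2000 keep their original Ids, while RoLu1976 and AlBlKyFy1989 get the suffixes a and b.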