A good way to decrease column values based on changes in another column (R, data.table)
Tags: r, loops, math, data.table
I'm looking for a simple, convenient way to decrease the values in column 4 according to the change in column 3, for each country in column 1 and each year in column 2. This should be done in R with a data.table object. The input data and the desired output are listed further down.
Here is an approach using dplyr and tidyr:
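library(tidyverse)

data %>%
  # split the single comma-separated column into four named columns
  separate(1, sep = ",", into = c("country","year","var1","var2")) %>%
  # convert year, var1 and var2 from character to numeric
  mutate(across(year:var2, as.numeric)) %>%
  group_by(country) %>%
  # scale var2 down as var1 rises above the country's minimum
  mutate(var2 = var2 * ((2*min(var1))-var1)/100)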
## A tibble: 6 x 4
## Groups: country [2]
# country year var1 var2
# <chr> <dbl> <dbl> <dbl>
#1 country1 2020 100 1
#2 country1 2025 120 0.8
#3 country1 2030 140 0.6
#4 country2 2020 100 1
#5 country2 2025 150 0.5
#6 country2 2030 180 0.2
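To see what the multiplier in that mutate() call does, here is a small base R check using the country2 values from the output above. The names factor_answer and factor_general are made up for illustration, and the scale-free form is an assumption about the intent rather than something the answer states: the hard-coded 100 happens to equal min(var1) in this data, so the factor reads as "1 minus the relative increase over the minimum".

```r
# Values for country2 taken from the example data
var1 <- c(100, 150, 180)
var2 <- c(1, 1, 1)

# Factor as written in the answer; the divisor 100 equals min(var1) here
factor_answer <- (2 * min(var1) - var1) / 100

# Assumed scale-free equivalent: 1 minus the relative increase over the minimum
factor_general <- 1 - (var1 - min(var1)) / min(var1)

# Both forms agree on this data (all.equal tolerates floating-point noise)
stopifnot(isTRUE(all.equal(factor_answer, factor_general)))

# Reduced values for country2
reduced <- var2 * factor_general
print(reduced)
```

If the minimum of var1 were not 100, the two forms would differ, and the scale-free one is probably the safer choice.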
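What exactly do you mean by "decrease the values"? Could you show us the expected output and reformat your data with dput so that it is easy to load?
I have included my expected output.
But you haven't explained exactly how the two columns are related. For example, why do we go from .5 to .42 in the last row?
It's just an example. I don't know the exact relationship yet. I'm only looking for a good, handy way to decrease this value based on the other values. The exact parameters don't matter here. Ideally there would be one solution with linear decay, one with exponential decay, and one with random decay. That would be perfect, but it is not essential.
The code we have suggested implements one form of decay, so if you want a "complete" answer you need to specify a "complete" question.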
What I want is something like this:
country1,2020,100,1
country1,2025,120,0.8
country1,2030,140,0.6
country2,2020,100,1
country2,2025,150,0.5
country2,2030,180,0.2
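The same calculation with data.table:
library(data.table)
setDT(data)  # convert the data.frame to a data.table by reference
# split V1 on commas into four new columns, then drop V1
data[, c("country","year","var1","var2") := tstrsplit(V1, ",")]
data[, V1 := NULL]
# convert the numeric columns from character
data[, c("year","var1","var2") := lapply(.SD, as.numeric), .SDcols = c("year","var1","var2")]
# apply the reduction within each country
data[, var2 := var2 * (2*min(var1) - var1)/100, by = "country"]
data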
# country year var1 var2
#1: country1 2020 100 1.0
#2: country1 2025 120 0.8
#3: country1 2030 140 0.6
#4: country2 2020 100 1.0
#5: country2 2025 150 0.5
#6: country2 2030 180 0.2
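The comments ask for linear, exponential and random variants of the decay. The following is only a sketch of how those might look in data.table, not a definitive answer: the column names rel, var2_linear, var2_exp and var2_random, the rate parameter, and the choice of a uniform random multiplier in [0.5, 1.5] are all assumptions made for illustration. It builds its own copy of the example data so that it runs on its own.

```r
library(data.table)

# Example data from the question: var2 starts at 1 everywhere
dt <- data.table(
  country = rep(c("country1", "country2"), each = 3),
  year    = rep(c(2020, 2025, 2030), times = 2),
  var1    = c(100, 120, 140, 100, 150, 180),
  var2    = 1
)
setorder(dt, country, year)  # the first row per country is the baseline year

# Relative increase of var1 over each country's baseline value
dt[, rel := (var1 - var1[1]) / var1[1], by = country]

rate <- 1  # hypothetical tuning parameter: strength of the decay

# Linear decay, floored at 0 so values never go negative
dt[, var2_linear := pmax(var2 * (1 - rate * rel), 0)]

# Exponential decay: shrinks towards 0 but never reaches it
dt[, var2_exp := var2 * exp(-rate * rel)]

# Random decay (assumed form): linear decay with a noisy rate
set.seed(1)  # for reproducibility
dt[, var2_random := pmax(var2 * (1 - rate * rel * runif(.N, 0.5, 1.5)), 0)]

print(dt)
```

With rate set to 1, the linear variant reproduces the desired output from the question; the other two columns show alternative shapes that could be swapped in once the exact relationship is known.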
Data:
data <- structure(list(V1 = c("country1,2020,100,1", "country1,2025,120,1",
"country1,2030,140,1", "country2,2020,100,1", "country2,2025,150,1",
"country2,2030,180,1")), class = "data.frame", row.names = c(NA,
-6L))