R data.table: "No common type" error

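I'm trying to use data.table on a large dataset, but for testing I created a sub data.table (just a few rows). When I run my code on the sub data.table everything works, but when I run it on the large dataset it fails. Edit: I have since made the sub data.table larger, and it now gives the same error as before.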

In general, I created a custom function that computes a value based on another column:

all_poss[,scores:=list(custom_func(alloc_opt)),by=1:nrow(all_poss)]

As mentioned, I created a small "sub data.table" called testdf that contains the first two rows of all_poss. When I run the command

testdf[,scores:=list(custom_func(alloc_opt)),by=1:nrow(testdf)]

…the function runs fine and I get the expected output column. However, when I run this on all_poss, I get the following error:

 Error: No common type for `VANGUARD.TARGET.2050` <double> and `scores` <character>.
Run `rlang::last_error()` to see where the error occurred.

Custom function

custom_func <- function(alloc_opt){
  
  #THIS IS HOW YOU CALL VARIABLES IN DATATABLES
  opt <- as.numeric(as.character(alloc_opt))
  print(paste0("allocation_option_",opt))
  
  #get the different allocations for the option
  tempdf <- testdf[testdf$alloc_opt==opt]
  tempdf <- tempdf %>% pivot_longer(cols=-alloc_opt,names_to="investment",values_to="allocation")

  #incorporate the different metrics and values into tempdf (make sure items are formatted correctly)
  tempdf$investment <- gsub(" ",".",tempdf$investment)
  tempdf <- left_join(tempdf,investments_long,by=c("investment"="ID"))

  #incorporate the max values from "norm" into tempdf
  tempdf <- left_join(tempdf,norm,by=c("metric"="metric"))

  #incorporate weights from "weightings" into tempdf
  tempdf <- left_join(tempdf,weightings,by=c("metric"="metric"))

  #calculate nwa subscore for each investment/metric in tempdf
  tempdf <- tempdf %>%
    dplyr::mutate(nwa_subscore=(value/max)*(allocation/100)*weight)

  #calculate the nwa_score for each metric in tempdf
  tempdf2 <- tempdf %>%
   dplyr::group_by(metric) %>%
   dplyr::summarize(nwa_score=sum(nwa_subscore)) %>%
   dplyr::mutate(nwa_score=round(nwa_score,digits=4))

  nwa <- tempdf2$nwa_score
  nwa <- nwa[!is.na(nwa)]
  nwa <- paste(nwa,collapse="_")

  #calculate nwao_score
  nwao <- sum(tempdf$nwa_subscore,na.rm=TRUE)
  nwao <- round(nwao,digits=4)
  
  #combine nwa & nwao into one string
  all <- paste(nwa,nwao,sep="_")
  print(all)
  
  #return all the nwa and nwao scores as _-separated values
  return(as.character(all))
}

Output

Error: No common type for `VANGUARD.TARGET.2050` <double> and `scores` <character>.
Run `rlang::last_error()` to see where the error occurred.

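Without sample data there is not really a way to provide actionable help, but data.table (and frankly, most robust functions) will not silently coerce data from one class to another. The reason doing something similar with apply works is most likely that it silently coerces something from numeric to character. If that behavior is intentional, then do the conversion beforehand. If not, then this error has saved you.

I can definitely add a dataset. I hesitated because the original dataset is large (>1M rows), and the small dataset I created from it (just the first 2 rows of data) works fine when I run this function. Is some behind-the-scenes coercion happening on the small dataset but not on the large one? If it helps, in the error message "VANGUARD.TARGET.2050" is the leftmost column in the data.table, and "scores" is the new column I'm trying to create. I'm confused by the error message; do I need to change "VANGUARD.TARGET.2050" to "character" to get the function to run?

Most questions don't need more than 5-20 rows (4-10 columns) of data to make the point. To answer your concern about data.table, I don't know of anything inherent in it that would accidentally cause this. My guess is that custom_func is not deterministic: sometimes it returns numeric, sometimes it returns character. That may be what you need to investigate... if you post it here, we may be able to help.

Thanks for the suggestion. I've edited the post to include the data you need to reproduce it. In general, the function takes one row of data, pulls from other reference tables to calculate some values (the nwa and nwao scores), concatenates them into one big character string (where each number is separated by "_"), and returns that string.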
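The guess above, that custom_func might return numeric for some rows and character for others, can be checked directly. Here is a minimal base-R sketch of such a check. It uses a hypothetical toy_func in place of custom_func; the function and its inputs are made up purely for illustration.

```r
# Hypothetical stand-in for a function with an inconsistent return type:
# it returns a number for some inputs and a string for others.
toy_func <- function(x) if (x > 2) x else paste0("v_", x)

inputs <- 1:4

# Record the class of each result; more than one distinct class means
# the function is not type-stable across rows.
classes <- vapply(inputs, function(x) class(toy_func(x))[1], character(1))
print(table(classes))
type_stable <- length(unique(classes)) == 1

# Coercing explicitly makes the intended type unambiguous, and vapply()
# errors if any single result is not a length-1 character vector.
as_text <- vapply(inputs, function(x) as.character(toy_func(x)), character(1))
print(as_text)
```

Running the same kind of check over a sample of rows from the real table would show whether the return type actually varies.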

> dput(norm)
structure(list(metric = c("X10yrOrLOF_return", "exp_ratio", "mgsr_ratings_out_of_5", 
"mgsr_stdev", "intl_exp"), max = c(0.1719, 0.01185, 5, 20.21, 
0.99)), row.names = c(NA, -5L), class = "data.frame")

> dput(weightings)
structure(list(metric = c("X10yrOrLOF_return", "exp_ratio", "mgsr_ratings_out_of_5", 
"mgsr_stdev", "intl_exp"), weight = c(1, 0.5, 0.5, -0.5, 0.3)), class = "data.frame", row.names = c(NA, 
-5L))

testdf[,scores:=custom_func(alloc_opt),by=1:nrow(testdf)]

Error: No common type for `VANGUARD.TARGET.2050` <double> and `scores` <character>.
Run `rlang::last_error()` to see where the error occurred.
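
For what it's worth, the wording of this error resembles the messages raised by vctrs, which tidyr's pivot_longer() relies on when it stacks several columns into one value column; as far as I know it is not a message data.table produces itself. One possible reading (an assumption, not confirmed anywhere in this thread) is that custom_func reads testdf after := has added the character scores column to it, so cols=-alloc_opt tries to stack the double allocation columns together with a character column. A minimal sketch with made-up data, where toy, FUND.A and FUND.B are hypothetical:

```r
library(tidyr)

# Hypothetical stand-in for testdf once a character `scores` column exists.
toy <- data.frame(
  alloc_opt = 1,
  FUND.A = 60,          # made-up allocation columns (double)
  FUND.B = 40,
  scores = "0.1_0.2",   # character column
  stringsAsFactors = FALSE
)

# Pivoting everything except alloc_opt mixes double and character values,
# which pivot_longer() refuses to combine into a single column.
res <- tryCatch(
  pivot_longer(toy, cols = -alloc_opt,
               names_to = "investment", values_to = "allocation"),
  error = function(e) e
)
print(inherits(res, "error"))

# Leaving `scores` out of the pivot keeps the value column purely numeric.
long <- pivot_longer(toy, cols = -any_of(c("alloc_opt", "scores")),
                     names_to = "investment", values_to = "allocation")
print(long$investment)
```

If that is what is happening here, excluding scores from the pivot, or having custom_func receive the row's values as arguments instead of reading testdf, would avoid the clash. The exact error wording depends on the installed vctrs version.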