dplyr: applying a calculation across differently named columns


I am trying to compute the following formula:

Interest expense / (sum of Total Debt over all years / number of years)

The data looks like this:

                                GE2017    GE2016    GE2015    GE2014
Interest Expense              -2753000  -2026000  -1706000  -1579000
Long Term Debt               108575000 105080000 144659000 186596000
Short/Current Long Term Debt 134591000 136211000 197602000 261424000
Total_Debt                   243166000 241291000 342261000 448020000
                             GOOG2017 GOOG2016 GOOG2015 GOOG2014
Interest Expense              -109000  -124000  -104000  -101000
Long Term Debt                3943000  3935000  1995000  2992000
Short/Current Long Term Debt  3969000  3935000  7648000  8015000
Total_Debt                    7912000  7870000  9643000 11007000
                             NVDA2018 NVDA2017 NVDA2016 NVDA2015
Interest Expense               -61000   -58000   -47000   -46000
Long Term Debt                1985000  1985000     7000  1384000
Short/Current Long Term Debt  2000000  2791000  1434000  1398000
Total_Debt                    3985000  4776000  1441000  2782000

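That is, for GE, I am trying to divide the most recent year's interest expense, -2753000, by the average of GE's total debt over all four years. So:

    -2753000 / average(243166000, 241291000, 342261000, 448020000) = -0.0086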
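As a quick sanity check of that arithmetic, here is a minimal base R sketch. The variable names are only illustrative; the figures are GE's values from the table above. The ratio comes out negative because interest expense is recorded as a negative number.

```r
# GE figures copied from the table above, most recent year (2017) first
interest_expense <- c(-2753000, -2026000, -1706000, -1579000)
total_debt <- c(243166000, 241291000, 342261000, 448020000)

# Most recent interest expense divided by the mean total debt of all four years.
# mean() needs a single vector: mean(a, b, c, d) would only average `a`.
ratio_all <- interest_expense[1] / mean(total_debt)

round(ratio_all, 4)
```

This gives roughly -0.0086, which matches the figure quoted above apart from the sign.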

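However, when computing the average I ran into a problem with group_by(), because GE and the other companies have different column names (the years differ):

    library(dplyr)
    library(tibble)  # rownames_to_column()

    cost_of_debt %>%
      t() %>%
      data.frame() %>%
      rownames_to_column('rn') %>%
      group_by(rn)
    # Calculation here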

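Second, if possible, I would like to do the same calculation as above, but using only the last two years for each company:

    -2753000 / average(243166000, 241291000) = -0.0114

Could a grepl function work here? I have a vector called symbols.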

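For the first case, after turning the row names into a column (rownames_to_column from tibble), separate that column into 'firm' and 'year' by splitting at the boundary between the end of the firm name and the start of the year. Then group by 'firm' and create a 'New' column as the ratio of 'Interest.Expense' to the mean of 'Total_Debt'. For the second case, arrange by 'year', take the mean of the last two 'Total_Debt' values for each 'firm', and divide 'Interest.Expense' by it.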
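library(dplyr)
library(tidyr)   # separate()
library(tibble)  # rownames_to_column()

cost_of_debt %>%
  t() %>%                        # firm-years become rows, indicators become columns
  data.frame() %>%               # names become Interest.Expense, Total_Debt, ...
  rownames_to_column('rn')  %>%
  # split e.g. "GE2017" between the last letter and the first digit
  separate(rn, into = c("firm", "year"),
           sep = "(?<=[A-Z])(?=[0-9])", convert = TRUE) %>%
  group_by(firm) %>%
  # interest expense over the firm's mean total debt across all years
  mutate(New = Interest.Expense/mean(Total_Debt)) %>%
  arrange(firm, year) %>%
  # same numerator over the mean total debt of the two most recent years
  mutate(NewLast = Interest.Expense/mean(tail(Total_Debt, 2)))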
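To see what the firm/year split is meant to produce, the same decomposition can be sketched in base R. This swaps in two sub() calls in place of the lookaround regex passed to separate(); the vector nms is a hypothetical sample of the column names.

```r
# Hypothetical sample of the column names, e.g. from colnames(cost_of_debt)
nms <- c("GE2017", "GOOG2016", "NVDA2018")

# Drop the trailing digits to get the firm, drop the leading letters to get the year
firm <- sub("[0-9]+$", "", nms)
year <- as.integer(sub("^[A-Z]+", "", nms))

data.frame(firm, year)
```

This relies on the same assumption as the regex above: firm symbols consist only of capital letters and are followed directly by the year.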

I think you need to clean the data first, so that it is easier to see what is an observation and what is a variable (search for "tidy data"). Here is my solution: first I tidy the data, and then the calculation is simple and straightforward.
library(tidyverse)
library(stringr)
# (data definition truncated in the source; only its tail survives:  ), class = "data.frame") )
# Clean and make the data tidy
cost_of_debt <- cost_of_debt %>% 
  rownames_to_column(var = "indicator") %>%   # before as_tibble(): current tibble versions drop row names there
  as_tibble() %>% 
  mutate(indicator = str_replace_all(indicator, regex("\\s|\\/"), "_")) %>% 
  gather(k, value, -indicator) %>% 
  separate(k, into = c("company", "year"), sep = -4) %>%   # the last four characters are the year
  spread(indicator, value) %>% 
  rename_all(tolower)

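For the first case you don't need group_by, i.e. cost_of_debt %>% t() %>% data.frame() %>% rownames_to_column('rn') %>% mutate(New = Interest.Expense/Total_Debt)

Great, thanks! I am only interested in the latest period for each company, so 2017 for GE and GOOG, and 2018 for NVDA. I will extract the latest period into a new 'df'. Thanks again!

@user113156 You can filter, or just summarise.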
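The summarise route mentioned in the comment could look roughly like the sketch below. It is an illustration, not the commenter's exact code. The data frame is rebuilt from the figures printed in the question, and the names latest, all_years and last_two are made up for the example.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Rebuild the wide table from the figures printed in the question
cost_of_debt <- data.frame(
  GE2017   = c(-2753000, 108575000, 134591000, 243166000),
  GE2016   = c(-2026000, 105080000, 136211000, 241291000),
  GE2015   = c(-1706000, 144659000, 197602000, 342261000),
  GE2014   = c(-1579000, 186596000, 261424000, 448020000),
  GOOG2017 = c(-109000, 3943000, 3969000, 7912000),
  GOOG2016 = c(-124000, 3935000, 3935000, 7870000),
  GOOG2015 = c(-104000, 1995000, 7648000, 9643000),
  GOOG2014 = c(-101000, 2992000, 8015000, 11007000),
  NVDA2018 = c(-61000, 1985000, 2000000, 3985000),
  NVDA2017 = c(-58000, 1985000, 2791000, 4776000),
  NVDA2016 = c(-47000, 7000, 1434000, 1441000),
  NVDA2015 = c(-46000, 1384000, 1398000, 2782000),
  row.names = c("Interest Expense", "Long Term Debt",
                "Short/Current Long Term Debt", "Total_Debt")
)

# One row per firm: latest interest expense over mean total debt
latest <- cost_of_debt %>%
  t() %>%
  data.frame() %>%
  rownames_to_column("rn") %>%
  separate(rn, into = c("firm", "year"),
           sep = "(?<=[A-Z])(?=[0-9])", convert = TRUE) %>%
  group_by(firm) %>%
  arrange(year, .by_group = TRUE) %>%   # oldest to newest within each firm
  summarise(
    all_years = last(Interest.Expense) / mean(Total_Debt),
    last_two  = last(Interest.Expense) / mean(tail(Total_Debt, 2)),
    year      = last(year)              # 2017 for GE and GOOG, 2018 for NVDA
  )

latest
```

Because the rows are ordered by year within each firm before summarising, last() and tail() pick each firm's own most recent periods, so no year has to be hard-coded.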

The tidied data looks like this (first rows):

   company year  interest_expense long_term_debt short_current_long_term_debt total_debt
   <chr>   <chr>            <dbl>          <dbl>                        <dbl>      <dbl>
 1 GE      2014          -1579000      186596000                    261424000  448020000
 2 GE      2015          -1706000      144659000                    197602000  342261000
 3 GE      2016          -2026000      105080000                    136211000  241291000
 4 GE      2017          -2753000      108575000                    134591000  243166000
 5 GOOG    2014           -101000        2992000                      8015000   11007000

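cost_of_debt <- cost_of_debt %>%
  group_by(company) %>% 
  arrange(year, .by_group = TRUE) %>%   # oldest to newest within each company
  # tail() takes each company's own two most recent years;
  # hard-coding 2016 and 2017 would miss NVDA, whose data ends in 2018
  mutate(int_over_totdept4 = interest_expense / mean(total_debt),
         int_over_totdept2 = interest_expense / mean(tail(total_debt, 2)))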

   company year  interest_expense long_term_debt short_current_long_term_debt total_debt int_over_totdept4 int_over_totdept2
   <chr>   <chr>            <dbl>          <dbl>                        <dbl>      <dbl>             <dbl>             <dbl>
 1 GE      2014          -1579000      186596000                    261424000  448020000          -0.00495          -0.00652
 2 GE      2015          -1706000      144659000                    197602000  342261000          -0.00535          -0.00704
 3 GE      2016          -2026000      105080000                    136211000  241291000          -0.00636          -0.00836
 4 GE      2017          -2753000      108575000                    134591000  243166000          -0.00864          -0.0114 
 5 GOOG    2014           -101000        2992000                      8015000   11007000          -0.0111           -0.0128 

# First question:
cost_of_debt %>% filter(company == "GE", year == "2017") %>% select(company, year, int_over_totdept4)

# Second question (latest year per company: 2017 for GE and GOOG, 2018 for NVDA):
cost_of_debt %>% group_by(company) %>% filter(year == max(year)) %>% select(company, year, int_over_totdept2)