Stata: How to obtain the sum of Var_c by Var_a and Var_b?


I am trying to find the sum of one variable within groups defined by two other variables.

If I have the following data:

Name   Commodity        Amount_cmdt

Alex       apple           5
Ben        orange          10
Chris      apple           25
Alex       orange          10
Alex       apple           10
Chris      orange          10
Ben        apple            5  
I would like a final dataset that looks like this:

Name   Commodity      Amount_cmdt       total_apple    total_orange

Alex       apple           5                   15              10
Ben        orange          10                  5               10
Chris      apple           25                  25              20
Alex       orange          10                  15              10
Alex       apple           10                  15              10
Chris      orange          10                  25              20
Ben        apple            5                   5              10 
Chris      orange          10                  25              20   
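Eventually, once I have the number of apples and oranges each person has, I can drop the duplicates. But how do I express this statement:

if Name == Chris and Commodity == orange, then total_orange = sum(Amount_cmdt)

I wrote the following, but it sums all apples or all oranges regardless of name: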

foreach var of varlist Name {
    foreach var of varlist Commodity {
        replace total_apple = sum( Amount_cmdt) if Commodity == "apple"
        replace total_orange = sum( Amount_cmdt) if Commodity == "orange"
    }
}

list

Using your toy example:

clear

input strL(name commodity) amount total_apple total_orange
Alex       apple           5                   15              10
Ben        orange          10                  5               10
Chris      apple           25                  25              20
Alex       orange          10                  15              10
Alex       apple           10                  15              10
Chris      orange          10                  25              20
Ben        apple            5                   5              10 
Chris      orange          10                  25              20 
end
The following works for me:

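* Total amount within each name-commodity pair
bysort name commodity: egen totals = total(amount)
* Within name, sorted by commodity: first row is apple, last row is orange
* (this relies on every name having both commodities)
bysort name (commodity): generate totalapple = totals[1]
bysort name (commodity): generate totalorange = totals[_N]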

list name commodity amount total_apple totalapple total_orange totalorange, abbreviate(15)

     +------------------------------------------------------------------------------------+
     |  name   commodity   amount   total_apple   totalapple   total_orange   totalorange |
     |------------------------------------------------------------------------------------|
  1. |  Alex       apple        5            15           15             10            10 |
  2. |  Alex       apple       10            15           15             10            10 |
  3. |  Alex      orange       10            15           15             10            10 |
  4. |   Ben       apple        5             5            5             10            10 |
  5. |   Ben      orange       10             5            5             10            10 |
     |------------------------------------------------------------------------------------|
  6. | Chris       apple       25            25           25             20            20 |
  7. | Chris      orange       10            25           25             20            20 |
  8. | Chris      orange       10            25           25             20            20 |
     +------------------------------------------------------------------------------------+

Edit:

You can generalize this to more than two commodities as follows:

clear

input strL(name commodity) amount 
Alex       apple           5     
Ben        orange          10                 
Chris      apricot         3
Alex       apricot         4
Ben        apricot         2
Chris      apple           25         
Alex       orange          10              
Alex       apple           10         
Chris      orange          10          
Ben        apple            5             
Chris      apricot         15
Alex       apricot         6
Chris      orange          10                
end

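* Total amount within each name-commodity pair
bysort name commodity: egen totals = total(amount)
* Numeric id per commodity, numbered in alphabetical order
egen commodities = group(commodity)

* Distinct commodity names, in the same alphabetical order
levelsof commodity, local(allcommodities) clean
local i 0

foreach var of local allcommodities {
    local ++i
    * Copy the pair total onto rows of the i-th commodity, missing elsewhere
    generate `var' = .
    bysort name (commodity): replace `var' = totals if commodities == `i'
    * Spread that total to every row of the same name
    bysort name (commodity): egen total`var' = min(`var')
    drop `var'
}

drop commodities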
The revised code snippet produces the desired output:

list name commodity amount total*, abbreviate(15)

     +-------------------------------------------------------------------------------+
     |  name   commodity   amount   totals   totalapple   totalapricot   totalorange |
     |-------------------------------------------------------------------------------|
  1. |  Alex       apple        5       15           15             10            10 |
  2. |  Alex       apple       10       15           15             10            10 |
  3. |  Alex     apricot        6       10           15             10            10 |
  4. |  Alex     apricot        4       10           15             10            10 |
  5. |  Alex      orange       10       10           15             10            10 |
     |-------------------------------------------------------------------------------|
  6. |   Ben       apple        5        5            5              2            10 |
  7. |   Ben     apricot        2        2            5              2            10 |
  8. |   Ben      orange       10       10            5              2            10 |
  9. | Chris       apple       25       25           25             18            20 |
 10. | Chris     apricot        3       18           25             18            20 |
     |-------------------------------------------------------------------------------|
 11. | Chris     apricot       15       18           25             18            20 |
 12. | Chris      orange       10       20           25             18            20 |
 13. | Chris      orange       10       20           25             18            20 |
     +-------------------------------------------------------------------------------+

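@Pearly Spencer gives you what you asked for, but the detailed code does show that this is a very awkward data structure, and I predict it will be very hard to work with.

Moreover, you don't need to compute repeated values and then drop duplicates, because you can get a simpler structure directly.

Note that this destroys the original dataset, so keeping a copy of the original is always a good idea. Also, we can't comment on what other variables you might have.

Either or both of these layouts may be as useful or more useful: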

clear

input str6 (name commodity) amount 
Alex       apple           5      
Ben        orange          10     
Chris      apple           25     
Alex       orange          10     
Alex       apple           10     
Chris      orange          10     
Ben        apple            5     
Chris      orange          10     
end

collapse (sum) amount, by(name commodity) 

list, sepby(name) 

     +---------------------------+
     |  name   commod~y   amount |
     |---------------------------|
  1. |  Alex      apple       15 |
  2. |  Alex     orange       10 |
     |---------------------------|
  3. |   Ben      apple        5 |
  4. |   Ben     orange       10 |
     |---------------------------|
  5. | Chris      apple       25 |
  6. | Chris     orange       20 |
     +---------------------------+

reshape wide amount, i(name) j(commodity) string 

list 

     +-----------------------------+
     |  name   amoun~le   amoun~ge |
     |-----------------------------|
  1. |  Alex         15         10 |
  2. |   Ben          5         10 |
  3. | Chris         25         20 |
     +-----------------------------+
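
One way to act on the warning about collapse destroying the original data is to build the wide totals on a temporary copy and merge them back by name. The following is only a sketch under assumptions: the tempfile name totals is made up for illustration, the data are the toy example from the question, and the code is meant to be run as a whole from a do-file so that the local macro holding the tempfile name survives.

```stata
clear
input str6 name str6 commodity amount
Alex   apple   5
Ben    orange  10
Chris  apple   25
Alex   orange  10
Alex   apple   10
Chris  orange  10
Ben    apple   5
Chris  orange  10
end

* Build the wide table of totals on a temporary copy of the data
preserve
collapse (sum) amount, by(name commodity)
reshape wide amount, i(name) j(commodity) string
tempfile totals
save `totals'
restore

* Back in the original data: attach amountapple and amountorange by name
merge m:1 name using `totals', nogenerate

sort name commodity
list, sepby(name)
```

This keeps all eight original rows and adds amountapple and amountorange, which is the layout asked for in the question, without a loop. Whether it is preferable to the collapsed layouts depends on what other variables the real dataset has.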

You may be confusing sum(), which yields a cumulative or running sum, with a function for totals. To learn about loops in Stata, look up an introduction and repeat its examples. In your tentative code, you seem to think that foreach will provoke a loop over the distinct values of a variable, but that is not at all how it works. Each of your loops is a loop over one item, which happens to be a name, and is executed at most once.

This is helpful and I understand the appeal, but the OP will have a hard time working with the full dataset, since it will be destroyed by collapse.

I added a warning. I am tempted to cite your principle that we can't (easily) advise on what the OP doesn't tell us or doesn't ask about!
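
To make the two points from the comments concrete, here is a small sketch on a cut-down version of the toy data. The variable names running and grouptotal and the local macro names are made up for illustration. It contrasts sum() used with generate (a running sum) with total() used with egen (a group total), and shows that looping over the distinct values of a variable needs something like levelsof to collect those values first.

```stata
clear
input str6 name str6 commodity amount
Alex   apple   5
Alex   apple   10
Alex   orange  10
Ben    apple   5
end

* generate with sum(): cumulative sum down the rows in their current order
generate running = sum(amount)

* egen with total(): one total per group, repeated on each row of the group
bysort name commodity: egen grouptotal = total(amount)

list, sepby(name)

* foreach does not walk over the values of a variable by itself;
* levelsof first collects the distinct values into a local macro
levelsof name, local(names)
foreach n of local names {
    quietly summarize amount if name == "`n'"
    display "`n': " r(sum)
}
```

In the question's code, foreach var of varlist Name loops over the single variable name Name, not over Alex, Ben and Chris, which is why the totals ignored the name. For this task the by: prefix with egen is usually simpler than any explicit loop.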