Stata: Reshape data from long to wide for analysis


I have data in the following form:

firm  month_year  sales  competitor  competitor_location  competitor_branch_1  competitor_branch_2
1     1_2014      25     XYZ         US                   EEE                  RRR
1     2_2014      21     XYZ         US                   FFF
1     2_2014      21     ABC         UK                   GGG
...
21    1_2009      11     LKS         UK                   AAA
21    1_2009      11     AIS         UK                   BBB
21    1_2009      11     AJS         US                   CCC
21    2_2009      12     LKS         UK                   AAA

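I still want one entry per firm at the month_year level, but I do not want separate rows for the other variables, only columns. I am trying to convert the data to this format: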

firm    month_year    sales    competitor_1   competitor_2     competitor_3        competitor_1_location     competitor_2_location     competitor_3_location            competitor_1_branch_1        competitor_2_branch_1           competitor_3_branch_1           competitor_1_branch_2        competitor_2_branch_2           competitor_3_branch_2                  competitor_1_branch_3        competitor_2_branch_3           competitor_3_branch_3

I think the command should be

reshape wide sales competitor competitor_location competitor_branch_1 competitor_branch_2, i(firm) j(month_year)

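Most of the code just sets up the example data (however inefficiently). I think the encode steps are not necessary, but they are recommended.

The code gives only one observation per firm (as I mentioned in a comment).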

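As you have stated it, this is not possible. Your j() variable must uniquely identify observations within firm groups. For example, firm 21 has multiple observations on the same date, so reshape will not work.

Your example implies that you are trying to fill one "cell" of the data matrix with multiple observations. Stata will not accept that. Do the exercise of reshaping "by hand" and you will see the difficulty. You have not filled in any values in your desired format; the exercise serves exactly that purpose. More simply, you can practice with the very small datasets in help reshape.

@RobertoFerrer Thanks. I have successfully done the reshape with simpler examples. What do you suggest for this case?

The problem is that the reshape as you state it is not possible. You can do it so that there is only one observation per firm (not per firm-date). But the layout of the data and its usefulness is something you must decide based on your goals. I will post an example with what I mentioned. Usually, I work with data in long format unless the estimation method forces me to do otherwise.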
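The uniqueness requirement discussed in the comments can be checked directly before attempting a reshape. The following is a minimal sketch, not part of the original thread. It re-enters a reduced version of the example data (only four of the variables) and uses duplicates report, isid, and reshape error to show where firm and month_year fail to identify a single row.

```stata
clear
input firm str7 month_year sales str3 competitor
 1 "1_2014" 25 "XYZ"
 1 "2_2014" 21 "XYZ"
 1 "2_2014" 21 "ABC"
21 "1_2009" 11 "LKS"
21 "1_2009" 11 "AIS"
21 "1_2009" 11 "AJS"
21 "2_2009" 12 "LKS"
end

* Count how many rows share the same firm and month_year
duplicates report firm month_year

* isid exits with an error when the variables do not uniquely identify rows;
* capture lets the do-file continue and _rc holds the return code
capture isid firm month_year
display "isid return code: " _rc

* The reshape from the question fails for the same reason
* (the string option is needed because month_year is a string variable)
capture noisily reshape wide sales competitor, i(firm) j(month_year) string

* List the observations that caused the failure
reshape error
```

A nonzero return code from isid means that some firm and month_year pair occurs more than once, which is exactly the situation reshape refuses to handle.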
clear all
set more off

*----- example data -----

input ///
firm    str7 month_year    sales    str3 competitor   str3 competitor_location  str3 competitor_branch_1       str3 competitor_branch_2
  1       "1_2014"    25          "XYZ"            "US"                     "EEE"                      "RRR"
  1       "2_2014"    21          "XYZ"            "US"                    "FFF"
  1       "2_2014"    21          "ABC"            "UK"                     "GGG"
  21     "1_2009"    11          "LKS"           "UK"                      "AAA"
  21     "1_2009"    11          "AIS"            "UK"                      "BBB"
  21     "1_2009"    11          "AJS"            "US"                      "CCC"
  21     "2_2009"    12          "LKS"            "UK"                      "AAA"
end

* encode the string variables as labeled numeric variables
encode competitor, gen(comp)
encode competitor_location, gen(comploc)
encode competitor_branch_1, gen(compbr1)
encode competitor_branch_2, gen(compbr2)

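* split month_year at the underscore so that two-digit months also parse
gen date = ym( real(substr(month_year, strpos(month_year,"_")+1, .)), real(substr(month_year, 1, strpos(month_year,"_")-1)) )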
format date %tm

drop competitor* month*

list

*----- what you want ?? -----

bysort firm: gen j = _n // this sorting is not unique

reshape wide date sales comp comploc compbr1 compbr2, i(firm) j(j)
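
If the goal is to keep one row per firm and month, as in the question, one possible sketch (not part of the original answer) is to number the competitors within each firm-month pair and use that counter as j(), with firm and month_year together as i(). The counter k and the shortened stub names comp_, comploc_, compbr1_ and compbr2_ are illustrative choices. The sketch assumes that sales is constant within each firm-month pair, as it is in the example data, and it starts again from the original string variables.

```stata
clear
input firm str7 month_year sales str3 competitor ///
      str3 competitor_location str3 competitor_branch_1 str3 competitor_branch_2
 1 "1_2014" 25 "XYZ" "US" "EEE" "RRR"
 1 "2_2014" 21 "XYZ" "US" "FFF" ""
 1 "2_2014" 21 "ABC" "UK" "GGG" ""
21 "1_2009" 11 "LKS" "UK" "AAA" ""
21 "1_2009" 11 "AIS" "UK" "BBB" ""
21 "1_2009" 11 "AJS" "US" "CCC" ""
21 "2_2009" 12 "LKS" "UK" "AAA" ""
end

* Shorten the names so the wide variables read comp_1, comploc_1, ...
rename competitor          comp_
rename competitor_location comploc_
rename competitor_branch_1 compbr1_
rename competitor_branch_2 compbr2_

* Number the competitors within each firm-month pair (k is a hypothetical helper)
bysort firm month_year (comp_): gen k = _n

* One row per firm-month; sales is not listed, so it must be constant within i()
reshape wide comp_ comploc_ compbr1_ compbr2_, i(firm month_year) j(k)

list, noobs
```

With the example data this produces three sets of competitor columns, because firm 21 has three competitors in 1_2009; slots that a firm-month pair does not use are left as empty strings. Whether this layout is more useful than the long format still depends on the analysis, as noted in the comments.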