Given a spouse ID, how do I attach the spouse's data to the subject's data row in Stata?


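I am working with a survey dataset in which each row is a subject/observation. One of the columns holds the ID of the subject's spouse. I would like to add the spouse's education level (i.e. a value stored in the spouse's row) to the other spouse's observation. Any suggestions on how to do this in Stata?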

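I have found that merge, compared with other strategies, is quite efficient, so Maarten's answer is probably the best choice. But there is another way: looping over observations. (By the way, this question or a variant of it has come up at least three times between statalist.org and StackOverflow in less than a week.)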

// create some example data
clear
input id spid educ
      1  2    6
      2  1    12 
      3  6    10
      4  5    13
      5  4    6
end

// create temporary file
tempfile original_data
save `original_data'

// prepare new file for merging
drop id
rename spid id
rename educ speduc

// merge
merge 1:1 id using `original_data'

// admire the result
list

// the master file is actually the file you just created,
// so you probably don't want the observations that come only from master,
// i.e. those with _merge == 1
drop if _merge == 1
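
The looping alternative mentioned above could look roughly like the following. This is only a minimal sketch, not the answerer's original code: it reuses the variable names id, spid and educ from the example data, and it assumes each id appears at most once. It is likely to be slower than merge on large files, since it scans the data once per observation.

```stata
// illustrative sketch (assumed, not from the original answer):
// fill in the spouse's education by looping over observations
clear
input id spid educ
      1  2    6
      2  1    12
      3  6    10
      4  5    13
      5  4    6
end

gen speduc = .
forvalues i = 1/`=_N' {
    // summarize educ over the row(s) whose id equals observation i's spid
    quietly summarize educ if id == spid[`i'], meanonly
    // r(N) is 0 when the spouse is not in the data (spid 6 in this example)
    if r(N) == 1 {
        quietly replace speduc = r(mean) in `i'
    }
}
list
```

Observations whose spouse cannot be found simply keep a missing speduc, which mirrors what the merge version leaves behind for unmatched rows.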

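@Maarten's solution is really neat and efficient. Here is an alternative that does not use merge. The benefit is that it creates an ID variable for each couple.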

clear
// create some data and include cases without matches
input ID SPO_ID EDU
1 5 12
2 4 16
3 . 12
4 2 18
5 1 14
6 . 15
7 9 19
end
// temporary variables: unique IDs for non-missing cases
tempvar x y
egen `x'=group(ID SPO_ID)
egen `y'=group(SPO_ID ID)
egen DYAD=rowmin(`x' `y')   // DYAD=> ID for each couple
sort DYAD
// distinguish within each DYAD; you can use sex/gender if given and all are heterosexual marriages
by DYAD: gen DYAD_ID=_n-1 if !missing(DYAD)
gen SPO_EDU=.
replace SPO_EDU=EDU[_n+1] if (DYAD_ID==0 & DYAD[_n]==DYAD[_n+1])
replace SPO_EDU=EDU[_n-1] if (DYAD_ID==1 & DYAD[_n]==DYAD[_n-1])
list

This is adapted from UCLA's IDRE.
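
The group()-based pairing above depends on every non-missing spouse ID being reciprocated. With data where one person's spouse is not in the file (such as id 3 pointing to spouse 6 in the merge example), unrelated people can end up sharing a DYAD value. One possible way around this, offered as a sketch rather than as the answerer's revision, is to build the couple key from the ordered pair of IDs. The helper variables lo and hi are names introduced here for illustration.

```stata
// illustrative sketch (assumed): couple ID from the ordered pair of IDs
clear
input ID SPO_ID EDU
1 2 6
2 1 12
3 6 10
4 5 13
5 4 6
end

// order each reported pair so both partners yield the same (lo, hi) key;
// min() and max() ignore missing arguments, hence the explicit condition
gen lo = min(ID, SPO_ID) if !missing(SPO_ID)
gen hi = max(ID, SPO_ID) if !missing(SPO_ID)
egen DYAD = group(lo hi)

// copy the partner's education only when the dyad has exactly two members
bysort DYAD (ID): gen SPO_EDU = cond(_n == 1, EDU[2], EDU[1]) if _N == 2 & !missing(DYAD)

drop lo hi
sort ID
list
```

People whose reported spouse is absent, or whose spouse reports someone else, form a one-member dyad and keep a missing SPO_EDU.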

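For future questions, please post your Stata code and explain why it does not work for you, showing that you have spent time and effort researching your problem. Many people consider questions that only ask for code to be off-topic. Also, you can see that there are some questions and answers that have not received any apparent feedback.

@Roberto I see your point and agree with you. I will provide more information about the code and data structure in future questions.

Aspen, something is going on here. Using Maarten's data, the results are different. Apparently this happens when there are unpaired cases with non-missing values on the spouse ID.

Thanks, I will revise the code to account for this scenario.

@Maarten's and Aspen's answers were also helpful, but this one worked best in my case. I will be clearer about the actual data structure in future posts, since some of the problems may come from the fact that my data splits whole families into multiple households (i.e. besides your spouse, you are also grouped with your children), and subject IDs are repeated across households.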
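
If subject IDs repeat across households, as the last comment describes, the ID alone no longer identifies a person, so the merge key would presumably need the household identifier as well. The sketch below adapts the merge approach under that assumption. The variable name hhid and the example values are hypothetical, and it assumes spouses always live in the same household.

```stata
// illustrative sketch (assumed data layout): IDs are unique only within hhid
clear
input hhid id spid educ
1 1 2 6
1 2 1 12
1 3 . 3
2 1 2 13
2 2 1 6
end

tempfile original_data
save `original_data'

// keep only people who report a spouse, then relabel so that
// (hhid, id) identifies the spouse and speduc holds the reporter's education
drop if missing(spid)
drop id
rename spid id
rename educ speduc

// match on household and person ID together
merge 1:1 hhid id using `original_data'

// rows only in master point at spouses who are not in the data
drop if _merge == 1
drop _merge
sort hhid id
list
```

Dropping rows with a missing spid before the merge keeps the key unique even when a household contains several members without a spouse, such as children.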
