Finding matches among previously occurring values (loops, Stata)


I have the following dataset:

time  person1_person2    person2_person1   occurrence   cell_count
  1        A_B                B_A               0           1
  2        A_C                C_A               0           2
  3        B_A                A_B               1           3
  4        E_A                A_E               0           4
  5        C_A                A_C               1           5
  6        E_A                A_E               0           6
  7        A_B                B_A               1           7

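In Stata, I am trying to create the occurrence variable. It takes the value 1 if the current value of person1_person2 appeared earlier in person2_person1. For example, it takes the value 0 at time=4 and time=6, because E_A has not appeared earlier in the person2_person1 field.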

I tried the following, without luck:

gen occurrence = 0 
local i = cell_count-1
foreach j in `i' {
replace occurrence = 1 if person1_person2 == person2_person1[_n-`j']
}

As you guess, one way to do this is with a loop:

clear 
input time  str3 person1_person2   str3 person2_person1   
1        A_B                B_A               
2        A_C                C_A               
3        B_A                A_B               
4        E_A                A_E               
5        C_A                A_C               
6        E_A                A_E               
7        A_B                B_A               
end 

gen occurrence = 0 

qui forval i = 2/`=_N' { 
    local I = `i' - 1 
    count if person2_person1 == person1_person2[`i'] in 1/`I' 
    if r(N) replace occurrence = 1 in `i' 
}

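if r(N) is equivalent to if r(N) > 0: r(N) being true (nonzero) and r(N) being positive are the same condition, because a count can never be negative.

r(N) is the result that count leaves in memory. For tutorials on count, see the references linked in the original answer.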
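To make the point about r(N) concrete, here is a minimal sketch on a made-up three-row dataset (the variable s and its values are assumptions for illustration only, not part of the question's data). It shows count leaving its result in r(N), and the single-line if firing only when that result is nonzero.

```stata
* Toy data, purely illustrative
clear
input str3 s
"A_B"
"B_A"
"A_B"
end

* -count- stores the number of matching observations in r(N)
count if s == "A_B"
* Nonzero r(N) is treated as true, so this line runs (two rows match)
if r(N) display "at least one match: " r(N)

* No row matches here, so r(N) is 0 and -if r(N)- would be false
count if s == "Z_Z"
if r(N) == 0 display "no match"
```

Writing if r(N) > 0 instead would behave identically; the shorter form is a matter of taste.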

Your code includes the following lines:

local i = cell_count-1
foreach j in `i' {

The first will be evaluated as

local i = cell_count[1] - 1

which yields 0, so your loop is just

foreach j in 0 {

and that in turn is just the single line

replace occurrence = 1 if person1_person2 == person2_person1[_n]

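or, equivalently, the same line without the [_n] subscript. It tests for equality within the same observation. What you need is not luck, but logic.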
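For comparison, here is a sketch of how the lag-based idea from the question could be made to work. It is not the method used in the answer above; it assumes the data are sorted by time and that the variable names are those of the question. The loop runs over every possible lag j = 1, ..., _N - 1 rather than over a single value taken from the first observation.

```stata
clear
input time str3 person1_person2 str3 person2_person1
1 A_B B_A
2 A_C C_A
3 B_A A_B
4 E_A A_E
5 C_A A_C
6 E_A A_E
7 A_B B_A
end

sort time
gen occurrence = 0

* Compare each row with the row j observations earlier, for every lag j.
* An out-of-range subscript such as person2_person1[0] yields "",
* which never equals a nonempty person1_person2.
quietly forvalues j = 1/`=_N - 1' {
    replace occurrence = 1 if person1_person2 == person2_person1[_n - `j']
}

list time person1_person2 person2_person1 occurrence
```

On this dataset occurrence is 1 at times 3, 5 and 7 and 0 elsewhere, matching the table in the question. Each pass touches the whole dataset, so this does roughly the same amount of work as the observation-by-observation loop.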

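Admittedly, this is not as simple as @Nick Cox's solution, but the general idea is fairly simple: record the time at which each value of p2_p1 first occurs, then compare it with the time of the current value of p1_p2.

Note that there is no explicit loop here, which is something I have been exploring. That does not necessarily mean it is more efficient.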

clear all
set more off

clear 
input time  str3 p1_p2   str3 p2_p1   
1        A_B                B_A               
2        A_C                C_A               
3        B_A                A_B               
4        E_A                A_E               
5        C_A                A_C               
6        E_A                A_E               
7        A_B                B_A     
8        P_M                M_P             
9        A_B                B_A
end 

list

tempfile main
save "`main'"

* Create auxiliary data of first occurrences
bysort p2_p1 (time): gen firstflag = (_n == 1) // flag if first occurrence
drop if firstflag == 0 // drop if not first occurrence
drop firstflag p1_p2 // drop unnecessary variables
rename time firsttime // rename accordingly
rename p2_p1 p1_p2 // needed for -merge-

tempfile aux
save "`aux'"

* Merge main data with auxiliary
use "`main'", clear
merge m:1 p1_p2 using "`aux'", keep(master match)

* Compute variable of interest
gen ocurr = (firsttime < time)

* List
drop firsttime _merge
sort time
list
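One detail the final comparison relies on: rows of the main data with no match in the auxiliary file get a missing firsttime, and Stata treats numeric missing as larger than any nonmissing number, so (firsttime < time) evaluates to 0 for them. The sketch below isolates that behavior on three hand-typed rows (the values are illustrative assumptions, not output of the merge).

```stata
* Toy rows: a match that came earlier, no match at all, a match that comes later
clear
input time firsttime
3 1
4 .
1 3
end

* Missing firsttime is never less than time, so unmatched rows get 0
gen ocurr = (firsttime < time)
list
```

Only the first row gets ocurr equal to 1, so no separate step is needed to set unmatched rows to 0.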


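Comments:

On Stack Overflow, "thanks" is best expressed by accepting and/or upvoting an answer (note that these are two different things). It is not mandatory, but it is the idea here.

@NickCox; but user1911813 has made it clear that his problem is solved. This is to remind him that he can "accept" rather than post a "thanks only" comment. We are asked not to do that.

I know the advice, and it is good. My personal view is (still) that a "thank you" beats no response, especially when it carries a signal that the advice was good.

I agree, @NickCox. In this case, something is better than nothing. I also think that, given Stack Overflow's goals, there is room for an easy improvement. "Saying thanks" is an improvement over nothing, but "accepting" is an improvement over "saying thanks", however objective that may be. In any case, I would want to be respectfully reminded or informed of opportunities to improve. I believe that, on the whole, this is a good thing.

My aim, absolutely. We are in full agreement.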
