R: inferring information in pairs programmatically
I want to analyze some data. My database consists of 1408 observations (704 of type 1 and 704 of type 2) and 49 variables. Here is part of my database. The point is that I want to analyze the gender of the type 1 subjects (sellers) who overcharge.
Data
Subject ID Gender Period Matching group Group Type Overcharging
654 1 1 73 1 1 NA
654 1 2 73 1 1 NA
654 1 3 73 1 1 NA
654 1 4 73 1 1 NA
708 0 1 73 1 2 1
708 0 2 73 1 2 0
708 0 3 73 1 2 0
708 0 4 73 1 2 1
435 1 1 73 2 1 NA
435 1 2 73 2 1 NA
435 1 3 73 2 1 NA
435 1 4 73 2 1 NA
546 0 1 73 2 2 0
546 0 2 73 2 2 0
546 0 3 73 2 2 1
546 0 4 73 2 2 0
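For example, if you look at Matching group = 73, there are two groups (1 and 2), and each group contains two types (1 and 2). For each type 1 subject (seller), we have no information about their behavior (whether they overcharged or not). But we do have information on whether the buyer (type 2) was overcharged.
If I can identify the buyers who were overcharged, it means that the seller who interacted with that buyer overcharged them. So what I need to look at is the gender of the seller who is in the same group as the buyer.
For example, in matching group 73, we know that in period 1 subject 708 (a subject in group 1) was overcharged. Since I know this subject belongs to group 1 of matching group 73, I can identify the seller who overcharged them: subject 654, with Gender = 1.
In group 2 (matching group 73), we know that in period 3 subject 546 was overcharged. Since I know this subject belongs to group 2 of matching group 73, I can identify the seller who overcharged them: subject 435, with Gender = 1.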
....
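I would do this for all of my observations.
However, I really don't know how to code this and set up this condition in R.
This is what I tried, but it does not do what I need: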
for (matchinggroup[type==1]==matchinggroup[type==2] &
group[type==1]==group[type==2] & period[type==1]==period[type==2])
{
if ((overtreatment==1), na.rm=TRUE)
sum(gender==1[type==1], na.rm=TRUE)
}
The expected result I would like to get is:
sum(overcharging==1[gender==1&type==1])
>3
sum(overcharging==1[gender==0&type==1])
>0
sum(overcharging==0[gender==1&type==1])
>5
sum(overcharging==0[gender==0&type==1])
>0
Thank you for your time and consideration! Any help is appreciated.
I'm not entirely sure what your desired output is, but consider this:
Data <- read.table(header = T,
text = "Subject_ID Gender Period Matching_group Group Type Overcharging
654 1 1 73 1 1 NA
654 1 2 73 1 1 NA
654 1 3 73 1 1 NA
654 1 4 73 1 1 NA
708 0 1 73 1 2 1
708 0 2 73 1 2 0
708 0 3 73 1 2 0
708 0 4 73 1 2 1
435 1 1 73 2 1 NA
435 1 2 73 2 1 NA
435 1 3 73 2 1 NA
435 1 4 73 2 1 NA
546 0 1 73 2 2 0
546 0 2 73 2 2 0
546 0 3 73 2 2 1
546 0 4 73 2 2 0
")
dat1 <- subset(Data, Overcharging==1)
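The subset above keeps only the overcharged buyer rows, so the seller's gender still has to be attached to each of them. A minimal base R sketch of that pairing step follows. It assumes that every combination of Matching_group, Group and Period holds exactly one seller and one buyer, as in the sample rows; the names `sellers`, `buyers`, `pairs` and `counts` are introduced here for illustration only.

```r
# Same sample rows as in the question, rebuilt so the sketch runs on its own.
Data <- data.frame(
  Subject_ID     = rep(c(654, 708, 435, 546), each = 4),
  Gender         = rep(c(1, 0, 1, 0), each = 4),
  Period         = rep(1:4, times = 4),
  Matching_group = 73,
  Group          = rep(c(1, 2), each = 8),
  Type           = rep(c(1, 2, 1, 2), each = 4),
  Overcharging   = c(rep(NA, 4), 1, 0, 0, 1, rep(NA, 4), 0, 0, 1, 0)
)

# Sellers (Type 1): keep the matching keys plus ID and gender.
sellers <- subset(Data, Type == 1,
                  select = c(Matching_group, Group, Period, Subject_ID, Gender))
names(sellers)[4:5] <- c("Seller_ID", "Seller_gender")

# Buyers (Type 2): keep the matching keys plus ID and the overcharging flag.
buyers <- subset(Data, Type == 2,
                 select = c(Matching_group, Group, Period, Subject_ID, Overcharging))
names(buyers)[4] <- "Buyer_ID"

# One row per buyer-period, with the seller from the same group attached.
# Assumption: one seller and one buyer per Matching_group/Group/Period.
pairs <- merge(buyers, sellers, by = c("Matching_group", "Group", "Period"))

# Cross-tabulate seller gender against the overcharging flag.
# Explicit factor levels keep the zero-count cells visible.
counts <- table(
  Seller_gender = factor(pairs$Seller_gender, levels = c(0, 1)),
  Overcharging  = factor(pairs$Overcharging, levels = c(0, 1))
)
print(counts)
```

On the sample rows, the `Seller_gender = 1` row of `counts` holds 5 for `Overcharging = 0` and 3 for `Overcharging = 1`, which matches the expected result stated in the question.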
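I don't think a for-loop solution is a good fit for R.
I put together another solution for you with data.table: it separates sellers and buyers and then joins them.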
library(data.table)
Data <- data.table(read.table(header = T,
text = "Subject_ID Gender Period Matching_group Group Type Overcharging
654 1 1 73 1 1 NA
654 1 2 73 1 1 NA
654 1 3 73 1 1 NA
654 1 4 73 1 1 NA
708 0 1 73 1 2 1
708 0 2 73 1 2 0
708 0 3 73 1 2 0
708 0 4 73 1 2 1
435 1 1 73 2 1 NA
435 1 2 73 2 1 NA
435 1 3 73 2 1 NA
435 1 4 73 2 1 NA
546 0 1 73 2 2 0
546 0 2 73 2 2 0
546 0 3 73 2 2 1
546 0 4 73 2 2 0
")
)
Data[, SubjectType := ifelse(Type==1, "Seller", "Buyer")]
Subjects <- unique(Data[, .(Subject_ID, Gender)])
Matches <- dcast(Data, Matching_group+Group~SubjectType, value.var="Subject_ID", fun.aggregate = mean)
Buys <- Data[!is.na(Overcharging), .(Buyer = Subject_ID, BuyerGender = Gender, Period, Matching_group, Group, Overcharging)]
Buys <- merge(Buys, Matches, by=c("Buyer", "Matching_group", "Group"), all.x = T)
Buys <- merge(Buys, Subjects[, .(Seller = Subject_ID, SellerGender = Gender)], by="Seller", all.x = T)
Buys[Overcharging==0, .N, .(BuyerGender, SellerGender)]
Buys[Overcharging==1, .N, .(BuyerGender, SellerGender)]
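One thing that may be worth checking before relying on the code above: `dcast` with `fun.aggregate = mean` averages the Subject_ID values in each cell, so the result is only a real ID when each Matching_group/Group combination contains exactly one seller and one buyer. The sketch below is one possible sanity check, written in base R so that it runs without extra packages; `check` and `ambiguous` are illustrative names, not part of the answer.

```r
# Same sample rows as in the question, rebuilt so the sketch runs on its own.
Data <- data.frame(
  Subject_ID     = rep(c(654, 708, 435, 546), each = 4),
  Gender         = rep(c(1, 0, 1, 0), each = 4),
  Period         = rep(1:4, times = 4),
  Matching_group = 73,
  Group          = rep(c(1, 2), each = 8),
  Type           = rep(c(1, 2, 1, 2), each = 4),
  Overcharging   = c(rep(NA, 4), 1, 0, 0, 1, rep(NA, 4), 0, 0, 1, 0)
)

# Number of distinct subjects for each matching group, group and type.
check <- aggregate(Subject_ID ~ Matching_group + Group + Type, data = Data,
                   FUN = function(x) length(unique(x)))
names(check)[4] <- "n_subjects"
print(check)

# Any row listed here would make the mean of the IDs meaningless.
ambiguous <- check[check$n_subjects > 1, ]
print(nrow(ambiguous))
```

If `ambiguous` is not empty on the full 1408-row data, joining sellers and buyers directly on Matching_group, Group and Period is likely a safer route than averaging IDs.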
This fits the data perfectly! Thank you very much.
Subject_ID Gender Period Matching_group Group Type Overcharging
1 654 1 1 73 1 1 NA
4 654 1 4 73 1 1 NA
11 435 1 3 73 2 1 NA