R: inferring information pairwise programmatically

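I want to analyze my data.

My dataset consists of 1408 observations (704 of type 1 and 704 of type 2) and 49 variables. Here is part of it.

What I want to analyze is the gender of the type 1 subjects (sellers) who overcharge.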

 Data
 Subject ID  Gender   Period   Matching group   Group    Type  Overcharging
   654        1           1            73         1        1      NA
   654        1           2            73         1        1      NA
   654        1           3            73         1        1      NA
   654        1           4            73         1        1      NA 
   708        0           1            73         1        2       1
   708        0           2            73         1        2       0
   708        0           3            73         1        2       0
   708        0           4            73         1        2       1
   435        1           1            73         2        1      NA
   435        1           2            73         2        1      NA
   435        1           3            73         2        1      NA
   435        1           4            73         2        1      NA    
   546        0           1            73         2        2       0
   546        0           2            73         2        2       0
   546        0           3            73         2        2       1
   546        0           4            73         2        2       0
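For example, if you look at matching group 73, there are two groups (1 and 2), and each group contains two types (1 and 2). For each type 1 subject (seller), we have no information about their behavior (whether they overcharged or not). But we do have information on whether the buyer (type 2) was overcharged.

If I can identify the buyers who were overcharged, then the seller who interacted with that buyer must have overcharged them. So what I need to look at is the gender of the seller in the same group as the buyer.

For example, in matching group 73 we know that subject 708 (in group 1) was overcharged in period 1. Since this subject belongs to group 1 and matching group 73, I can identify the seller who overcharged him: subject 654, with gender = 1.

In group 2 (matching group 73), we know that subject 546 was overcharged in period 3. Since this subject belongs to group 2 and matching group 73, I can identify the seller who overcharged him: subject 435, with gender = 1. ... and so on, across all of my observations.

However, I really don't know how to code this condition in R.

Here is what I tried, but it does not do what I need: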

  for (matchinggroup[type==1]==matchinggroup[type==2] & 
group[type==1]==group[type==2] & period[type==1]==period[type==2])
  {
    if ((overtreatment==1), na.rm=TRUE)
sum(gender==1[type==1], na.rm=TRUE)
  }
The result I expect to get is:

    sum(overcharging==1[gender==1&type==1])
    >3
    sum(overcharging==1[gender==0&type==1])
    >0
    sum(overcharging==0[gender==1&type==1])
    >5
    sum(overcharging==0[gender==0&type==1])
    >0

Thank you for your time and consideration! Any help is appreciated.

I'm not entirely sure what output you want, but consider this:

Data <- read.table(header = TRUE, 
                   text = "Subject_ID  Gender   Period   Matching_group   Group    Type  Overcharging
654        1           1            73         1        1      NA
654        1           2            73         1        1      NA
654        1           3            73         1        1      NA
654        1           4            73         1        1      NA 
708        0           1            73         1        2       1
708        0           2            73         1        2       0
708        0           3            73         1        2       0
708        0           4            73         1        2       1
435        1           1            73         2        1      NA
435        1           2            73         2        1      NA
435        1           3            73         2        1      NA
435        1           4            73         2        1      NA    
546        0           1            73         2        2       0
546        0           2            73         2        2       0
546        0           3            73         2        2       1
546        0           4            73         2        2       0
")

dat1 <- subset(Data, Overcharging==1)
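The snippet above stops once the overcharged buyer rows are selected. One possible way to carry on in base R, offered as a sketch and not necessarily what the answer intended, is to build a key from `Matching_group`, `Group` and `Period` and keep the seller rows that share a key with an overcharged buyer. The helper `pair_key` and the object `overcharging_sellers` are hypothetical names introduced here, and the sample data is rebuilt compactly so the block runs on its own.

```r
# Rebuild the 16 sample rows from the question in compact form.
Data <- data.frame(
  Subject_ID     = rep(c(654, 708, 435, 546), each = 4),
  Gender         = rep(c(1, 0, 1, 0), each = 4),
  Period         = rep(1:4, times = 4),
  Matching_group = 73,
  Group          = rep(c(1, 2), each = 8),
  Type           = rep(c(1, 2, 1, 2), each = 4),
  Overcharging   = c(rep(NA, 4), 1, 0, 0, 1, rep(NA, 4), 0, 0, 1, 0)
)

# Buyer rows where overcharging was observed (subset() drops the NA rows).
dat1 <- subset(Data, Overcharging == 1)

# Hypothetical helper: one string key per interaction
# (matching group, group, period).
pair_key <- function(d) paste(d$Matching_group, d$Group, d$Period)

# Seller rows (Type 1) that share an interaction with an overcharged buyer.
overcharging_sellers <- Data[Data$Type == 1 & pair_key(Data) %in% pair_key(dat1), ]
print(overcharging_sellers)

# Number of overcharging interactions by seller gender;
# the factor levels keep a zero count for Gender == 0.
seller_counts <- table(factor(overcharging_sellers$Gender, levels = 0:1))
print(seller_counts)
```

On the sample data this keeps rows 1, 4 and 11 (subject 654 in periods 1 and 4, subject 435 in period 3), so the count for sellers with `Gender == 1` is 3 and for `Gender == 0` is 0, in line with the expected result stated in the question.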
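I don't think a for-loop solution is a good fit for R.

I worked out another solution for you with data.table: split the sellers and the buyers apart, then join them.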

library(data.table)
Data <- data.table(read.table(header = TRUE, 
                   text = "Subject_ID  Gender   Period   Matching_group   Group    Type  Overcharging
                   654        1           1            73         1        1      NA
                   654        1           2            73         1        1      NA
                   654        1           3            73         1        1      NA
                   654        1           4            73         1        1      NA 
                   708        0           1            73         1        2       1
                   708        0           2            73         1        2       0
                   708        0           3            73         1        2       0
                   708        0           4            73         1        2       1
                   435        1           1            73         2        1      NA
                   435        1           2            73         2        1      NA
                   435        1           3            73         2        1      NA
                   435        1           4            73         2        1      NA    
                   546        0           1            73         2        2       0
                   546        0           2            73         2        2       0
                   546        0           3            73         2        2       1
                   546        0           4            73         2        2       0
                   ")
)

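# Label each row as a seller (Type 1) or a buyer (Type 2)
Data[, SubjectType := ifelse(Type==1, "Seller", "Buyer")]
# One row per subject with its gender
Subjects <- unique(Data[, .(Subject_ID, Gender)])
# One row per (Matching_group, Group) with the Buyer and Seller IDs as columns;
# mean() simply collapses the repeated ID of each subject to a single value
Matches <- dcast(Data, Matching_group+Group~SubjectType, value.var="Subject_ID", fun.aggregate = mean)

# Buyer rows only (Overcharging is NA for sellers)
Buys <- Data[!is.na(Overcharging), .(Buyer = Subject_ID, BuyerGender = Gender, Period, Matching_group, Group, Overcharging)]
# Attach the matched seller's ID, then the seller's gender
Buys <- merge(Buys, Matches, by=c("Buyer", "Matching_group", "Group"), all.x = TRUE)
Buys <- merge(Buys, Subjects[, .(Seller = Subject_ID, SellerGender = Gender)], by="Seller", all.x = TRUE)

# Count observations by buyer and seller gender, without and with overcharging
Buys[Overcharging==0, .N, .(BuyerGender, SellerGender)]
Buys[Overcharging==1, .N, .(BuyerGender, SellerGender)]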
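The two counts above are broken down by both buyer and seller gender, while the question asks for counts by seller gender only. A minimal variation, assuming every (Matching_group, Group, Period) combination contains exactly one seller and one buyer as in the sample, joins the seller and buyer rows directly on those three columns and groups by seller gender and outcome. The names `Sellers`, `Buyers`, `Pairs` and `Counts` are introduced here for illustration, and the sample data is rebuilt compactly so the block is self-contained.

```r
library(data.table)

# Rebuild the 16 sample rows from the question in compact form.
Data <- data.table(
  Subject_ID     = rep(c(654, 708, 435, 546), each = 4),
  Gender         = rep(c(1, 0, 1, 0), each = 4),
  Period         = rep(1:4, times = 4),
  Matching_group = 73,
  Group          = rep(c(1, 2), each = 8),
  Type           = rep(c(1, 2, 1, 2), each = 4),
  Overcharging   = c(rep(NA, 4), 1, 0, 0, 1, rep(NA, 4), 0, 0, 1, 0)
)

# Columns that identify a single seller-buyer interaction.
keys <- c("Matching_group", "Group", "Period")

# Seller side: who sold, and the seller's gender.
Sellers <- Data[Type == 1, .(Matching_group, Group, Period,
                             Seller = Subject_ID, SellerGender = Gender)]
# Buyer side: who bought, and whether they were overcharged.
Buyers <- Data[Type == 2, .(Matching_group, Group, Period,
                            Buyer = Subject_ID, Overcharging)]

# One row per interaction, carrying the seller's gender and the outcome.
Pairs <- merge(Buyers, Sellers, by = keys)

# Number of interactions per seller gender and outcome.
Counts <- Pairs[, .N, by = .(SellerGender, Overcharging)]
print(Counts)
```

On the sample data this yields 3 interactions with overcharging and 5 without, all for sellers with `SellerGender == 1`. Combinations that never occur (here, sellers with gender 0) do not appear as rows at all, so a zero count has to be read as "missing from the result".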
It fits the data perfectly! Thank you very much.

Seller rows that share Period, Matching_group and Group with an overcharged buyer:

    Subject_ID Gender Period Matching_group Group Type Overcharging
1         654      1      1             73     1    1           NA
4         654      1      4             73     1    1           NA
11        435      1      3             73     2    1           NA