How to calculate network measures separately for different time periods using igraph in R?
This is my transaction data:
data:
id from_id to_id amount date_trx
<fctr> <fctr> <fctr> <dbl> <date>
0 7468 5695 700.0 2005-01-04
1 6213 9379 11832.0 2005-01-08
2 7517 8170 1000.0 2005-01-10
3 6143 9845 4276.0 2005-01-12
4 6254 9640 200.0 2005-01-14
5 6669 5815 200.0 2005-01-20
6 6934 8583 49752.0 2005-01-24
7 9240 8314 19961.0 2005-01-26
8 6374 8865 1000.0 2005-01-30
9 6143 6530 13.4 2005-01-31
...
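But now I want to compute these measures periodically. For example, I want to split my data into weeks (counting from the first transaction date) and compute the network measures for each account in the corresponding week.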
library(dplyr)
# Week index counted from the first transaction date in the whole data set
data$week <- unsplit(tapply(data$date_trx, data$from_id, function(x) (as.numeric(x-min(data$date_trx)) %/% 7)+1),data$from_id)
select(data, from_id, to_id, date_trx, week, amount) %>% arrange(date_trx)
from_id to_id date_trx week amount
<fctr> <fctr> <date> <dbl> <dbl>
6644 6934 2005-01-01 1 700
6753 8456 2005-01-01 1 600
9242 9333 2005-01-01 1 1000
9843 9115 2005-01-01 1 900
7075 6510 2005-01-02 1 400
8685 7207 2005-01-02 1 1100
... ... ... ... ...
9866 6697 2010-12-31 313 95.8
9866 5992 2010-12-31 313 139.1
9866 5797 2010-12-31 313 72.1
9866 9736 2010-12-31 313 278.9
9868 8644 2010-12-31 313 242.8
9869 8399 2010-12-31 313 372.2
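The week index above depends only on the distance from the earliest transaction date, so it can also be computed without grouping by account. A minimal base R sketch, assuming a plain vector of dates (the values below are illustrative, chosen to span the date range shown in the output above):

```r
# Illustrative dates only; the first and last match the range printed above
date_trx <- as.Date(c("2005-01-01", "2005-01-07", "2005-01-08", "2010-12-31"))

# Whole days elapsed since the earliest transaction
days_since_start <- as.integer(date_trx - min(date_trx))

# Group days into consecutive 7-day bins, numbering the first bin as week 1
week <- days_since_start %/% 7L + 1L
```

With this scheme, weeks are counted from the first transaction rather than from calendar week boundaries, which is what the question asks for.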
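Now that the data is divided into weekly periods, I need to build the account network for each week separately, so that I can compute the network measures of the accounts per week. How can I do this for all 313 weeks at once?
One possibility is to split the data by week, convert each week into an igraph object, and then use lapply to add the centralities and degree to all graphs at once. My initial data.frame is named d (see below). First, add the week: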
d$week <- strftime(d$date_trx, format = "%V")
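One caveat worth keeping in mind: `%V` returns the ISO 8601 week of the year (01 to 53), so the same label recurs every year. For data covering several years, as in the question (2005 to 2010), a year-aware key may be needed to keep weeks from different years apart. A minimal sketch, assuming the ISO week-based year from `%G` is acceptable for grouping (the dates are illustrative, and output support for `%G` may depend on the platform):

```r
# Illustrative dates one year apart
dates <- as.Date(c("2005-01-04", "2006-01-04"))

# Week of the year only: both dates receive the same label
week_only <- strftime(dates, format = "%V")

# ISO year plus ISO week: the labels stay distinct across years
year_week <- strftime(dates, format = "%G-W%V")
```

Splitting on a key like `year_week` would then give one graph per calendar week rather than one per week number.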
Write a function that performs all the operations you need, then apply it to each graph:
my.funct <- function(x) {
V(x)$degree <- degree(x, normalized=TRUE)
V(x)$betweenness <- betweenness(x, normalized=TRUE)
V(x)$closeness <- closeness(x, normalized=TRUE)
return(x)
}
dd <- lapply(dd, my.funct)
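Closeness is only well defined on connected graphs, and short time windows often produce graphs that fall apart into several pieces. It may help to check connectivity per week before interpreting the values. A minimal sketch on toy edge lists (hypothetical values; the first week mirrors the two-edge graph B->F, C->G printed for dd[[1]]):

```r
library(igraph)

# Toy weekly edge lists (hypothetical, not the question's data)
weekly <- list(
  "01" = data.frame(from_id = c("B", "C"), to_id = c("F", "G")),
  "02" = data.frame(from_id = c("A", "B"), to_id = c("B", "C"))
)
graphs <- lapply(weekly, graph_from_data_frame, directed = TRUE)

# TRUE when a week's graph forms a single weakly connected piece
connected <- vapply(graphs, is_connected, logical(1), mode = "weak")

# Number of weakly connected components per week
n_comp <- vapply(graphs, function(g) components(g, mode = "weak")$no, numeric(1))
```

Weeks flagged as disconnected are the ones where closeness values should be read with caution.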
Then you can retrieve all centralities and degrees for all weeks:
ddd <- lapply(dd, function(x) igraph::as_data_frame(x, what = "vertices") )
# keep in mind that `split` names the objects in the list according to
# the value it used to split, therefore the name of the data.frames in
# the list is the name of the week.
library(dplyr)
ddd <- bind_rows(ddd, .id="week")
head(ddd)
week name degree betweenness closeness
1 01 E 1.4444444 0 0.2000000
2 01 D 1.5555556 0 0.1666667
3 01 B 0.7777778 0 0.2000000
4 01 A 1.0000000 0 0.2000000
5 01 C 0.7777778 0 0.1666667
6 01 F 1.0000000 0 0.1000000
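Note that running my.funct may return some warnings. This is due to a suboptimal choice of dummy data. For example, closeness centrality is not well defined for disconnected graphs.
Great, thank you! Now, after combining the network measures into a data frame, I need to add these measures to the data as features. That is, network_data ...
Yes, using as_data_frame and lapply should work. Added it to the answer.
Hi @desval, I just posted a question on this topic, could you take a look?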
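For the follow-up about attaching the measures to the transactions as features, one option is a keyed join on week and account. A minimal sketch in base R, assuming toy stand-ins for `d` and `ddd` (the names `d_toy`, `ddd_toy`, and `d_with_measures` and all values are hypothetical):

```r
# Toy stand-in for the transaction data
d_toy <- data.frame(
  from_id = c("A", "B", "A"),
  to_id   = c("F", "G", "G"),
  week    = c("01", "01", "02"),
  stringsAsFactors = FALSE
)

# Toy stand-in for the combined vertex table (one row per account per week)
ddd_toy <- data.frame(
  week   = c("01", "01", "02"),
  name   = c("A", "B", "A"),
  degree = c(0.5, 0.5, 1.0),
  stringsAsFactors = FALSE
)

# Left join: attach the sender's measures for the week of each transaction
d_with_measures <- merge(
  d_toy, ddd_toy,
  by.x = c("from_id", "week"),
  by.y = c("name", "week"),
  all.x = TRUE
)
```

The same join keyed on `to_id` would attach the receiver's measures instead.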
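# Split the data by week, then turn each weekly data.frame into a directed graph.
# Run these two steps before applying my.funct.
dd <- split(d, d$week)
dd <- lapply(dd, function(x) graph_from_data_frame(x, directed = TRUE))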
dd[[1]]
IGRAPH f515e52 DN-- 4 2 --
+ attr: name (v/c), degree (v/n), betweenness (v/n), closeness (v/n), weight (e/n), date_trx
| (e/n), week (e/c)
+ edges from f515e52 (vertex names):
[1] B->F C->G
vertex_attr(dd[[1]])
$name
[1] "B" "C" "F" "G"
$degree
[1] 0.3333333 0.3333333 0.3333333 0.3333333
$betweenness
[1] 0 0 0 0
$closeness
[1] 0.3333333 0.3333333 0.2500000 0.2500000
edge_attr(dd[[1]])
$weight
[1] 9 7
$date_trx
[1] 10595 10601
$week
[1] "01" "01"
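Two details of this setup can change the results. First, graph_from_data_frame reads the first two columns as the edge list, so a leading id column (as in the question's data) has to be dropped or moved. Second, when an edge attribute named weight exists, betweenness and closeness use it by default and treat it as a distance, so larger amounts would count as longer paths. A minimal sketch on toy transactions (hypothetical values):

```r
library(igraph)

# Toy transactions with a leading id column, as in the question's data
trx <- data.frame(
  id      = 0:2,
  from_id = c("A", "B", "A"),
  to_id   = c("B", "C", "C"),
  weight  = c(1, 1, 5)
)

# Put from_id and to_id first; remaining columns become edge attributes
g <- graph_from_data_frame(trx[, c("from_id", "to_id", "weight")], directed = TRUE)

# Default: the weight attribute is used as a distance
b_weighted <- betweenness(g)

# weights = NA ignores the attribute and counts hops only
b_unweighted <- betweenness(g, weights = NA)
```

Here the two variants disagree about vertex B, because the weighted shortest path from A to C runs through B while the direct edge is shorter in hops. Which variant is appropriate depends on how transaction amounts should be interpreted.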
# Dummy data used in the answer (create this first)
library(igraph)
set.seed(123)
d <- data.frame(from_id = sample(LETTERS[1:5], 2000, replace = TRUE),
to_id = sample(LETTERS[6:10], 2000, replace = TRUE),
weight = rpois(2000, 10),
date_trx = sample(seq(as.Date('1999/01/01'), as.Date('2000/01/01'), by="day"), 2000, replace = TRUE))