MySQL code to create a transitive closure table fails


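I am working with the current SNOMED data and examples, and I want to create a transitive closure table, but something in the default server settings of MySQL 5.6 makes it fail.

For those who don't know, SNOMED is a medical database. It has 2.1 million relationships and 446,697 concepts. The query stalls at the second part, so my guess is that it is running out of memory. But which settings should I adjust, and to what? join_buffer_size?

The code is as follows: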

DELIMITER ;;
CREATE DEFINER=`snomed`@`localhost` PROCEDURE `createTc`()
BEGIN
    drop table if exists tc;

    CREATE TABLE tc (
        source BIGINT UNSIGNED NOT NULL ,
        dest BIGINT UNSIGNED NOT NULL
        ) ENGINE = InnoDB CHARSET=utf8;
    insert into tc (source, dest)
        select distinct rel.sourceid, rel.destinationid
        from rf2_ss_relationships rel
        inner join rf2_ss_concepts con
            on rel.sourceid = con.id and con.active = 1
        where rel.typeid = 116680003 # IS A relationship
        and rel.active = 1;
    REPEAT
        insert into tc (source, dest)
            select distinct b.source, a.dest
            from tc a
            join tc b on a.source = b.dest
            left join tc c on c.source = b.source and c.dest = a.dest
            where c.source is null;
        set @x = row_count();
        select concat('Inserted ', @x);
    UNTIL @x = 0 END REPEAT;
    create index idx_tc_source on tc (source);
    create index idx_tc_dest on tc (dest);
END;;
DELIMITER ;
CREATE TABLE `rf2_ss_relationships` (
  `id` bigint(20) unsigned NOT NULL,
  `effectiveTime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `active` tinyint(4) DEFAULT '1',
  `moduleId` bigint(20) unsigned NOT NULL,
  `sourceId` bigint(20) unsigned NOT NULL,
  `destinationId` bigint(20) unsigned NOT NULL,
  `relationshipGroup` bigint(20) unsigned NOT NULL,
  `typeId` bigint(20) unsigned NOT NULL,
  `characteristicTypeId` bigint(20) unsigned NOT NULL,
  `modifierId` bigint(20) unsigned NOT NULL,
  PRIMARY KEY (`id`,`effectiveTime`),
  KEY `moduleId_idx` (`moduleId`),
  KEY `sourceId_idx` (`sourceId`),
  KEY `destinationId_idx` (`destinationId`),
  KEY `relationshipGroup_idx` (`relationshipGroup`),
  KEY `typeId_idx` (`typeId`),
  KEY `characteristicTypeId_idx` (`characteristicTypeId`),
  KEY `modifierId_idx` (`modifierId`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

 CREATE TABLE `rf2_ss_concepts` (
  `id` bigint(20) unsigned NOT NULL,
  `effectiveTime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `active` tinyint(4) DEFAULT NULL,
  `moduleId` bigint(20) unsigned NOT NULL,
  `definitionStatusId` bigint(20) unsigned NOT NULL,
  PRIMARY KEY (`id`,`effectiveTime`),
  KEY `moduleId_idx` (`moduleId`),
  KEY `definitionStatusId_idx` (`definitionStatusId`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

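I don't know whether this is the best answer, but it does work. I changed the CREATE TABLE statement to add the indexes when the table is created rather than after the procedure finishes, presumably so that the self-joins on tc inside the REPEAT loop can use them. I also changed the mysqld setting to innodb_buffer_pool_size=8G.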

    CREATE TABLE tc (
        source BIGINT UNSIGNED NOT NULL,
        dest BIGINT UNSIGNED NOT NULL,
        KEY source_idx (source),
        KEY dest_idx (dest)
    ) ENGINE = InnoDB CHARSET=utf8;

Even on my i7 Mac with an SSD it does not run fast, but it does work. The transitive closure table has 5180059 rows.

mysql> call createTc;
+-------------------------+
| concat('Inserted ', @x) |
+-------------------------+
| Inserted 654161         |
+-------------------------+
1 row in set (1 min 55.13 sec)

+-------------------------+
| concat('Inserted ', @x) |
+-------------------------+
| Inserted 1752024        |
+-------------------------+
1 row in set (3 min 5.60 sec)

+-------------------------+
| concat('Inserted ', @x) |
+-------------------------+
| Inserted 2063816        |
+-------------------------+
1 row in set (10 min 42.07 sec)

+-------------------------+
| concat('Inserted ', @x) |
+-------------------------+
| Inserted 275904         |
+-------------------------+
1 row in set (28 min 5.49 sec)

+-------------------------+
| concat('Inserted ', @x) |
+-------------------------+
| Inserted 280            |
+-------------------------+
1 row in set (46 min 29.78 sec)

+-------------------------+
| concat('Inserted ', @x) |
+-------------------------+
| Inserted 0              |
+-------------------------+
1 row in set (1 hour 5 min 20.05 sec)

Query OK, 0 rows affected (1 hour 5 min 20.05 sec)

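I use this recursive approach. While reading the relationships, I have already added all direct descendants to a list in the concept object (I use Hibernate), so they are available.

I then start the recursive function by iterating over the list of concepts and, for each concept, over the list of all its direct parents. See the example: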

for (Sct2Concept c : concepts.values()) {
    for(Sct2Relationship parentRelation : c.getChildOfRelationships()){
        addParentToList(concepts, sct2TransitiveClosureList, parentRelation, c);
    }
}
As you can see, the in-memory store for the transitive closure is a Set, so uniqueness is checked by the mature, well-tested internals of the Java library.

private void addParentToList(Map<String, Sct2Concept> concepts, Set<Sct2TransitiveClosure> sct2TransitiveClosureList, Sct2Relationship parentRelation, Sct2Concept c) {
    // Inactive relationships do not contribute to the closure.
    if (!parentRelation.isActive()) {
        return;
    }
    Sct2TransitiveClosure tc = new Sct2TransitiveClosure(parentRelation.getDestinationSct2Concept().getId(), c.getId());
    // Set.add returns false for a pair that is already stored, which ends the recursion on that branch.
    if (sct2TransitiveClosureList.add(tc)) {
        Sct2Concept s = concepts.get(Long.toString(tc.getParentId()));
        for (Sct2Relationship newParentRelation : s.getChildOfRelationships()) {
            addParentToList(concepts, sct2TransitiveClosureList, newParentRelation, c);
        }
    }
}
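The recursion above depends on Hibernate entities that are not shown here. As a rough, self-contained sketch of the same idea, the version below replaces them with a plain map from each concept id to its direct parent ids. The class name AncestorClosureSketch, the parentsOf map and the sample ids are all made up for illustration and are not part of the original code.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class AncestorClosureSketch {

    // Adds `parent` and everything above it to `ancestors`.
    // Set.add returns false for an id that was already collected,
    // so a shared branch of the hierarchy is only walked once.
    static void addParent(Map<Long, List<Long>> parentsOf, Set<Long> ancestors, long parent) {
        if (!ancestors.add(parent)) {
            return;
        }
        for (long grandParent : parentsOf.getOrDefault(parent, Collections.<Long>emptyList())) {
            addParent(parentsOf, ancestors, grandParent);
        }
    }

    // Returns concept id -> ids of all its ancestors (direct and indirect).
    static Map<Long, Set<Long>> closure(Map<Long, List<Long>> parentsOf) {
        Map<Long, Set<Long>> result = new HashMap<>();
        for (Map.Entry<Long, List<Long>> entry : parentsOf.entrySet()) {
            Set<Long> ancestors = new HashSet<>();
            for (long parent : entry.getValue()) {
                addParent(parentsOf, ancestors, parent);
            }
            result.put(entry.getKey(), ancestors);
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical hierarchy: 4 IS A 2, 4 IS A 3, 2 IS A 1, 3 IS A 1.
        Map<Long, List<Long>> parentsOf = new HashMap<>();
        parentsOf.put(4L, Arrays.asList(2L, 3L));
        parentsOf.put(2L, Arrays.asList(1L));
        parentsOf.put(3L, Arrays.asList(1L));
        System.out.println(closure(parentsOf));
    }
}
```

Unlike the original, which keeps one global set of (parent, child) pairs, this sketch keeps one set of ancestors per concept. The duplicate check works the same way in both.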

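After this I found a Java implementation that solves the problem faster, and that is what I use now: compute the transitive closure in Java, then import the resulting text file. It is much faster than MySQL.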
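The implementation referred to above is not reproduced here. As a minimal sketch of the general "compute in Java, write a text file, import it" idea, the code below walks the hierarchy with a work list and writes one tab-separated source/dest line per closure row, matching the columns of the tc table. The class name ClosureFileExportSketch, the parentsOf map, the temporary file and the sample ids are assumptions made for illustration.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ClosureFileExportSketch {

    // Writes one "source<TAB>dest" line per closure row and returns the row count.
    static long export(Map<Long, List<Long>> parentsOf, Path out) throws IOException {
        long rows = 0;
        try (BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            for (Long source : parentsOf.keySet()) {
                Set<Long> seen = new HashSet<>();                      // ancestors already written
                Deque<Long> todo = new ArrayDeque<>(parentsOf.get(source));
                while (!todo.isEmpty()) {
                    Long dest = todo.pop();
                    if (!seen.add(dest)) {
                        continue;                                      // reached via another path
                    }
                    writer.write(source + "\t" + dest + "\n");
                    rows++;
                    todo.addAll(parentsOf.getOrDefault(dest, Collections.<Long>emptyList()));
                }
            }
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical hierarchy: 4 IS A 2, 4 IS A 3, 2 IS A 1, 3 IS A 1.
        Map<Long, List<Long>> parentsOf = new HashMap<>();
        parentsOf.put(4L, Arrays.asList(2L, 3L));
        parentsOf.put(2L, Arrays.asList(1L));
        parentsOf.put(3L, Arrays.asList(1L));
        Path out = Files.createTempFile("tc", ".txt");
        System.out.println(export(parentsOf, out) + " rows written to " + out);
    }
}
```

A file in this shape could then be loaded into tc with MySQL's LOAD DATA INFILE statement, which by default expects tab-separated fields and newline-terminated lines.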