Neo4j: creating relationships from CSV gives no output
I have two CSV files: Entities, with 2.8 million records, and Rships, with 4.2 million records. Entities has a list of ENT_ID and PARENTID values. If an ENT_ID has PARENTID "0", it has no parent; otherwise the PARENTID is one of the ENT_IDs in the same file. I need to create a relationship between each ENT_ID and its PARENTID. I first tried the import tool that ships with Neo4j 2.3.4 Community Edition, but I kept getting a lot of errors. In the end I loaded the data with a LOAD CSV Cypher query:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///C:/...(read file address here)/Entities.txt" AS Entity FIELDTERMINATOR '|'
CREATE (n:Entity{ENT_ID: Entity.ENT_ID,NAME: Entity.NAME,ENTRYTYPE: Entity.ENTRYTYPE, PARENTID: Entity.PARENTID,ENTRYCATEGORY: Entity.ENTRYCATEGORY,ENTRYSUBCATEGORY: Entity.ENTRYSUBCATEGORY,COUNTRY: Entity.COUNTRY,PWC_ADL_ID: Entity.PWC_ADL_ID })
I created the relationship between PARENTID and ENT_ID using the following:
PROFILE
MATCH(Entity)
MATCH (a:Entity {ENT_ID : Entity.ENT_ID})
WITH Entity, a
MATCH (b:Entity {ENT_ID : Entity.PARENTID})
WITH a,b
MERGE (a)-[r:RELATION]->(b)
Then I loaded the relationships CSV file as:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///C:/.../EntitiesRelationships.txt" AS Rships FIELDTERMINATOR '|'
CREATE (n:Rships{RID: Rships.RID,Ent_IDParent: Rships.Ent_IDParent,Ent_IDChild: Rships.Ent_IDChild, RelationID: Rships.RelationID })
I created indexes for both CSVs:
CREATE INDEX ON :Entity(ENT_ID)
CREATE INDEX ON :Rships(Ent_IDParent)
CREATE INDEX ON :Rships(Ent_IDChild)
Up to this point the statements above run fine, although they take almost a day.
Now, when I try to create the relationships with this query:
PROFILE
Match(Rships)
MATCH(a:Rships {ENT_IDParent: Rships.ENT_IDParent})
WITH Rships, a
MATCH(b:Rships {ENT_IDParent: Rships.ENT_IDChild})
WITH a,b
MERGE (a)-[r:RELATION]->(b)
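This query keeps running for about an hour without producing any result.
Any help would be appreciated.
Thanks.

Labels, property names, and relationship types are case-sensitive. Your indexes are spelled differently from the properties used in your statement. You should also look at your statements with EXPLAIN or PROFILE; you would see the problem immediately. For the second statement, I would again use LOAD CSV to drive the entity lookup, so that you get periodic commits. Overall, the import should not take more than a few minutes.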
// create unique constraint
CREATE CONSTRAINT ON (n:Entity) ASSERT n.ENT_ID IS UNIQUE;
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///C:/...(read file address here)/Entities.txt" AS row FIELDTERMINATOR '|'
CREATE (n:Entity {ENT_ID: row.ENT_ID,NAME: row.NAME,ENTRYTYPE: row.ENTRYTYPE, PARENTID: row.PARENTID, ENTRYCATEGORY: row.ENTRYCATEGORY, ENTRYSUBCATEGORY: row.ENTRYSUBCATEGORY, COUNTRY: row.COUNTRY, PWC_ADL_ID: row.PWC_ADL_ID });
// you can also use this if you want to set all properties:
// CREATE (n:Entity) SET n = row
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///C:/...(read file address here)/Entities.txt" AS row FIELDTERMINATOR '|'
MATCH (a:Entity {ENT_ID : row.ENT_ID})
MATCH (b:Entity {ENT_ID : row.PARENTID})
MERGE (a)-[:PARENT]->(b);
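Since the question states that PARENTID "0" means "no parent", those rows can be filtered out before any lookup happens. The statement below is only a sketch of that idea, not part of the original answer. It assumes PARENTID is read as a string (LOAD CSV returns all fields as strings) and keeps the elided file path as given.

```cypher
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///C:/...(read file address here)/Entities.txt" AS row FIELDTERMINATOR '|'
// Skip root entities: PARENTID "0" means there is no parent to link to
WITH row WHERE row.PARENTID <> "0"
MATCH (a:Entity {ENT_ID : row.ENT_ID})
MATCH (b:Entity {ENT_ID : row.PARENTID})
MERGE (a)-[:PARENT]->(b);
```

Without the filter the result is the same, because no Entity has ENT_ID "0" and the second MATCH simply finds nothing. The filter only avoids the wasted lookups.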
Why do you create the relationships as nodes rather than as relationships? That makes no sense to me.
Instead of:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///C:/.../EntitiesRelationships.txt" AS row FIELDTERMINATOR '|'
CREATE (n:Rships {RID: row.RID,Ent_IDParent: row.Ent_IDParent, Ent_IDChild: row.Ent_IDChild, RelationID: row.RelationID });
I would do this:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///C:/.../EntitiesRelationships.txt" AS row FIELDTERMINATOR '|'
MATCH (a:Entity {ENT_ID : row.Ent_IDChild})
MATCH (b:Entity {ENT_ID : row.Ent_IDParent})
CREATE (a)-[:PARENT {RID: row.RID, RelationID: row.RelationID}]->(b);
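In your question you say that you created indexes on :Entity(ENT_ID), :Rships(Ent_IDParent) and :Rships(Ent_IDChild), that everything up to that point ran fine although it took almost a day, and that you then tried to create the relationships with a query. Driven by LOAD CSV, that query would look like this: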
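// you also had a typo in this query: b was matched on the parent ID as well
// row keys are case-sensitive too; this assumes the CSV headers are spelled
// Ent_IDParent / Ent_IDChild, as in the CREATE statement above
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///C:/.../EntitiesRelationships.txt" AS row FIELDTERMINATOR '|'
MATCH (a:Rships {Ent_IDParent: row.Ent_IDParent})
MATCH (b:Rships {Ent_IDChild: row.Ent_IDChild})
MERGE (a)-[:PARENT]->(b)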
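These two statements make no sense: you are building an arbitrary cross product. PROFILE should show a huge blow-up in rows and billions of db hits: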
MATCH(Entity) MATCH (a:Entity {ENT_ID : Entity.ENT_ID}) WITH Entity, a MATCH (b:Entity {ENT_ID : Entity.PARENTID}) WITH a,b MERGE (a)-[r:RELATION]->(b)
Match(Rships) MATCH(a:Rships {ENT_IDParent: Rships.ENT_IDParent}) WITH Rships, a MATCH(b:Rships {ENT_IDParent: Rships.ENT_IDChild}) WITH a,b MERGE (a)-[r:RELATION]->(b)
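One way to check the spelling problem without running the full query is to EXPLAIN a single lookup. This is a minimal sketch: the value "123" is a made-up placeholder, and the exact operator names in the plan may differ between Neo4j versions.

```cypher
// Property spelled exactly as in the index (Ent_IDParent):
// the plan should show an index seek on :Rships(Ent_IDParent).
EXPLAIN MATCH (a:Rships {Ent_IDParent: "123"}) RETURN a;

// Property spelled as in the question's query (ENT_IDParent):
// no index exists for this key, so expect a label scan plus a filter instead.
EXPLAIN MATCH (a:Rships {ENT_IDParent: "123"}) RETURN a;
```

EXPLAIN only produces the plan and does not execute the statement, so it returns immediately even on the full data set.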
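So I finally followed the steps you described, and it worked very well, for the relationships too. I have 30 different relationship types, each one identified by its RelationID. Is there a way to give each relationship type its own name? Is there also a way to load another node label that holds the RelationID and the relationship type, with some kind of join between the two tables? Thank you very much for your help. Also, the import still takes about 35 minutes; the first statement alone accounts for those 35 minutes.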
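As far as I know, Cypher in Neo4j 2.3 does not allow the relationship type in CREATE to be taken from a property value, so one possible workaround is one pass over the file per RelationID, each with a literal type. The following is only a sketch under that assumption. The RelationID value "1" and the type name OWNS are hypothetical, since the real mapping is not given in the question.

```cypher
// Hypothetical mapping: RelationID "1" becomes relationship type OWNS.
// Repeat the statement with another RelationID and type name for each of the 30 types.
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///C:/.../EntitiesRelationships.txt" AS row FIELDTERMINATOR '|'
WITH row WHERE row.RelationID = "1"
MATCH (a:Entity {ENT_ID : row.Ent_IDChild})
MATCH (b:Entity {ENT_ID : row.Ent_IDParent})
CREATE (a)-[:OWNS {RID: row.RID}]->(b);
```

This reads the file 30 times, but each pass only does lookups for the rows that match its RelationID.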