Edge properties loaded by ETL clobber vertex properties in OrientDB

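This is related to a question I posted about using ETL to import a simple database into OrientDB. The data has both edge and vertex properties, and both carry a date.

Here is my data: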

vertices.csv

label,data,date
v01,0.1234,2015-01-01
v02,0.5678,2015-01-02
v03,0.9012,2015-01-03
edges.csv

u,v,weight,date
v01,v02,12.4,2015-06-17
v02,v03,17.9,2015-09-14
For brevity, I am only adding the updated commonEdges.json file, which includes the edits from the other question. The other JSON files are unchanged.

commonEdge.json

{
    "begin": [ { "let": { "name": "$filePath", "expression": "$fileDirectory.append($fileName )" } } ],
    "config": { "log": "info" },
    "source": { "file": { "path": "$filePath" } },
    "extractor": { "csv": { "ignoreEmptyLines": true,
                            "nullValue": "N/A",
                            "dateFormat": "yyyy-mm-dd"
                          }
                 },
    "transformers": [
          { "merge":  { "joinFieldName": "u", "lookup": "myVertex.label" } },
          { "edge":   { "class":         "myEdge",
                        "joinFieldName": "v",
                        "lookup":        "myVertex.label",
                        "edgeFields":    { "weight": "${input.weight}", "date": "${input.date}" },
                        "direction":     "out",
                        "unresolvedLinkAction": "NOTHING"
                      }
          },
          { "field": { "fieldNames": ["u", "v"], "operation": "remove" } }
    ],
    "loader": {
        "orientdb": {
            "dbURL": "plocal:my_orientdb",
            "dbType": "graph",
            "batchCommit": 1000,
            "useLightweightEdges": false,
            "classes": [ { "name": "myEdge",   "extends": "E" } ],
            "indexes": []
        }
    }
}
After I load the graph, the date fields are still clobbered.

Here is the vertex table if I do not load the edges:

orientdb {db=my_orientdb}> SELECT FROM myVertex

+----+-----+--------+------+-------------------+-----+
|#   |@RID |@CLASS  |data  |date               |label|
+----+-----+--------+------+-------------------+-----+
|0   |#25:0|myVertex|0.1234|2015-01-01 00:01:00|v01  |
|1   |#26:0|myVertex|0.5678|2015-01-02 00:01:00|v02  |
|2   |#27:0|myVertex|0.9012|2015-01-03 00:01:00|v03  |
+----+-----+--------+------+-------------------+-----+
Everything looks fine here; the dates run from 2015-01-01 to 2015-01-03.

But after loading the edges, the date fields are wrong:

orientdb {db=my_orientdb}> SELECT FROM myVertex

+----+-----+--------+------+-------------------+-----+------+----------+---------+
|#   |@RID |@CLASS  |data  |date               |label|weight|out_myEdge|in_myEdge|
+----+-----+--------+------+-------------------+-----+------+----------+---------+
|0   |#25:0|myVertex|0.1234|2015-01-17 00:06:00|v01  |12.4  |[#33:0]   |         |
|1   |#26:0|myVertex|0.5678|2015-01-14 00:09:00|v02  |17.9  |[#34:0]   |[#33:0]  |
|2   |#27:0|myVertex|0.9012|2015-01-03 00:01:00|v03  |      |          |[#34:0]  |
+----+-----+--------+------+-------------------+-----+------+----------+---------+
The dates on the edges are also incorrect:

orientdb {db=my_orientdb}> SELECT FROM myEdge

+----+-----+------+-----+-----+------+-------------------+
|#   |@RID |@CLASS|out  |in   |weight|date               |
+----+-----+------+-----+-----+------+-------------------+
|0   |#33:0|myEdge|#25:0|#26:0|12.4  |2015-01-17 00:06:00|
|1   |#34:0|myEdge|#26:0|#27:0|17.9  |2015-01-14 00:09:00|
+----+-----+------+-----+-----+------+-------------------+
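It looks like OrientDB is clobbering the day-of-month in the dates that were already loaded, and the month field from the edge data is somehow being put into the minutes field. The dates show up this way on both the vertices and the edges.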
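The extra weight column and the replaced date on myVertex match one reading of the transformer chain above: the merge step copies the whole edge row onto the vertex it looks up, and the later field step removes only u and v. The following is a minimal Java sketch of that reading, not OrientDB code. It assumes the merge transformer copies every field of the incoming CSV row onto the looked-up vertex, and the class and method names (MergeSketch, mergeInto, removeFields) are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class MergeSketch {

    // Assumed model of the "merge" step: every field of the incoming row is
    // copied onto the stored vertex, overwriting fields with the same name.
    static Map<String, String> mergeInto(Map<String, String> vertex, Map<String, String> row) {
        Map<String, String> merged = new LinkedHashMap<>(vertex);
        merged.putAll(row);
        return merged;
    }

    // Assumed model of the "field ... remove" step: drop only the named fields.
    static Map<String, String> removeFields(Map<String, String> record, List<String> names) {
        Map<String, String> result = new LinkedHashMap<>(record);
        for (String name : names) {
            result.remove(name);
        }
        return result;
    }

    public static void main(String[] args) {
        // First vertex row and first edge row from the data above.
        Map<String, String> vertex = new LinkedHashMap<>();
        vertex.put("label", "v01");
        vertex.put("data", "0.1234");
        vertex.put("date", "2015-01-01");

        Map<String, String> edgeRow = new LinkedHashMap<>();
        edgeRow.put("u", "v01");
        edgeRow.put("v", "v02");
        edgeRow.put("weight", "12.4");
        edgeRow.put("date", "2015-06-17");

        Map<String, String> afterPipeline =
                removeFields(mergeInto(vertex, edgeRow), Arrays.asList("u", "v"));

        // weight and the edge row's date are still on the vertex record.
        System.out.println(afterPipeline);
    }
}
```

Under that assumption, the vertex keeps the edge's weight and date because only u and v are removed afterwards, which matches the myVertex table shown after loading the edges.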

Is this just a bug in OrientDB, or am I missing something in my ETL files?

Thanks in advance for any help or advice.

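I tried it and I think this is a bug. Could you open an issue on GitHub? I can look into it...

Also, while trying different things, I found that the date parser in OrientDB's ETL looks completely broken. If I change the vertex data in the CSV file to a month other than January, for example 2015-06-17, and load only the vertex CSV, I get 2015-01-17 in the database. I am not sure whether this is the same issue or a completely different one.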
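One way to probe the 2015-06-17 to 2015-01-17 observation outside OrientDB is to parse the same string with the same pattern in plain Java. The sketch below assumes that the ETL CSV extractor interprets "dateFormat" with java.text.SimpleDateFormat pattern letters, where lowercase "mm" means minutes and uppercase "MM" means month. The class and method names (DateFormatProbe, parseAndFormat) are hypothetical, and UTC is fixed only so that the output does not depend on the local time zone.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

final class DateFormatProbe {

    // Parse `text` with `pattern`, then render the result with an unambiguous
    // pattern so that month and minute can be told apart.
    static String parseAndFormat(String pattern, String text) {
        SimpleDateFormat parser = new SimpleDateFormat(pattern, Locale.ROOT);
        parser.setTimeZone(TimeZone.getTimeZone("UTC"));
        SimpleDateFormat printer = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.ROOT);
        printer.setTimeZone(TimeZone.getTimeZone("UTC"));
        try {
            return printer.format(parser.parse(text));
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse '" + text + "' with " + pattern, e);
        }
    }

    public static void main(String[] args) {
        // Lowercase "mm": the 06 is read as minutes and the month falls back to January.
        System.out.println(parseAndFormat("yyyy-mm-dd", "2015-06-17")); // 2015-01-17 00:06:00
        // Uppercase "MM": the 06 is read as the month.
        System.out.println(parseAndFormat("yyyy-MM-dd", "2015-06-17")); // 2015-06-17 00:00:00
    }
}
```

The first result matches the 2015-01-17 00:06:00 values in the tables above. If the assumption about the extractor holds, changing "dateFormat" to "yyyy-MM-dd" in the ETL configuration would be worth trying before filing a bug.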