Problem importing XML data into PostgreSQL 9.5.12 on Ubuntu 16.04.4

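Note: this is a question about errors I ran into while trying to implement the suggestions from another post.

I am trying to import a single row of an XML file so that I can test the code needed to import all of the rows, of which there should be more than 600,000. My XML looks like this:

Trying to execute this statement results in the following error: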

[2018-03-26 19:42:50] Using batch mode (1000 insert/update/delete statements max)
SELECT
(xpath('//objectid/text()', myTempTable.myXmlColumn))[1]::text AS objectid,
(xpath('//parcelid/text()', myTempTable.myXmlColumn))[1]::text AS parcelid,
(xpath('//kivapin/text()', myTempTable.myXmlColumn))[1]::text AS kivapin,
...
[2018-03-26 19:42:50] [42601] ERROR: syntax error at or near "myTempTable"
[2018-03-26 19:42:50] Position: 2058
[2018-03-26 19:42:50] Summary: 1 of 1 statements executed, 1 failed in 380ms (2293 symbols in file)
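I thought this might be caused by a syntax error somewhere in the body of the code, so I ran only the first xpath statement, but that produces a different error: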

[2018-03-26 19:46:17] Using batch mode (1000 insert/update/delete statements max)
SELECT
(xpath('//objectid/text()', myTempTable.myXmlColumn))[1]::text AS objectid,
myTempTable.myXmlColumn as myXmlElement
FROM unnest(
'//row'
,XMLPARSE(DOCUMENT convert_from(pg_read_binary_file('parcel_data_first_row.xml'), 'UTF8...
[2018-03-26 19:46:17] [42804] ERROR: could not determine polymorphic type because input has type "unknown"
[2018-03-26 19:46:17] Summary: 1 of 1 statements executed, 1 failed in 385ms (273 symbols in file)
I am not sure what to try next.

t=# select pg_get_function_arguments(oid),oid::regprocedure from pg_proc where proname = 'pg_read_binary_file';
   pg_get_function_arguments   |                       oid
-------------------------------+-------------------------------------------------
 text, bigint, bigint          | pg_read_binary_file(text,bigint,bigint)
 text, bigint, bigint, boolean | pg_read_binary_file(text,bigint,bigint,boolean)
 text                          | pg_read_binary_file(text)
(3 rows)
Try changing pg_read_binary_file('parcel_data_first_row.xml') to:

pg_read_binary_file('parcel_data_first_row.xml'::text)

Once you have the XML document in a table, you can parse it like this:

 WITH j AS (SELECT UNNEST(XPATH('//row',myXmlColumn)) AS myXmlColumn
 FROM myTempTable)
 SELECT
      (xpath('//objectid/text()', j.myXmlColumn))[1]::text AS objectid,
      (xpath('//parcelid/text()', j.myXmlColumn))[1]::text AS parcelid,
      (xpath('//kivapin/text()', j.myXmlColumn))[1]::text AS kivapin,
      (xpath('//subdivision/text()', j.myXmlColumn))[1]::text AS subdivision,
      (xpath('//block/text()', j.myXmlColumn))[1]::text AS block,
      (xpath('//lot/text()', j.myXmlColumn))[1]::text AS lot,
      (xpath('//datecreated/text()', j.myXmlColumn))[1]::text AS datecreated,
      (xpath('//landusecode/text()', j.myXmlColumn))[1]::text AS landusecode,
      (xpath('//apn/text()', j.myXmlColumn))[1]::text AS apn,
      (xpath('//parceltype/text()', j.myXmlColumn))[1]::text AS parceltype,
      (xpath('//status/text()', j.myXmlColumn))[1]::text AS status,
      (xpath('//condo/text()', j.myXmlColumn))[1]::text AS condo,
      (xpath('//platname/text()', j.myXmlColumn))[1]::text AS platname,
      (xpath('//fraction/text()', j.myXmlColumn))[1]::text AS fraction,
      (xpath('//prefix/text()', j.myXmlColumn))[1]::text AS prefix,
      (xpath('//suite/text()', j.myXmlColumn))[1]::text AS suite,
      (xpath('//own_name/text()', j.myXmlColumn))[1]::text AS own_name,
      (xpath('//own_addr/text()', j.myXmlColumn))[1]::text AS own_addr,
      (xpath('//own_city/text()', j.myXmlColumn))[1]::text AS own_city,
      (xpath('//own_zip/text()', j.myXmlColumn))[1]::text AS own_zip,
      (xpath('//blvdfront/text()', j.myXmlColumn))[1]::text AS blvdfront,
      (xpath('//lastupdate/text()', j.myXmlColumn))[1]::text AS lastupdate,
      (xpath('//shape_length/text()', j.myXmlColumn))[1]::text AS shape_length,
      (xpath('//shape_area/text()', j.myXmlColumn))[1]::text AS shape_area,
      (xpath('//latitude/text()', j.myXmlColumn))[1]::text AS latitude,
      (xpath('//longitude/text()', j.myXmlColumn))[1]::text AS longitude,
      (xpath('//location1/text()', j.myXmlColumn))[1]::text AS location1,
      j.myXmlColumn as myXmlElement
    FROM j
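CTEs are not always my first choice when dealing with large amounts of data, but they do make the code more readable and are worth considering for data imports.

As for importing XML files into PostgreSQL, I always use an intermediate table to store the XML document and then unnest it.

Something like the following: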

$ psql db -c "CREATE TABLE tmp (doc XML);"
$ cat xmlfile.xml | psql db -c "COPY tmp FROM STDIN"
If PostgreSQL complains that your data contains newlines (\n), you can use a tool such as sed, tr, or even perl -pe to sort that out:

$ cat xmlfile.xml | perl -pe 's/\n/\\n/g' | psql db -c "COPY tmp FROM STDIN"
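The same flattening can be tried out without a database. This is a minimal sketch, assuming GNU sed and a hypothetical three-line sample.xml standing in for xmlfile.xml. It shows two of the tools mentioned above: tr deletes the newlines outright, while sed replaces each one with the two characters \n, which COPY's text format reads back as a newline. Backslashes and tabs in the data are also special in that format, and this sketch does not handle them.

```shell
# sample.xml is a hypothetical stand-in for xmlfile.xml.
printf '<rows>\n<row><objectid>1</objectid></row>\n</rows>\n' > sample.xml

# tr: delete every newline. Harmless for whitespace between elements,
# but it would also join lines inside multi-line text nodes.
flat_tr=$(tr -d '\n' < sample.xml)

# sed (GNU syntax): read the whole file into the pattern space, then replace
# each embedded newline with a literal backslash followed by "n".
flat_sed=$(sed ':a;N;$!ba;s/\n/\\n/g' sample.xml)

printf '%s\n' "$flat_tr"
printf '%s\n' "$flat_sed"
```

Either result is a single line, so piping it into psql db -c "COPY tmp FROM STDIN" should load the document as one row.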
By the way: there is a comma missing in your query right after this xpath expression:
(xpath('//location1/text()', myTempTable.myXmlColumn))[1]::text AS location1,

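Edit: if you can place the file directly on the database server's file system (most of us cannot), you can keep using the combination of pg_read_binary_file and convert_from. Keep in mind, though, that the expression '//row' has type unknown, which can cause trouble when it is passed as a function argument. Instead, use a simple expression like this to do the job: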

SELECT
...
FROM UNNEST(XPATH(
  '//row'
  ,XMLPARSE(DOCUMENT convert_from(pg_read_binary_file('parcel_data_first_row.xml'), 'UTF8')))
) AS myTempTable(myXmlColumn);
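The staging-table answer loads the document into tmp (doc), while its parsing query reads from myTempTable (myXmlColumn). A minimal sketch of how the two might be joined up, assuming the tmp table from the CREATE TABLE command above; the file name extract_rows.sql and the alias row_xml are hypothetical, and only two of the columns are shown:

```shell
# Write the extraction query to a file so it can be reviewed before running.
# The quoted 'SQL' delimiter keeps the shell from expanding anything inside.
cat > extract_rows.sql <<'SQL'
-- One output row per <row> element of the document stored in tmp.doc.
WITH j AS (
  SELECT unnest(xpath('//row', doc)) AS row_xml
  FROM tmp
)
SELECT
  (xpath('//objectid/text()', row_xml))[1]::text AS objectid,
  (xpath('//parcelid/text()', row_xml))[1]::text AS parcelid
FROM j;
SQL
```

Running it against the database used in the commands above would then be psql db -f extract_rows.sql.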

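This is odd. In 9.5, unknown should be coerced to text on its own, but that does not seem to be happening; try casting explicitly.
The point about the comma is interesting. When I add it back, I get the following error: [2018-03-27 19:29:10] [42804] ERROR: could not determine polymorphic type because input has type "unknown". I will use your method of importing the file.
I edited my answer with the solution.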