Import CSV into PostgreSQL without duplicating rows
When I import a CSV file into a table, my query duplicates rows:
+----------+----------------+---------------------+
| fullname | formattedvalue | recordTime |
+----------+----------------+---------------------+
| text1 | 170.01346 | 09/02/2020 21:45:00 |
+----------+----------------+---------------------+
| text2 | 24.153432536 | 09/02/2020 21:45:00 |
+----------+----------------+---------------------+
| text3 | 3.583432424 | 09/02/2020 21:45:00 |
+----------+----------------+---------------------+
| text1 | 170.01346 | 08/02/2020 21:45:00 |
+----------+----------------+---------------------+
| text2 | 24.153432536 | 08/02/2020 21:45:00 |
+----------+----------------+---------------------+
| text3 | 3.583432424 | 08/02/2020 21:45:00 |
+----------+----------------+---------------------+
And the query:
CREATE TEMP TABLE tmp_x
(
"fullname" varchar,
"formattedvalue" double precision,
"recordtime" timestamp
);
COPY tmp_x FROM PROGRAM 'more +1 "D:\MEAS_20200308x.csv"' (FORMAT csv, DELIMITER ',');
--UPDATE tmp_x
--SET formattedvalue = ROUND( CAST(formattedvalue as numeric), 3 );
insert into meas_kanal select * from (
select x.*
from tmp_x x
left outer join meas_kanal t on t.fullname = x.fullname AND t.recordtime = x.recordtime
where t.fullname is null AND t.recordtime is null
) as missing;
DROP TABLE tmp_x;
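My logic is to check for duplicates on the column combination fullname + recordtime.
When I run the query again, it inserts the same rows again.
Any idea where I went wrong?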
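One way to narrow this down, offered as a diagnostic sketch rather than a known fix, is to check whether the staged rows actually match existing rows on the key, and whether the staging table itself contains repeats. It assumes `tmp_x` is still populated (run it before the `DROP TABLE`) and uses only the table and column names from the question.

```sql
-- How many staged rows find a match in the target on (fullname, recordtime)?
-- Note: if meas_kanal already holds duplicates, staged_rows is inflated,
-- because each staged row is counted once per matching target row.
SELECT count(*)          AS staged_rows,
       count(t.fullname) AS already_present
FROM tmp_x x
LEFT JOIN meas_kanal t
       ON t.fullname = x.fullname
      AND t.recordtime = x.recordtime;

-- Does the staging table itself contain the same key more than once?
-- An anti-join against the target cannot filter these out.
SELECT fullname, recordtime, count(*) AS copies
FROM tmp_x
GROUP BY fullname, recordtime
HAVING count(*) > 1;
```

If `already_present` stays at 0 on a second import of the same file, the join condition is not matching the way it appears to, which points at the stored values or column types rather than at the anti-join itself.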
Edit 2:
I tried this variant and got the same problem:
INSERT INTO meas_kanal
SELECT x.*
FROM tmp_x x
LEFT OUTER JOIN meas_kanal t ON (t.fullname = x.fullname AND t.recordtime = x.recordtime)
WHERE t.fullname IS NULL AND t.recordtime IS NULL;
Edit 3:
Another failed attempt:
INSERT INTO meas_kanal
SELECT *
FROM tmp_x
WHERE NOT EXISTS(SELECT *
FROM meas_kanal
WHERE (tmp_x.fullname=meas_kanal.fullname and
tmp_x.recordtime=meas_kanal.recordtime)
);
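I think the problem lies somewhere else.
Edit 4: possible solution
By the way, I forgot to mention: I had no primary key.
Now I have defined a key on two columns: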
CREATE TEMP TABLE tmp_x
(
"fullname" varchar,
"formattedvalue" double precision,
"recordtime" timestamp,
UNIQUE (fullname, recordtime)
);
and I insert with this:
insert into meas_kanal(fullname, formattedvalue,recordtime)
SELECT fullname, formattedvalue,recordtime FROM tmp_x x
ON CONFLICT DO NOTHING;
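For context, `ON CONFLICT DO NOTHING` only skips rows that would violate a unique constraint or unique index on the table being inserted into, so the key has to exist on `meas_kanal`, not only on the staging table. The sketch below shows what that might look like; the constraint name is hypothetical, and it assumes `meas_kanal` currently has no duplicate `(fullname, recordtime)` pairs.

```sql
-- Assumption: meas_kanal has no existing duplicates on (fullname, recordtime);
-- otherwise this statement fails until the duplicates are removed.
ALTER TABLE meas_kanal
    ADD CONSTRAINT meas_kanal_fullname_recordtime_key  -- hypothetical name
    UNIQUE (fullname, recordtime);

-- Naming the conflict target makes the intent explicit: only collisions on
-- this key are skipped silently.
INSERT INTO meas_kanal (fullname, formattedvalue, recordtime)
SELECT fullname, formattedvalue, recordtime
FROM tmp_x
ON CONFLICT (fullname, recordtime) DO NOTHING;
```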
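Now it works as I expected. If nobody offers a better solution, I will write this up as an answer.

You can use GROUP BY or DISTINCT, assuming the current query result matches the table shown: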
INSERT INTO meas_kanal
SELECT DISTINCT x.fullname, x.formattedvalue, x.recordtime
FROM tmp_x x
Update 1: Based on your latest comment, if you want to insert the latest record time, use GROUP BY:
INSERT INTO meas_kanal
SELECT x.fullname, x.formattedvalue, MAX(x.recordtime) recordtime
FROM tmp_x x
GROUP BY x.fullname, x.formattedvalue
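Note that grouping by `formattedvalue` as well keeps one row per distinct value, not one row per `fullname`. If the intent is a single row per `fullname` carrying its most recent `recordtime`, a PostgreSQL-specific alternative is `DISTINCT ON`. This is a sketch under that assumed intent, not something confirmed by the question:

```sql
-- Keep, for each fullname, only the staged row with the latest recordtime.
-- DISTINCT ON is a PostgreSQL extension; the ORDER BY must start with the
-- DISTINCT ON expression, and the remaining sort keys decide which row wins.
INSERT INTO meas_kanal (fullname, formattedvalue, recordtime)
SELECT DISTINCT ON (fullname)
       fullname, formattedvalue, recordtime
FROM tmp_x
ORDER BY fullname, recordtime DESC;
```

Like the `GROUP BY` version, this only removes repeats within `tmp_x`; it does not compare against rows already present in `meas_kanal`.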
I defined a key on two columns, and I insert with this:
insert into meas_kanal(fullname, formattedvalue,recordtime)
SELECT fullname, formattedvalue,recordtime FROM tmp_x x
ON CONFLICT DO NOTHING;
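Now it works as I expected.

You should use GROUP BY.
@SILENT Can you give an example?
Did my answer work?
@SILENT No, it didn't. It still produces duplicate records.
In your question, what did the original table look like before any query?
Can you tell me which columns are in tmp_x? I assume there is an id column, right?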