Iterate over the results of a SQLite query as input to a subsequent query
I have a SQLite table with the following fields, representing metadata extracted from individual files stored on disk. There is one record per file:
__path denotes the full path and filename (in effect the PK)
__dirpath denotes the directory path excluding the filename
__dirname denotes the directory name in which the file is found
refid denotes an attribute of interest, pulled from the underlying file on disk
SELECT __dirpath
FROM (
    SELECT DISTINCT __dirpath,
                    __dirname,
                    refid
    FROM source
)
GROUP BY __dirpath
HAVING count( * ) > 1
ORDER BY __dirpath, __dirname;
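Is it possible to iterate over the results of that query and use each result as input to another query, without having to use something like Python alongside SQLite? For example, to see the records belonging to a failing set: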
SELECT __dirpath, refid
FROM source
WHERE __dirpath = <nth result from aforementioned query>;
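For what it's worth, the two steps can usually be combined inside SQLite by using the first query as a subquery. A minimal sketch, assuming the source table and column names from the question:

```sql
-- Use the grouping query as a subquery, so every flagged __dirpath
-- is fed to the outer query without any client-side loop.
SELECT __dirpath, refid
FROM source
WHERE __dirpath IN (
    SELECT __dirpath
    FROM (
        SELECT DISTINCT __dirpath, __dirname, refid
        FROM source
    )
    GROUP BY __dirpath
    HAVING count(*) > 1
)
ORDER BY __dirpath, refid;
```

This returns the rows for all flagged directories in one result set, rather than one directory at a time.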
If you want all the offending rows, one option is:
select t.*
from (
    select t.*,
        min(refid) over(partition by __dirpath, __dirname) as min_refid,
        max(refid) over(partition by __dirpath, __dirname) as max_refid
    from mytable t
) t
where min_refid <> max_refid
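Thanks @GMB. I tried both approaches. The first runs relatively quickly but ignores records where refid is null. The second runs, but it was very slow on about 600k records; creating an index on __dirpath, __dirname, refid solved that. The real table has many more fields, so I replaced t.* with the specific fields I'm after. Since a refid inconsistency can be caused by either a null or an alternate value, the second query produces the correct result, listing all affected records.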
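The index mentioned in the comment could be created roughly as follows. The index name is made up for illustration, and mytable stands in for the real table name:

```sql
-- Hypothetical index name; the column order follows the comment above.
CREATE INDEX IF NOT EXISTS idx_mytable_dir_refid
    ON mytable (__dirpath, __dirname, refid);
```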
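Another option is a correlated exists subquery. In SQLite, is not compares null-safely, so rows with a null refid are included:
select t.*
from mytable t
where exists (
    select 1
    from mytable t1
    where t1.__dirpath = t.__dirpath
      and t1.__dirname = t.__dirname
      and t1.refid is not t.refid
)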
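The difference in null handling between the two queries can be checked on a tiny example. This is only a sketch; the demo table and its three rows are invented for illustration:

```sql
-- Hypothetical sample: '/x' mixes 'a' and NULL, '/y' is consistent.
CREATE TEMP TABLE demo (__dirpath TEXT, __dirname TEXT, refid TEXT);
INSERT INTO demo VALUES
    ('/x', 'x', 'a'),
    ('/x', 'x', NULL),
    ('/y', 'y', 'b');

-- min() and max() skip NULLs, so '/x' gets min_refid = max_refid = 'a'
-- and a min/max comparison does not flag it.
SELECT __dirpath,
       min(refid) AS min_refid,
       max(refid) AS max_refid
FROM demo
GROUP BY __dirpath, __dirname;

-- A plain comparison with NULL yields NULL; IS NOT treats NULL as a value.
SELECT NULL <> 'a'     AS plain_compare,  -- NULL
       NULL IS NOT 'a' AS null_safe;      -- 1
```

Running the exists query against demo instead of mytable should list both '/x' rows, which matches the behaviour described in the comment.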