Kdb q - No speed gain from data normalization via enumeration

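For data normalization, i.e. the task of eliminating duplicates in a list, the recommendation is to use an enumeration to find the distinct values, because traversing integers is faster than traversing variable-length symbols.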

u:`g`ibm`intl`msft / unique list of tickers
v:1000000?u / list with duplicate tickers
k:u?v / positions in u
\t:10 distinct v / performing distinct on symbols 10 times and timing 
\t:10 distinct k / performing distinct on positions 10 times and timing 
I find that distinct v is much faster than distinct k, which is the opposite of what was promised.


Thanks for your help.

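Enumerations are usually used for data that is saved to disk, to help with compression and so on. That is where you will see the larger performance gains.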

KDB+ 3.5 2017.04.06 Copyright (C) 1993-2017 Kx Systems

Welcome to kdb+ 32bit edition
For support please see http://groups.google.com/d/forum/personal-kdbplus
Tutorials can be found at http://code.kx.com/wiki/Tutorials
To exit, type \\
To remove this startup msg, edit q.q
q)u:`g`ibm`intl`msft / unique list of tickers
q)v:1000000?u / list with duplicate tickers
q)k:`u$v //enumerate v against u
q)k
`u$`g`g`intl`ibm`intl`ibm`intl`msft`intl`ibm`g`msft`ibm`intl`intl`ibm`g`ibm`i..
q)save `:k
`:k
q)save `:u
`:u
q)save `:v
`:v
q)\\

KDB+ 3.5 2017.04.06 Copyright (C) 1993-2017 Kx Systems

Welcome to kdb+ 32bit edition
For support please see http://groups.google.com/d/forum/personal-kdbplus
Tutorials can be found at http://code.kx.com/wiki/Tutorials
To exit, type \\
To remove this startup msg, edit q.q
q)u:get `:u
q)\ts:10 distinct get `:v
462 8388848
q)\ts:10 distinct get `:k
37 4194544
q)
But you do raise an interesting question: why is distinct faster on a list of symbols (in memory) than on a list of ints?
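
One way to dig into the in-memory case is to time distinct on all three representations side by side: the raw symbols, the plain positions from u?v (what the question used), and a true enumeration built with `u$v (what the session above used). The following is only a sketch to paste at the q prompt. The name i is introduced here for illustration, and the timings will depend on the kdb+ version and machine, so none are shown.

```q
u:`g`ibm`intl`msft   / unique list of tickers
v:1000000?u          / symbol list with duplicates
i:u?v                / plain integer positions in u (i is a name assumed for this sketch)
k:`u$v               / true enumeration of v over u
v~u i                / 1b: indexing u by the positions recovers v
v~value k            / 1b: de-enumerating k recovers v
type each (v;i;k)    / the three representations have different types
\ts:10 distinct v
\ts:10 distinct i
\ts:10 distinct k
```

Comparing i and k separately matters because u?v yields an ordinary integer list while `u$v yields an enumeration, so the two need not perform the same way.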