Java: Is validating a row on the client side better than using a secondary index with the whole primary key?
It is well known that secondary indexes in Cassandra should be used very sparingly. For example, suppose I have a table:
User(username, usertype, email, etc..)
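Here username is the partition key. Now I want to support an operation that returns a particular user (the username will be given) if and only if usertype is a particular value X.
I have two ways to do this:
One:
Create a secondary index on usertype, whose possible values are ('A','B','C').
username is the partition key.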
SELECT * FROM user WHERE username='something' AND usertype='A';
Two:
I can fetch the row for the given username to the client and then check whether usertype is A.
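Option two can be sketched roughly as follows. This is a minimal illustration rather than driver code: the `User` record, the `findUserOfType` helper, and the in-memory map standing in for the table are all assumptions made for the example. A real client would fetch the row by partition key through the Cassandra Java driver and then apply the same check.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical in-memory stand-in for the user table: username (partition key) -> row.
// A real client would read the row with the Cassandra Java driver instead.
class ClientSideFilter {
    record User(String username, String usertype, String email) {}

    // Fetch by partition key only, then verify usertype on the client.
    static Optional<User> findUserOfType(Map<String, User> table,
                                         String username,
                                         String wantedType) {
        return Optional.ofNullable(table.get(username))
                .filter(u -> u.usertype().equals(wantedType));
    }

    public static void main(String[] args) {
        Map<String, User> table = Map.of(
                "something", new User("something", "A", "s@example.com"),
                "other", new User("other", "B", "o@example.com"));
        // Present only when the fetched row has the wanted usertype.
        System.out.println(findUserOfType(table, "something", "A").isPresent());
        System.out.println(findUserOfType(table, "other", "A").isPresent());
    }
}
```

The cost of this approach is that the whole row (or partition) travels to the client before the check is made, which is only reasonable while the fetched data stays small.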
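Which approach is better? Please consider a wide-row scenario (not too large, tens of rows) in which not all rows of the partition have the given value (so some client-side filtering is needed). What is unclear to me about secondary indexes is how the data is looked up on a particular node.
For example: SELECT * FROM user WHERE username='something' AND usertype='A';
Say the hidden usertype CF holds the data 'A' -> 'jhon', 'miller', 'chris', and so on, 100 usernames in total.
When the query supplies the partition key together with usertype, does it scan all 100 of those usernames to match against username 'something', or does it first fetch by username and then check whether the usertype column matches 'A'? How does the search actually proceed? If the index is built on low-cardinality data and each value maps to many rows, how will the query perform?
If it matters, I will be using Java as the client.
Update:
I know I could use a clustering key (usertype) in this particular example, but I would like to understand the trade-off I am asking about. My original table is much more complex.

The best option here is to create a composite primary key made up of username and usertype, where username is the partition key and usertype is the clustering key. You do not even need an index, and the query will work: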
CREATE TABLE users (
username text,
usertype text,
....
PRIMARY KEY ((username), usertype)
)
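To illustrate why no index is needed with this key, here is a minimal conceptual sketch in Java (the asker's client language). It only models a partition as a sorted map keyed by the clustering column. The class and method names are hypothetical, and this is not a description of Cassandra's internal implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Conceptual model only: partition key -> (clustering key -> row value),
// with rows kept sorted by clustering key inside each partition.
class CompositeKeyModel {
    private final Map<String, TreeMap<String, String>> partitions = new HashMap<>();

    // Models an INSERT of (username, usertype, email).
    void insert(String username, String usertype, String email) {
        partitions.computeIfAbsent(username, k -> new TreeMap<>()).put(usertype, email);
    }

    // Models: SELECT ... WHERE username = ? AND usertype = ?
    // Returns the stored email, or null when no such row exists.
    String find(String username, String usertype) {
        TreeMap<String, String> rows = partitions.get(username);
        return rows == null ? null : rows.get(usertype);
    }

    public static void main(String[] args) {
        CompositeKeyModel users = new CompositeKeyModel();
        users.insert("something", "A", "s@example.com");
        System.out.println(users.find("something", "A"));
        System.out.println(users.find("something", "B"));
    }
}
```

In this model both key components narrow the lookup directly, so no separate index structure has to be consulted.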
To demonstrate, suppose I create a table that tracks crew members by ship and id:
CREATE TABLE crewByShip (
ship text,
id int,
firstname text,
lastname text,
gender text,
PRIMARY KEY(ship,id));
I will create an index on gender:
CREATE INDEX crewByShipG_idx ON crewByShip(gender);
After inserting some data, my table looks like this:
ship | id | firstname | gender | lastname
----------+----+-----------+--------+-----------
Serenity | 1 | Hoban | M | Washburne
Serenity | 2 | Zoey | F | Washburne
Serenity | 3 | Malcolm | M | Reynolds
Serenity | 4 | Kaylee | F | Frye
Serenity | 5 | Sheppard | M | Book
Serenity | 6 | Jayne | M | Cobb
Serenity | 7 | Simon | M | Tam
Serenity | 8 | River | F | Tam
Serenity | 9 | Inara | F | Serra
Now I will turn tracing on and query a single distinct row by its primary key, while also restricting on the indexed gender column:
aploetz@cqlsh:stackoverflow2> tracing on;
aploetz@cqlsh:stackoverflow2> SELECT * FROM crewByShip WHERE ship='Serenity' AND id=3 AND gender='M';
ship | id | firstname | gender | lastname
----------+----+-----------+--------+----------
Serenity | 3 | Malcolm | M | Reynolds
(1 rows)
Tracing session: 34ea1840-e8e1-11e4-9cb7-21b264d4c94d
activity | timestamp | source | source_elapsed
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+----------------+----------------
Execute CQL3 query | 2015-04-22 06:17:48.102000 | 192.168.23.129 | 0
Parsing SELECT * FROM crewByShip WHERE ship='Serenity' AND id=3 AND gender='M'; [SharedPool-Worker-1] | 2015-04-22 06:17:48.114000 | 192.168.23.129 | 3715
Preparing statement [SharedPool-Worker-1] | 2015-04-22 06:17:48.116000 | 192.168.23.129 | 4846
Executing single-partition query on users [SharedPool-Worker-2] | 2015-04-22 06:17:48.118000 | 192.168.23.129 | 5730
Acquiring sstable references [SharedPool-Worker-2] | 2015-04-22 06:17:48.118000 | 192.168.23.129 | 5757
Merging memtable tombstones [SharedPool-Worker-2] | 2015-04-22 06:17:48.119000 | 192.168.23.129 | 5793
Key cache hit for sstable 1 [SharedPool-Worker-2] | 2015-04-22 06:17:48.119000 | 192.168.23.129 | 5848
Seeking to partition beginning in data file [SharedPool-Worker-2] | 2015-04-22 06:17:48.120000 | 192.168.23.129 | 5856
Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones [SharedPool-Worker-2] | 2015-04-22 06:17:48.120000 | 192.168.23.129 | 7056
Merging data from memtables and 1 sstables [SharedPool-Worker-2] | 2015-04-22 06:17:48.121000 | 192.168.23.129 | 7080
Read 1 live and 0 tombstoned cells [SharedPool-Worker-2] | 2015-04-22 06:17:48.122000 | 192.168.23.129 | 7143
Computing ranges to query [SharedPool-Worker-1] | 2015-04-22 06:17:48.122000 | 192.168.23.129 | 7578
Candidate index mean cardinalities are CompositesIndexOnRegular{columnDefs=[ColumnDefinition{name=gender, type=org.apache.cassandra.db.marshal.UTF8Type, kind=REGULAR, componentIndex=1, indexName=crewbyshipg_idx, indexType=COMPOSITES}]}:0. Scanning with crewbyship.crewbyshipg_idx. [SharedPool-Worker-1] | 2015-04-22 06:17:48.122000 | 192.168.23.129 | 7742
Submitting range requests on 1 ranges with a concurrency of 1 (0.0 rows per range expected) [SharedPool-Worker-1] | 2015-04-22 06:17:48.122000 | 192.168.23.129 | 7807
Submitted 1 concurrent range requests covering 1 ranges [SharedPool-Worker-1] | 2015-04-22 06:17:48.122000 | 192.168.23.129 | 7851
Executing indexed scan for [Serenity, Serenity] [SharedPool-Worker-2] | 2015-04-22 06:17:48.123000 | 192.168.23.129 | 10848
Candidate index mean cardinalities are CompositesIndexOnRegular{columnDefs=[ColumnDefinition{name=gender, type=org.apache.cassandra.db.marshal.UTF8Type, kind=REGULAR, componentIndex=1, indexName=crewbyshipg_idx, indexType=COMPOSITES}]}:0. Scanning with crewbyship.crewbyshipg_idx. [SharedPool-Worker-2] | 2015-04-22 06:17:48.123000 | 192.168.23.129 | 10936
Candidate index mean cardinalities are CompositesIndexOnRegular{columnDefs=[ColumnDefinition{name=gender, type=org.apache.cassandra.db.marshal.UTF8Type, kind=REGULAR, componentIndex=1, indexName=crewbyshipg_idx, indexType=COMPOSITES}]}:0. Scanning with crewbyship.crewbyshipg_idx. [SharedPool-Worker-2] | 2015-04-22 06:17:48.123000 | 192.168.23.129 | 11007
Executing single-partition query on crewbyship.crewbyshipg_idx [SharedPool-Worker-2] | 2015-04-22 06:17:48.123000 | 192.168.23.129 | 11130
Acquiring sstable references [SharedPool-Worker-2] | 2015-04-22 06:17:48.123000 | 192.168.23.129 | 11139
Merging memtable tombstones [SharedPool-Worker-2] | 2015-04-22 06:17:48.124000 | 192.168.23.129 | 11155
Skipped 0/0 non-slice-intersecting sstables, included 0 due to tombstones [SharedPool-Worker-2] | 2015-04-22 06:17:48.124000 | 192.168.23.129 | 11253
Merging data from memtables and 0 sstables [SharedPool-Worker-2] | 2015-04-22 06:17:48.124000 | 192.168.23.129 | 11262
Read 1 live and 0 tombstoned cells [SharedPool-Worker-2] | 2015-04-22 06:17:48.127000 | 192.168.23.129 | 11281
Executing single-partition query on crewbyship [SharedPool-Worker-2] | 2015-04-22 06:17:48.130000 | 192.168.23.129 | 11369
Acquiring sstable references [SharedPool-Worker-2] | 2015-04-22 06:17:48.131000 | 192.168.23.129 | 11375
Merging memtable tombstones [SharedPool-Worker-2] | 2015-04-22 06:17:48.131000 | 192.168.23.129 | 11383
Skipped 0/0 non-slice-intersecting sstables, included 0 due to tombstones [SharedPool-Worker-2] | 2015-04-22 06:17:48.133000 | 192.168.23.129 | 11409
Merging data from memtables and 0 sstables [SharedPool-Worker-2] | 2015-04-22 06:17:48.134000 | 192.168.23.129 | 11415
Read 1 live and 0 tombstoned cells [SharedPool-Worker-2] | 2015-04-22 06:17:48.138000 | 192.168.23.129 | 11430
Scanned 1 rows and matched 1 [SharedPool-Worker-2] | 2015-04-22 06:17:48.138000 | 192.168.23.129 | 11490
Request complete | 2015-04-22 06:17:48.115679 | 192.168.23.129 | 13679
Now I will re-run the same query, but without the superfluous restriction on the indexed gender column:
aploetz@cqlsh:stackoverflow2> SELECT * FROM crewByShip WHERE ship='Serenity' AND id=3;
ship | id | firstname | gender | lastname
----------+----+-----------+--------+----------
Serenity | 3 | Malcolm | M | Reynolds
(1 rows)
Tracing session: 38d7f440-e8e1-11e4-9cb7-21b264d4c94d
activity | timestamp | source | source_elapsed
-------------------------------------------------------------------------------------------------+----------------------------+----------------+----------------
Execute CQL3 query | 2015-04-22 06:17:54.692000 | 192.168.23.129 | 0
Parsing SELECT * FROM crewByShip WHERE ship='Serenity' AND id=3; [SharedPool-Worker-1] | 2015-04-22 06:17:54.695000 | 192.168.23.129 | 87
Preparing statement [SharedPool-Worker-1] | 2015-04-22 06:17:54.696000 | 192.168.23.129 | 246
Executing single-partition query on users [SharedPool-Worker-3] | 2015-04-22 06:17:54.697000 | 192.168.23.129 | 1185
Acquiring sstable references [SharedPool-Worker-3] | 2015-04-22 06:17:54.698000 | 192.168.23.129 | 1197
Merging memtable tombstones [SharedPool-Worker-3] | 2015-04-22 06:17:54.698000 | 192.168.23.129 | 1215
Key cache hit for sstable 1 [SharedPool-Worker-3] | 2015-04-22 06:17:54.700000 | 192.168.23.129 | 1249
Seeking to partition beginning in data file [SharedPool-Worker-3] | 2015-04-22 06:17:54.700000 | 192.168.23.129 | 1278
Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones [SharedPool-Worker-3] | 2015-04-22 06:17:54.701000 | 192.168.23.129 | 3309
Merging data from memtables and 1 sstables [SharedPool-Worker-3] | 2015-04-22 06:17:54.701000 | 192.168.23.129 | 3333
Read 1 live and 0 tombstoned cells [SharedPool-Worker-3] | 2015-04-22 06:17:54.702000 | 192.168.23.129 | 3368
Executing single-partition query on crewbyship [SharedPool-Worker-2] | 2015-04-22 06:17:54.702000 | 192.168.23.129 | 4607
Acquiring sstable references [SharedPool-Worker-2] | 2015-04-22 06:17:54.704000 | 192.168.23.129 | 4633
Merging memtable tombstones [SharedPool-Worker-2] | 2015-04-22 06:17:54.704000 | 192.168.23.129 | 4643
Skipped 0/0 non-slice-intersecting sstables, included 0 due to tombstones [SharedPool-Worker-2] | 2015-04-22 06:17:54.705000 | 192.168.23.129 | 4678
Merging data from memtables and 0 sstables [SharedPool-Worker-2] | 2015-04-22 06:17:54.705000 | 192.168.23.129 | 4683
Read 1 live and 0 tombstoned cells [SharedPool-Worker-2] | 2015-04-22 06:17:54.706000 | 192.168.23.129 | 4697
Request complete | 2015-04-22 06:17:54.697676 | 192.168.23.129 | 5676
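As you can see, the "source_elapsed" of the query that uses the secondary index (13679) is more than twice that of the same query without the index (5676), even though both return the same row.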
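The comparison can be checked with a line of arithmetic on the two "Request complete" values from the traces above. The sketch below uses a hypothetical `ratio` helper and assumes the values are microseconds, which is the unit cqlsh tracing reports for source_elapsed.

```java
import java.util.Locale;

// Arithmetic check of the two "Request complete" source_elapsed values.
class TraceRatio {
    // Ratio of the indexed query's elapsed time to the non-indexed query's.
    static double ratio(long withIndexMicros, long withoutIndexMicros) {
        return (double) withIndexMicros / withoutIndexMicros;
    }

    public static void main(String[] args) {
        // 13679 and 5676 are taken from the traces; prints the ratio to two decimals.
        System.out.println(String.format(Locale.ROOT, "%.2f", ratio(13_679, 5_676)));
    }
}
```

A single pair of traces on one node is only an indication, not a benchmark, so the ratio should be read as a rough signal.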
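I think we can safely say that using a secondary index on a low-cardinality column in a wide-row table will perform poorly. Now, while I would not say that filtering on the client side is a good idea, in this case, where the result set is small, it is probably the better option.

Comments:
I understand... I guess I gave a bad example... but my original table is much more complex... so please try to answer the trade-off question... The gender example is probably better than usertype.
Take a look at this:
@jny Thanks for the link... I am still skeptical about the scalability of low-cardinality indexes such as gender when there are millions of users, even when we supply the partition key in the query.
Yes... so the only option is to create another table keyed on (ship, id, gender)? But then I would have to use a logged batch to update both tables consistently... which costs some performance on inserts and duplicates data... right?
@pinkparter Correct on both counts. But this is the usage pattern that Cassandra is best at. If you can fire the batch asynchronously, your application will not notice much of a performance hit.
Can a logged batch be run asynchronously?
@pinkparter Yes. 柳本 posted a good answer describing that:
Am I missing something? That answer is about using batches and asynchronous queries, which I already know... it is not about executing a logged batch asynchronously, right?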