Performance mongodb分片是如何工作的？如何通过分片提高插入速度？_Performance_Mongodb_Insert_Sharding

Performance mongodb分片是如何工作的？如何通过分片提高插入速度？

performance mongodb

Performance mongodb分片是如何工作的？如何通过分片提高插入速度？,performance,mongodb,insert,sharding,Performance,Mongodb,Insert,Sharding,我想测试MongoDB的插入速度，有4个分片，3个配置，4个mongos，chunksize 64M，当我插入100个double[100000]数据时，它可以自动分片，但是插入速度没有提高 (1) I create a database,create a collection "docs",and insert 100 double[100000],it takes 30S (2) I drop the "docs",create a new collection "docs",enables

我想测试MongoDB的插入速度，有4个分片，3个配置，4个mongos，chunksize 64M，当我插入100个double[100000]数据时，它可以自动分片，但是插入速度没有提高

(1) I create a database,create a collection "docs",and insert 100 double[100000],it takes 30S
(2) I drop the "docs",create a new collection "docs",enablesharding ,insert a {name:"hashed"},it takes 30S or more.

每个分片几乎都有相同的数据，或者区块的编号，我已经更改了区块大小5MB、20MB、100MB、200MB，但无法使速度降低3/4

分片减少了每个分片处理的操作数，那么如何通过添加分片来减少插入时间、提高插入速度呢？或者我的测试数据是错误的，它太小而无法显示mongodb的性能？我停止平衡器（），sh.stopBalancer（），sh.status（）

每个分片都有数据，这意味着mongodb已经通过分片键均匀分布了？但是为什么插入速度没有降低，是否有错误？您是否有相同的情况或成功地缩短了时间

我使用多线程成功地减少了时间。

这里有两种可能的情况：
1.插入平均分布在所有碎片上，在这种情况下，随着每个碎片的添加，读写性能将线性提高。还可以增加mongos（路由器）的数量。
2.插入只关注一个或一个子集的碎片，在这种情况下，添加碎片无助于提高性能。这可能表明shardKey具有较低的基数或随机性因子。查看此链接：

由于您没有向我们提供足够的数据（关于使用的shardKey和影响所有Shard的插入），因此您需要推断上述两种情况中的哪一种阻碍了写性能的提高

希望这有帮助。

您无法通过切分“显著”提高插入速度。如果您希望在复制集之间几乎均匀地分布插入操作，那么在插入过程中要做出的决策太多了。实际上，通过切分，您手中要处理的操作比插入单个实例要多

若你们想要一个真正的速度，你们可以冒一些稳定性的风险，你们最好的选择是打开写确认文档，使用fire&forget插入

关键是将写操作分布到所有碎片上。如果在每个给定时刻只插入一个分片，那么不管您总共有多少个分片，都不会有多大帮助。我使用mongos插入数据，它不会自动分发数据？但是sh.status（）显示每个分片几乎具有相同的chunksize或datal。我们如何插入4个（与分片相同）每时每刻都有数据？我关闭平衡器sh.stopBalancer（），然后插入数据。停止平衡器将使问题更加复杂，因为块迁移将停止。我担心我们需要更多的数据才能得出具体的结论。

mongos> sh.status()
--- Sharding Status --- 
 sharding version: {
    "_id" : 1,
    "version" : 4,
    "minCompatibleVersion" : 4,
    "currentVersion" : 5,
    "clusterId" : ObjectId("5450ed56eb3978383f81a863")
}
  shards:
    {  "_id" : "s1",  "host" : "192.168.137.101:27017" }
    {  "_id" : "s2",  "host" : "192.168.137.102:27018" }
    {  "_id" : "s3",  "host" : "192.168.137.103:27019" }
    {  "_id" : "s4",  "host" : "192.168.137.104:27020" }
  databases:
    {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
    {  "_id" : "liu",  "partitioned" : true,  "primary" : "s2" }
        liu.docs
            shard key: { "name" : "hashed" }
            chunks:
                s1  4
                s2  7
                s3  6
                s4  5
            too many chunks to print, use verbose if you want to force print
    {  "_id" : "test",  "partitioned" : false,  "primary" : "s1" }