
Performance: Apache Kudu inserts are slow, with long queueing times


I have been using the Spark datasource to write from Parquet to Kudu, and the write performance is terrible: roughly 12,000 rows/sec. Each row is about 160 bytes.

We have 7 Kudu nodes, each with 24 cores + 64 GB RAM and 12 SATA disks. None of the resources appear to be the bottleneck: server CPU usage is ~3-4 cores, RAM usage ~10 GB, and there is no disk congestion.

Yet I still see that, most of the time, write requests get stuck in the queue. Any ideas are welcome.
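For reference, a minimal sketch of the write path via the kudu-spark KuduContext API (the actual job is similar; the master address, input path, and table name here are placeholders):

import org.apache.spark.sql.SparkSession
import org.apache.kudu.spark.kudu.KuduContext

val spark = SparkSession.builder().appName("parquet-to-kudu").getOrCreate()

// read the source Parquet data (hypothetical path)
val df = spark.read.parquet("hdfs:///path/to/source.parquet")

// connect to the Kudu cluster (hypothetical master address)
val kuduContext = new KuduContext("kudu-master-host:7051", spark.sparkContext)

// insertRows issues one INSERT operation per DataFrame row (hypothetical table name)
kuduContext.insertRows(df, "impala::default.target_table")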

W0811 12:34:03.526340  7753 rpcz_store.cc:251] Call kudu.tserver.TabletServerService.Write from 10.60.170.18:10000 (ReqId={client: 81ae6f3c6e1b4d9493ea95f87ccd1dfa, seq_no=9365, attempt_no=1}) took 13255ms (client timeout 10000).
W0811 12:34:03.526489  7753 rpcz_store.cc:255] Trace:
0811 12:33:50.270477 (+     0us) service_pool.cc:163] Inserting onto call queue
0811 12:33:50.270497 (+    20us) service_pool.cc:222] Handling call
0811 12:34:03.526316 (+13255819us) inbound_call.cc:157] Queueing success response
Related trace 'txn':
0811 12:34:03.328337 (+     0us) write_transaction.cc:101] PREPARE: Starting
0811 12:34:03.328563 (+   226us) write_transaction.cc:268] Acquiring schema lock in shared mode
0811 12:34:03.328564 (+     1us) write_transaction.cc:271] Acquired schema lock
0811 12:34:03.328564 (+     0us) tablet.cc:400] PREPARE: Decoding operations
0811 12:34:03.328742 (+   178us) tablet.cc:422] PREPARE: Acquiring locks for 24 operations
0811 12:34:03.447163 (+118421us) lock_manager.cc:377] Waited 118408us for lock on <redacted>
0811 12:34:03.447203 (+    40us) tablet.cc:426] PREPARE: locks acquired
0811 12:34:03.447203 (+     0us) write_transaction.cc:126] PREPARE: finished.
0811 12:34:03.447361 (+   158us) write_transaction.cc:136] Start()
0811 12:34:03.447366 (+     5us) write_transaction.cc:141] Timestamp: P: 1533965643563964 usec, L: 6
0811 12:34:03.447674 (+   308us) log.cc:582] Serialized 64909 byte log entry
0811 12:34:03.449561 (+  1887us) write_transaction.cc:149] APPLY: Starting
0811 12:34:03.526238 (+ 76677us) tablet_metrics.cc:365] ProbeStats: bloom_lookups=48,key_file_lookups=48,delta_file_lookups=24,mrs_lookups=0
0811 12:34:03.526260 (+    22us) log.cc:582] Serialized 237 byte log entry
0811 12:34:03.526268 (+     8us) write_transaction.cc:309] Releasing row and schema locks
0811 12:34:03.526280 (+    12us) write_transaction.cc:277] Released schema lock
0811 12:34:03.526300 (+    20us) write_transaction.cc:196] FINISH: updating metrics
Metrics: {"child_traces":[["txn",{"apply.queue_time_us":11,"cfile_cache_hit":205,"cfile_cache_hit_bytes":21900627,"num_ops":24,"prepare.queue_time_us":13057291,"prepare.run_cpu_time_us":1017,"prepare.run_wall_time_us":119378,"raft.queue_time_us":71,"raft.run_cpu_time_us":303,"raft.run_wall_time_us":304,"replication_time_us":2170,"row_lock_wait_count":1,"row_lock_wait_us":118408,"spinlock_wait_cycles":45824}]]}

The first challenge was that ingesting a 23-million-row table with 200 columns into Kudu (hash-partitioned into 4 partitions on the primary key) took forever. To be precise, it took a staggering 58 minutes, which translates to 63 rows per second. I could not believe Kudu was that slow, and we double-checked the installation and configuration docs. Unfortunately, we had trusted the defaults, and as I found out on the Kudu Slack channel (thanks, Will Berkeley!), there are two parameters that need tuning. Specifically:

memory_limit_hard_bytes
Controls the total amount of memory the Kudu daemon should use.

maintenance_manager_num_threads
The number of maintenance threads; the recommendation is to set this to 1/3 of the number of disks used by Kudu.


The defaults in the CDH Kudu parcel are terrible: Kudu is limited to 1 GB of memory and uses only 1 maintenance thread. We set the latter to 4 (12 drives / 3) and the former to 0 (dynamic allocation). CM refused to accept 0 as a value for memory_limit_hard_bytes, so we had to override it with a CM safety valve. Once that was done and Kudu restarted, my first 23M-row table finished in 240 seconds (~95k rows/sec) - much better! For comparison, a CTAS from Impala into an Impala Parquet table took only 60 seconds.
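As a sketch, the two settings as tablet server gflags (in our case applied through the CM safety valve mentioned above) look like this:

# Kudu tablet server flag file (e.g. via the CM tserver safety valve)
# 0 lets Kudu size the hard memory limit dynamically
--memory_limit_hard_bytes=0
# roughly one maintenance thread per three data disks (12 / 3 = 4)
--maintenance_manager_num_threads=4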

It turned out this was caused by duplicates in our data. The field we used as Kudu's primary key contained about 1.2 million rows with the same value (namely, an empty string), so Kudu was updating the same key 1.2 million times, acquiring a lock each time, which is why ingestion speed degraded over time.


Once we removed the duplicate-key rows, ingestion speed improved 10x.
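A minimal Spark sketch of that check and fix, assuming the DataFrame df from the earlier sketch and a hypothetical primary-key column named id:

import org.apache.spark.sql.functions.{col, count}

// count rows per key; any key with n > 1 would be rewritten repeatedly in Kudu
val dupKeys = df.groupBy("id").agg(count("*").as("n")).filter(col("n") > 1)
dupKeys.show()

// keep a single row per key before writing to Kudu
val deduped = df.dropDuplicates("id")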

Thanks Kishore, I had already set the maintenance threads (4) and memory_limit_hard_bytes (0) as suggested from the very beginning. I notice the import is fast at first but slows down over time. Any suggestions as to what the culprit might be? You could at least credit the original post you copied this answer from. Here it is -