
Elasticsearch / Logstash output performance


I'm using Elasticsearch 5.1.1 and Logstash 5.1.1. It took about 2 hours to import 3 million rows from SQL Server into Elasticsearch via Logstash, with this pipeline:

input {
  jdbc {
    jdbc_driver_library => "D:\Usefull_Jars\sqljdbc4-4.0.jar"
    jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    jdbc_connection_string => "jdbc:sqlserver://192.168.5.14:1433;databaseName=DataSource;integratedSecurity=false;user=****;password=****;"
    jdbc_user => "****"
    jdbc_password => "****"
    statement => "SELECT * FROM RawData"
    jdbc_fetch_size => 1000
  }
}
output {
  elasticsearch {
    hosts => "localhost"
    index => "testdata"
    document_type => "testfeed"
    document_id => "%{id}"
    flush_size => 512
  }
}
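(As a side note, not part of the original question: for large tables, the logstash-input-jdbc plugin can also page through the result set instead of streaming a single cursor, via its standard `jdbc_paging_enabled` and `jdbc_page_size` options. A sketch, with an illustrative page size:

```
input {
  jdbc {
    # ... same connection settings as above ...
    statement => "SELECT * FROM RawData"
    jdbc_paging_enabled => true   # plugin splits the query into pages
    jdbc_page_size => 50000       # rows per page; illustrative value, tune for your data
  }
}
```

Whether this helps depends on the table and driver; it mainly bounds memory use per query rather than guaranteeing a faster import.)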
I'm on a single Windows machine with 4 GB of RAM and a Core i3. Is there any other configuration I should add to speed up the import?

I tried changing the settings in logstash.yml, but it made no difference.

logstash.yml

pipeline:
  batch:
    size: 125
    delay: 2
#
# Or as flat keys:
#   pipeline.batch.size: 125
#   pipeline.batch.delay: 5
# ------------ Pipeline Settings --------------
# Set the number of workers that will, in parallel, execute the filters+outputs
# stage of the pipeline.
# This defaults to the number of the host's CPU cores.
pipeline.workers: 5
# How many workers should be used per output plugin instance
pipeline.output.workers: 5
# How many events to retrieve from inputs before sending to filters+workers
pipeline.batch.size: 125
# How long to wait before dispatching an undersized batch to filters+workers
# Value is in milliseconds.
# pipeline.batch.delay: 5
# ------------ Queuing Settings --------------
#
# Internal queuing model, "memory" for legacy in-memory based queuing and
# "persisted" for disk-based acked queueing. Default is memory
#
# queue.type: memory
#
# If using queue.type: persisted, the directory path where the data files will be stored.
# Default is path.data/queue
#
# path.queue:
#
# If using queue.type: persisted, the page data files size. The queue data consists of
# append-only data files separated into pages. Default is 250mb
#
# queue.page_capacity: 250mb
#
# If using queue.type: persisted, the maximum number of unread events in the queue.
# Default is 0 (unlimited)
#
# queue.max_events: 0
#
# If using queue.type: persisted, the total capacity of the queue in number of bytes.
# If you would like more unacked events to be buffered in Logstash, you can increase the
# capacity using this setting. Please make sure your disk drive has capacity greater than
# the size specified here. If both max_bytes and max_events are specified, Logstash will
# pick whichever criterion is reached first.
# Default is 1024mb or 1gb
#
# queue.max_bytes: 1024mb
#
# If using queue.type: persisted, the maximum number of acked events before forcing a checkpoint.
# Default is 1024, 0 for unlimited
#
# queue.checkpoint.acks: 1024
#
# If using queue.type: persisted, the maximum number of written events before forcing a checkpoint.
# Default is 1024, 0 for unlimited
#
# queue.checkpoint.writes: 1024
#
# If using queue.type: persisted, the interval in milliseconds when a checkpoint is forced on the head page.
# Default is 1000, 0 for no periodic checkpoint.
#
# queue.checkpoint.interval: 1000

Thanks in advance…

How many worker threads is Logstash running with? You have pipeline.workers: 5 in logstash.yml. Could you post your logstash.yml?

I have a few suggestions: reduce jdbc_fetch_size to 300, set pipeline.workers: 8 and pipeline.output.workers: 8, and give Logstash more memory by editing the jvm.options file in the conf folder, replacing -Xms256m -Xmx1g with -Xms512m -Xmx2g.
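Sketched as concrete file edits (values taken from the suggestions above; whether 8 workers helps on a 2-core/4-thread i3 with 4 GB RAM is something to measure, not a given):

```
# logstash.yml — raise pipeline parallelism
pipeline.workers: 8
pipeline.output.workers: 8

# config/jvm.options — grow the Logstash heap
# replace:
#   -Xms256m
#   -Xmx1g
# with:
-Xms512m
-Xmx2g
```

And in the pipeline config, the smaller fetch size:

```
jdbc_fetch_size => 300
```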