
Web spider speed limited by MySQL configuration?


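I recently moved my web spider to a paid cloud-type hosting environment after my office network was throttled on Comcast Business. I am running into some odd limits and hope someone can shed some light.

I have 10 small instances running, each with 1 GB of RAM and 1 CPU, with SSD storage.

Each instance is configured to visit roughly 1 million websites per day, and each one reaches that easily when run on its own.

I have another instance that handles a simple MySQL database for tracking; this one has 8 GB of RAM, 4 cores and 90 GB of SSD.

If I run 4 instances I can hit 4 million per day. If I run 10 instances I still only hit 4 million per day, so something in this configuration is choking.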
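To put the daily figures above on a per-second scale, here is a small arithmetic sketch. It only converts the numbers stated in the post; the function name `per_second` is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def per_second(visits_per_day: float) -> float:
    """Convert a visits-per-day figure into visits per second."""
    return visits_per_day / SECONDS_PER_DAY

single_instance = per_second(1_000_000)    # one crawler node on its own
observed_ceiling = per_second(4_000_000)   # what 4 or 10 nodes reach in total
expected_ten = per_second(10_000_000)      # what 10 nodes would reach if scaling were linear

print(f"single instance:  {single_instance:.1f} visits/s")
print(f"observed ceiling: {observed_ceiling:.1f} visits/s")
print(f"expected with 10: {expected_ten:.1f} visits/s")
```

The ceiling works out to a few dozen visits per second in total, which helps when comparing against the per-second query rates a database server reports.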

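MySQL sits at roughly 360% CPU (across 4 cores) most of the time, with about 70% memory use; typical I/O is around 4 MB/s.

Database writes are spread over 10 tables, and not every visit results in a write to each one: about 50% of visits result in writes to 2-10 tables. The only table that is always updated is the visited table, with the last visit date/time.

Here are some excerpts from my configuration: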

my.cnf:

    explicit_defaults_for_timestamp
connect_timeout = 60
sync_binlog = 0

innodb_buffer_pool_size = 5G
innodb_file_format=Barracuda
innodb_log_file_size = 10G
innodb_file_per_table=1
innodb_log_buffer_size=4M
innodb_flush_log_at_trx_commit=0
innodb_thread_concurrency=10
#transaction-isolation=READ-COMMITTED
max_connections = 2500
innodb_buffer_pool_instances = 5
innodb_io_capacity = 30000
innodb_read_io_threads = 10000
innodb_write_io_threads = 10000

innodb_doublewrite = 0
innodb_open_files = 10000
innodb_support_xa=0

innodb_flush_method = O_DIRECT

max_allowed_packet = 32M
thread_stack = 1M
sort_buffer_size = 256K

table_open_cache = 2000
thread_cache_size = 5000
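
One way to sanity-check an excerpt like the one above is to compare a few settings against their documented ranges. The sketch below is not part of the original post: the ranges in `DOCUMENTED_RANGES` are assumptions taken from the MySQL 5.7 reference manual for three InnoDB variables, and the helper names are hypothetical.

```python
# Assumed ranges (MySQL 5.7 reference manual); verify against your server version.
DOCUMENTED_RANGES = {
    "innodb_read_io_threads": (1, 64),
    "innodb_write_io_threads": (1, 64),
    "innodb_thread_concurrency": (0, 1000),
}

def parse_settings(text: str) -> dict:
    """Parse 'name = value' lines, skipping blanks, comments and bare flags."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        settings[key] = value
    return settings

def out_of_range(settings: dict) -> list:
    """Return (name, value, low, high) for numeric settings outside the assumed range."""
    problems = []
    for name, (low, high) in DOCUMENTED_RANGES.items():
        value = settings.get(name, "")
        if value.isdigit() and not low <= int(value) <= high:
            problems.append((name, int(value), low, high))
    return problems

# Three lines copied from the my.cnf excerpt above.
excerpt = """
innodb_thread_concurrency=10
innodb_read_io_threads = 10000
innodb_write_io_threads = 10000
"""

problems = out_of_range(parse_settings(excerpt))
for name, value, low, high in problems:
    print(f"{name} = {value} is outside the documented range {low}..{high}")
```

Under those assumed ranges, both I/O thread settings in the excerpt are flagged. `SHOW VARIABLES` on the running server shows the values actually in effect.
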
Here is a typical table, with indexes and so on. I believe these are reasonably accurate, since I can query, sort, etc. very quickly.

CREATE TABLE `websites` (
  `wid` bigint(20) NOT NULL AUTO_INCREMENT,
  `host` varchar(100) NOT NULL,
  `status` int(3) NOT NULL DEFAULT '0',
  `total_time` int(15) NOT NULL DEFAULT '0',
  `total_data` int(15) NOT NULL DEFAULT '0',
  `hash` int(15) NOT NULL DEFAULT '0',
  `machine` int(2) NOT NULL DEFAULT '0',
  `ipv4` int(4) unsigned DEFAULT NULL,
  `ipv6` binary(16) DEFAULT NULL,
  PRIMARY KEY (`wid`),
  UNIQUE KEY `host` (`host`),
  KEY `status` (`status`),
  KEY `total_time` (`total_time`),
  KEY `total_data` (`total_data`),
  KEY `machine` (`machine`),
  KEY `ipv4` (`ipv4`,`ipv6`),
  KEY `hash` (`hash`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=3741662307 DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=4
No two SELECT queries are the same.

SHOW GLOBAL STATUS

Variable_name  Value
Aborted_clients  6728
Aborted_connects  135
Binlog_cache_disk_use  0
Binlog_cache_use  0
Binlog_stmt_cache_disk_use  0
Binlog_stmt_cache_use  0
Bytes_received  42547122115
Bytes_sent  36810202741
Com_admin_commands  0
Com_assign_to_keycache  0
Com_alter_db  0
Com_alter_db_upgrade  0
Com_alter_event  0
Com_alter_function  0
Com_alter_procedure  0
Com_alter_server  0
Com_alter_table  0
Com_alter_tablespace  0
Com_alter_user  0
Com_analyze  0
Com_begin  0
Com_binlog  0
Com_call_procedure  0
Com_change_db  167
Com_change_master  0
Com_change_repl_filter  0
Com_check  0
Com_checksum  0
Com_commit  0
Com_create_db  0
Com_create_event  0
Com_create_function  0
Com_create_index  0
Com_create_procedure  0
Com_create_server  0
Com_create_table  0
Com_create_trigger  0
Com_create_udf  0
Com_create_user  0
Com_create_view  0
Com_dealloc_sql  0
Com_delete  0
Com_delete_multi  0
Com_do  0
Com_drop_db  0
Com_drop_event  0
Com_drop_function  0
Com_drop_index  0
Com_drop_procedure  0
Com_drop_server  0
Com_drop_table  0
Com_drop_trigger  0
Com_drop_user  0
Com_drop_view  0
Com_empty_query  0
Com_execute_sql  0
Com_explain_other  0
Com_flush  0
Com_get_diagnostics  0
Com_grant  0
Com_ha_close  0
Com_ha_open  0
Com_ha_read  0
Com_help  0
Com_insert  261004903
Com_insert_select  0
Com_install_plugin  0
Com_kill  0
Com_load  0
Com_lock_tables  0
Com_optimize  0
Com_preload_keys  0
Com_prepare_sql  0
Com_purge  0
Com_purge_before_date  0
Com_release_savepoint  0
Com_rename_table  0
Com_rename_user  0
Com_repair  0
Com_replace  14
Com_replace_select  0
Com_reset  0
Com_resignal  0
Com_revoke  0
Com_revoke_all  0
Com_rollback  0
Com_rollback_to_savepoint  0
Com_savepoint  0
Com_select  225216121
Com_set_option  12406
Com_signal  0
Com_show_binlog_events  0
Com_show_binlogs  12
Com_show_charsets  0
Com_show_collations  0
Com_show_create_db  0
Com_show_create_event  0
Com_show_create_func  0
Com_show_create_proc  0
Com_show_create_table  47
Com_show_create_trigger  0
Com_show_databases  0
Com_show_engine_logs  0
Com_show_engine_mutex  0
Com_show_engine_status  0
Com_show_events  0
Com_show_errors  0
Com_show_fields  585
Com_show_function_code  0
Com_show_function_status  0
Com_show_grants  4
Com_show_keys  46
Com_show_master_status  9
Com_show_open_tables  0
Com_show_plugins  0
Com_show_privileges  0
Com_show_procedure_code  0
Com_show_procedure_status  0
Com_show_processlist  26
Com_show_profile  0
Com_show_profiles  0
Com_show_relaylog_events  0
Com_show_slave_hosts  0
Com_show_slave_status  9
Com_show_status  2
Com_show_storage_engines  0
Com_show_table_status  0
Com_show_tables  32
Com_show_triggers  0
Com_show_variables  6462
Com_show_warnings  0
Com_show_create_user  0
Com_slave_start  0
Com_slave_stop  0
Com_group_replication_start  0
Com_group_replication_stop  0
Com_stmt_execute  0
Com_stmt_close  0
Com_stmt_fetch  0
Com_stmt_prepare  0
Com_stmt_reset  0
Com_stmt_send_long_data  0
Com_truncate  0
Com_uninstall_plugin  0
Com_unlock_tables  0
Com_update  3768683
Com_update_multi  0
Com_xa_commit  0
Com_xa_end  0
Com_xa_prepare  0
Com_xa_recover  0
Com_xa_rollback  0
Com_xa_start  0
Com_stmt_reprepare  0
Connection_errors_accept  0
Connection_errors_internal  0
Connection_errors_max_connections  0
Connection_errors_peer_address  0
Connection_errors_select  0
Connection_errors_tcpwrap  0
Connections  3781613
Created_tmp_disk_tables  900
Created_tmp_files  153
Created_tmp_tables  1714
Delayed_errors  0
Delayed_insert_threads  0
Delayed_writes  0
Flush_commands  1
Handler_commit  489978593
Handler_delete  0
Handler_discover  0
Handler_external_lock  979991125
Handler_mrr_init  0
Handler_prepare  0
Handler_read_first  1022
Handler_read_key  340529954
Handler_read_last  14
Handler_read_next  1325647372
Handler_read_prev  240
Handler_read_rnd  111545828
Handler_read_rnd_next  3254347
Handler_rollback  9905
Handler_savepoint  0
Handler_savepoint_rollback  0
Handler_update  3768685
Handler_write  261011792
Innodb_buffer_pool_dump_status  not started
Innodb_buffer_pool_load_status  Buffer pool(s) load completed at 150906  5:57:38
Innodb_buffer_pool_resize_status  not started
Innodb_buffer_pool_pages_data  909831
Innodb_buffer_pool_bytes_data  5310021632
Innodb_buffer_pool_pages_dirty  137841
Innodb_buffer_pool_bytes_dirty  586342400
Innodb_buffer_pool_pages_flushed  156711220
Innodb_buffer_pool_pages_free  2828
Innodb_buffer_pool_pages_misc  18446744073708966597
Innodb_buffer_pool_pages_total  327640
Innodb_buffer_pool_read_ahead_rnd  0
Innodb_buffer_pool_read_ahead  115535
Innodb_buffer_pool_read_ahead_evicted  0
Innodb_buffer_pool_read_requests  17132988745
Innodb_buffer_pool_reads  43009838
Innodb_buffer_pool_wait_free  0
Innodb_buffer_pool_write_requests  3418426697
Innodb_data_fsyncs  748775
Innodb_data_pending_fsyncs  0
Innodb_data_pending_reads  0
Innodb_data_pending_writes  0
Innodb_data_read  182409891840
Innodb_data_reads  44358403
Innodb_data_writes  17867174
Innodb_data_written  224979945984
Innodb_dblwr_pages_written  0
Innodb_dblwr_writes  0
Innodb_log_waits  0
Innodb_log_write_requests  343630408
Innodb_log_writes  133630
Innodb_os_log_fsyncs  92883
Innodb_os_log_pending_fsyncs  0
Innodb_os_log_pending_writes  0
Innodb_os_log_written  149354686976
Innodb_page_size  16384
Innodb_pages_created  1520762
Innodb_pages_read  44358411
Innodb_pages_written  17719344
Innodb_row_lock_current_waits  0
Innodb_row_lock_time  125773
Innodb_row_lock_time_avg  41
Innodb_row_lock_time_max  51019
Innodb_row_lock_waits  3015
Innodb_rows_deleted  0
Innodb_rows_inserted  47941507
Innodb_rows_read  1442492244
Innodb_rows_updated  3768685
Innodb_num_open_files  42
Innodb_truncated_status_writes  0
Innodb_available_undo_logs  128
Key_blocks_not_flushed  0
Key_blocks_unused  6696
Key_blocks_used  2
Key_read_requests  376
Key_reads  6
Key_write_requests  0
Key_writes  0
Locked_connects  0
Max_execution_time_exceeded  0
Max_execution_time_set  0
Max_execution_time_set_failed  0
Max_used_connections  795
Max_used_connections_time  2015-09-06 21:13:06
Not_flushed_delayed_rows  0
Ongoing_anonymous_transaction_count  0
Open_files  32
Open_streams  0
Open_table_definitions  132
Open_tables  2000
Opened_files  384
Opened_table_definitions  132
Opened_tables  26470
Performance_schema_accounts_lost  0
Performance_schema_cond_classes_lost  0
Performance_schema_cond_instances_lost  0
Performance_schema_digest_lost  0
Performance_schema_file_classes_lost  0
Performance_schema_file_handles_lost  0
Performance_schema_file_instances_lost  0
Performance_schema_hosts_lost  0
Performance_schema_index_stat_lost  0
Performance_schema_locker_lost  0
Performance_schema_memory_classes_lost  0
Performance_schema_metadata_lock_lost  0
Performance_schema_mutex_classes_lost  0
Performance_schema_mutex_instances_lost  0
Performance_schema_nested_statement_lost  0
Performance_schema_prepared_statements_lost  0
Performance_schema_program_lost  0
Performance_schema_rwlock_classes_lost  0
Performance_schema_rwlock_instances_lost  0
Performance_schema_session_connect_attrs_lost  0
Performance_schema_socket_classes_lost  0
Performance_schema_socket_instances_lost  0
Performance_schema_stage_classes_lost  0
Performance_schema_statement_classes_lost  0
Performance_schema_table_handles_lost  0
Performance_schema_table_instances_lost  0
Performance_schema_table_lock_stat_lost  0
Performance_schema_thread_classes_lost  0
Performance_schema_thread_instances_lost  0
Performance_schema_users_lost  0
Prepared_stmt_count  0
Qcache_free_blocks  1
Qcache_free_memory  1031832
Qcache_hits  0
Qcache_inserts  0
Qcache_lowmem_prunes  0
Qcache_not_cached  225057348
Qcache_queries_in_cache  0
Qcache_total_blocks  1
Queries  493783842
Questions  493783908
Select_full_join  56
Select_full_range_join  0
Select_range  158
Select_range_check  0
Select_scan  8023
Slave_open_temp_tables  0
Slow_launch_threads  0
Slow_queries  0
Sort_merge_passes  3948
Sort_range  49
Sort_rows  490845
Sort_scan  129
Ssl_accept_renegotiates  0
Ssl_accepts  0
Ssl_callback_cache_hits  0
Ssl_cipher
Ssl_cipher_list
Ssl_client_connects  0
Ssl_connect_renegotiates  0
Ssl_ctx_verify_depth  0
Ssl_ctx_verify_mode  0
Ssl_default_timeout  0
Ssl_finished_accepts  0
Ssl_finished_connects  0
Ssl_server_not_after  Aug 29 23:50:49 2025 GMT
Ssl_server_not_before  Sep  1 23:50:49 2015 GMT
Ssl_session_cache_hits  0
Ssl_session_cache_misses  0
Ssl_session_cache_mode  Unknown
Ssl_session_cache_overflows  0
Ssl_session_cache_size  0
Ssl_session_cache_timeouts  0
Ssl_sessions_reused  0
Ssl_used_session_cache_entries  0
Ssl_verify_depth  0
Ssl_verify_mode  0
Ssl_version
Table_locks_immediate  6801
Table_locks_waited  0
Table_open_cache_hits  489970063
Table_open_cache_misses  26470
Table_open_cache_overflows  24463
Tc_log_max_pages_used  0
Tc_log_page_size  0
Tc_log_page_waits  0
Threads_cached  423
Threads_connected  370
Threads_created  795
Threads_running  310
Uptime  57104
Uptime_since_flush_status  57104

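
A few ratios derived from the counters above make the dump easier to read. This is only a sketch: the values are copied from the status output, the variable names are made up, and reading `Innodb_rows_inserted / Com_insert` as "rows actually inserted per INSERT statement" assumes single-row INSERT statements against InnoDB tables.

```python
# Counters copied from the SHOW GLOBAL STATUS output above.
status = {
    "Uptime": 57104,  # seconds
    "Queries": 493783842,
    "Connections": 3781613,
    "Com_insert": 261004903,
    "Innodb_rows_inserted": 47941507,
    "Innodb_buffer_pool_read_requests": 17132988745,
    "Innodb_buffer_pool_reads": 43009838,
}

uptime = status["Uptime"]
queries_per_second = status["Queries"] / uptime
connections_per_second = status["Connections"] / uptime

# Assumption: single-row INSERTs, so this approximates the share of
# INSERT statements that actually added a row.
rows_per_insert_statement = status["Innodb_rows_inserted"] / status["Com_insert"]

# Share of logical page reads that had to go to disk.
buffer_pool_miss_ratio = (
    status["Innodb_buffer_pool_reads"] / status["Innodb_buffer_pool_read_requests"]
)

print(f"queries/s:            {queries_per_second:,.0f}")
print(f"new connections/s:    {connections_per_second:,.1f}")
print(f"rows per INSERT stmt: {rows_per_insert_statement:.3f}")
print(f"buffer pool miss:     {buffer_pool_miss_ratio:.4%}")
```

With these numbers the server handles roughly 8,600 queries per second, and fewer than one in five INSERT statements adds a row, which would be consistent with many inserts being ignored as duplicates.
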
iostat

Linux 3.16.0-4-amd64 (vultr.guest)      09/06/2015      _x86_64_        (4 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          46.63    0.00    8.83    1.86    0.31   42.37

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
vda            1149.87      4598.30      3737.19  403966815  328316761

Running Debian Jessie and MySQL 5.7, hosted at vultr.com.

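OK, after a night of tuning and watching this wonderful tool, I focused on the websites table, which was being updated via INSERT IGNORE INTO websites host=somehost.com. This is a fairly simple query and it runs very fast. However, in my case there are 500 million domains in the database, and some very common domains (such as w3.org) show up all the time and were constantly being re-sent to the mothership for no reason. My solution was to create a smaller, hosts-only database on each node in which every host that has been seen is stored. If a host has already been seen, it is never sent to the mothership again.

Normally you would keep a simple Perl hash to track this, but since my script uses Parallel::ForkManager for its children, they don't actually talk to each other.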
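
The node-local "seen hosts" filter described above can be sketched as follows. This is an illustration in Python with SQLite, not the author's Perl code: the class name `SeenHosts`, the helper `hosts_to_forward` and the table layout are all assumptions, and the post does not say which database engine runs on each node.

```python
import sqlite3

class SeenHosts:
    """Node-local record of hosts already forwarded to the central database."""

    def __init__(self, path: str) -> None:
        # With forked workers, each child should open its own connection
        # to the same file after the fork.
        self.conn = sqlite3.connect(path, timeout=30)
        self.conn.execute("CREATE TABLE IF NOT EXISTS seen (host TEXT PRIMARY KEY)")
        self.conn.commit()

    def first_sighting(self, host: str) -> bool:
        """Record the host; return True only the first time it is seen."""
        cur = self.conn.execute(
            "INSERT OR IGNORE INTO seen (host) VALUES (?)", (host,)
        )
        self.conn.commit()
        # rowcount is 1 when the row was inserted, 0 when it was ignored.
        return cur.rowcount == 1

def hosts_to_forward(cache: SeenHosts, hosts: list) -> list:
    """Keep only hosts this node has not forwarded before."""
    return [h for h in hosts if cache.first_sighting(h)]

# ":memory:" keeps the demo self-contained; real workers would share a file path.
cache = SeenHosts(":memory:")
first_batch = hosts_to_forward(cache, ["w3.org", "example.com", "w3.org"])
second_batch = hosts_to_forward(cache, ["w3.org", "example.org"])
print(first_batch)
print(second_batch)
```

Because the filter lives in a file rather than in process memory, forked children that cannot share a hash can still consult the same record, and only first sightings reach the central server.
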


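For now my throughput has doubled, and it will get faster once I have filled the small hosts table on each node with the whole domain list. I may do the same for the other tables that use the same INSERT IGNORE query and speed things up further, but for now I am hitting my daily target and will settle for that…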

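In all of this you make no mention of the programming language. So go back to the language, not this question, and bake in profiling.

Using Perl and writing to a remote MySQL via DBI (over an internal gigabit network). I felt it was irrelevant, since the script can visit an unlimited number of websites when it is not saving to MySQL.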