Nginx + PHP-FPM occasionally returns 502

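This question has been asked many times, but none of the answers helped. After several hours of digging, I am asking for help here. I am a developer with limited sysadmin experience, but since our ops person left, I have to try to keep things running.

On one of our websites, we recently started getting random 502 errors. It happens often, at least a dozen times a day (as reported by Nagios and by our users). I am not aware of any configuration changes. The web stack is standard: an nginx server proxies requests to PHP-FPM, which runs a WordPress-based application.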


The nginx error log contains many messages like the following:

[error] 31180#31180: *451395 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: x.x.x.x, server: x.x, request: "GET /x/x/ HTTP/1.0", upstream: "fastcgi://127.0.0.1:9000", host: "x.x.x"

Most of them have the server's own IP as the client IP (not sure why, possibly some monitoring?), but there are also errors from random public IPs.
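
To check how the resets split between the server's own IP and public clients, the error log can be tallied by its "client:" field. This is a minimal sketch, not a tested tool: the function name and the sample list are made up for illustration, and it assumes the log lines look like the one quoted above.

```python
import re
from collections import Counter

# Matches the "client: <address>" field of an nginx error-log line.
CLIENT_RE = re.compile(r"client: ([^,\s]+)")
# The upstream reset message quoted in the question.
RESET_MARKER = "recv() failed (104: Connection reset by peer)"


def count_resets_by_client(lines):
    """Count upstream connection resets per client address."""
    counts = Counter()
    for line in lines:
        if RESET_MARKER not in line:
            continue  # ignore unrelated error-log entries
        match = CLIENT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Illustrative input: the anonymized line from the question.
SAMPLE_LINES = [
    '[error] 31180#31180: *451395 recv() failed (104: Connection reset by peer) '
    'while reading response header from upstream, client: x.x.x.x, server: x.x, '
    'request: "GET /x/x/ HTTP/1.0", upstream: "fastcgi://127.0.0.1:9000", host: "x.x.x"',
]

for client, hits in count_resets_by_client(SAMPLE_LINES).most_common(10):
    print(f"{hits:6d}  {client}")
```

On a real server the same function could be fed an open file handle for the nginx error log instead of the sample list.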

The PHP-FPM log emits warnings like these roughly every hour:

WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 8 children, there are 0 idle, and 71 total children
WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 16 children, there are 0 idle, and 75 total children
WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 79 total children
WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 83 total children
WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 87 total children
WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 91 total children
WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 95 total children
WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 99 total children
WARNING: [pool www] server reached pm.max_children setting (100), consider raising it
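
To see how often the pool saturates, the warnings above can be summarized with a short script. This is only a sketch: the regular expressions are written against the exact wording shown in this log excerpt, and the function name and sample list are hypothetical.

```python
import re

# "seems busy" warning: captures spawned, idle and total child counts.
BUSY_RE = re.compile(
    r"\[pool (?P<pool>[^\]]+)\] seems busy .*?"
    r"spawning (?P<spawning>\d+) children, "
    r"there are (?P<idle>\d+) idle, and (?P<total>\d+) total children"
)
# Warning emitted when the pool hits pm.max_children.
LIMIT_RE = re.compile(
    r"\[pool (?P<pool>[^\]]+)\] server reached pm\.max_children setting \((?P<limit>\d+)\)"
)


def summarize_fpm_warnings(lines):
    """Count busy warnings and pm.max_children hits, and track the peak child count."""
    busy_totals = []
    limit_hits = 0
    for line in lines:
        busy = BUSY_RE.search(line)
        if busy:
            busy_totals.append(int(busy["total"]))
        elif LIMIT_RE.search(line):
            limit_hits += 1
    return {
        "busy_warnings": len(busy_totals),
        "peak_total_children": max(busy_totals, default=0),
        "limit_hits": limit_hits,
    }


# Illustrative input: three of the lines quoted in the question.
SAMPLE_LOG = [
    "WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 8 children, there are 0 idle, and 71 total children",
    "WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 99 total children",
    "WARNING: [pool www] server reached pm.max_children setting (100), consider raising it",
]

print(summarize_fpm_warnings(SAMPLE_LOG))
```

Comparing the timestamps of the limit hits with the timestamps of the nginx 502 entries would show whether the two actually coincide.
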
Things I have tried

Restarting
  • The obvious first step, but it did not help at all

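Increasing resources and PHP-FPM children
  • Adding RAM and CPU did not help. The disk is not full, and the inodes are not exhausted
  • With the added resources, I set pm.max_children to 100. It was originally 40, which had been fine for years of operation. After I saw the logs, I raised it to 75, then to 100
  • Another website with several times as many visitors runs fine on less hardware. This website does not serve anything heavy, mostly blog posts
  • For completeness, the FPM config looks like this: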

    pm.max_children = 100
    pm.start_servers = 24
    pm.min_spare_servers = 4
    pm.max_spare_servers = 64
    pm.max_requests = 500
    
  • The logs also contain no mention of an OOM condition

Investigating opcache
  • I read that opcache running out of memory could be the culprit. Alas, it has memory to spare:

    Cache hits  89757614
    Cache misses    1174
    Used memory 58333696
    Free memory 75884032
    Wasted memory   0
    OOM restarts    0
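
As a sanity check on those numbers, the hit rate and memory usage can be worked out directly. This is a small sketch with a made-up helper name; it assumes the total opcache memory is the sum of the used, free and wasted figures.

```python
def opcache_summary(hits, misses, used_bytes, free_bytes, wasted_bytes=0):
    """Derive hit rate and memory usage from raw opcache counters."""
    lookups = hits + misses
    # Assumption: configured memory = used + free + wasted.
    total_bytes = used_bytes + free_bytes + wasted_bytes
    return {
        "hit_rate": hits / lookups if lookups else 0.0,
        "used_fraction": used_bytes / total_bytes if total_bytes else 0.0,
        "total_mib": total_bytes / (1024 * 1024),
    }


# Counters copied from the status output above.
stats = opcache_summary(
    hits=89757614,
    misses=1174,
    used_bytes=58333696,
    free_bytes=75884032,
    wasted_bytes=0,
)
print(f"hit rate:    {stats['hit_rate']:.4%}")
print(f"memory used: {stats['used_fraction']:.1%} of {stats['total_mib']:.0f} MiB")
```

With under half of the cache in use and no OOM restarts, these counters support the conclusion that opcache memory is not the bottleneck.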
    
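Nginx timeouts
  • The nginx parameters should not be the problem, since the buffer and timeout values seem very large (I assume the unit for 3000 is seconds):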

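Other information
  • PHP-FPM is not crashing; apart from the warnings about children, its logs contain nothing
  • xdebug is disabled
  • syslog and dmesg contain no relevant messages
  • PHP 7.0, nginx 1.12.2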

Is there anything else I can try?


Links to things that did not work

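Can you do some investigation/profiling of PHP to find out why it reports "server reached pm.max_children setting (100), consider raising it"? Perhaps the child processes are taking longer, threads are blocked, etc., so new children are spawned to handle new requests?

I have not looked into that yet. Do you have any suggestions on where to start?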
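
One way to reason about the "children taking longer" hypothesis is Little's law: the average number of busy workers equals the request arrival rate multiplied by the mean request duration. The sketch below is back-of-the-envelope only. The function names are invented, and the request rate and durations are hypothetical figures, not measurements from this site.

```python
import math


def workers_needed(requests_per_second, mean_request_seconds):
    """Little's law: average concurrency = arrival rate x mean time in system."""
    return math.ceil(requests_per_second * mean_request_seconds)


def saturating_duration(max_children, requests_per_second):
    """Mean request duration (seconds) at which the pool is fully occupied."""
    return max_children / requests_per_second


# Hypothetical traffic: 20 PHP requests per second.
rate = 20
print(workers_needed(rate, 0.5))       # fast requests need few workers
print(workers_needed(rate, 6))         # slow requests exceed a pool of 100
print(saturating_duration(100, rate))  # duration that fills 100 children
```

If the real numbers point to slow requests, PHP-FPM's request_slowlog_timeout and slowlog pool directives can record stack traces of requests that exceed a threshold, which is one possible place to start.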