Python: How do I load-balance PhantomJS with docker-compose and haproxy?
我有一个使用SeleniumWebDriver与PhantomJS接口的应用程序。为了扩大规模,我想运行多个PhantomJS实例,并使用haproxy对它们进行负载平衡。这是针对本地应用程序的,所以我不关心部署到生产环境或类似的环境 这是我的Python 如何使用docker compose和haproxy实现phantomjs的负载平衡?,python,selenium,phantomjs,docker-compose,haproxy,Python,Selenium,Phantomjs,Docker Compose,Haproxy,我有一个使用SeleniumWebDriver与PhantomJS接口的应用程序。为了扩大规模,我想运行多个PhantomJS实例,并使用haproxy对它们进行负载平衡。这是针对本地应用程序的,所以我不关心部署到生产环境或类似的环境 这是我的docker compose.yml文件: version: '2' services: app: build: . volumes: - .:/code links: - mongo - h
docker compose.yml
文件:
version: '2'
services:
  app:
    build: .
    volumes:
      - .:/code
    links:
      - mongo
      - haproxy
  mongo:
    image: mongo
  phantomjs1:
    image: wernight/phantomjs:latest
    ports:
      - 8910
    entrypoint:
      - phantomjs
      - --webdriver=8910
      - --ignore-ssl-errors=true
      - --load-images=false
  phantomjs2:
    image: wernight/phantomjs:latest
    ports:
      - 8910
    entrypoint:
      - phantomjs
      - --webdriver=8910
      - --ignore-ssl-errors=true
      - --load-images=false
  phantomjs3:
    image: wernight/phantomjs:latest
    ports:
      - 8910
    entrypoint:
      - phantomjs
      - --webdriver=8910
      - --ignore-ssl-errors=true
      - --load-images=false
  phantomjs4:
    image: wernight/phantomjs:latest
    ports:
      - 8910
    entrypoint:
      - phantomjs
      - --webdriver=8910
      - --ignore-ssl-errors=true
      - --load-images=false
  haproxy:
    image: haproxy
    volumes:
      - ./haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg:ro
    ports:
      - 8910:8910
    links:
      - phantomjs1
      - phantomjs2
      - phantomjs3
      - phantomjs4
As you can see, I have four phantomjs instances, one haproxy instance, and one application (written in Python).

Here is my haproxy.cfg:
global
    log 127.0.0.1 local0
    log 127.0.0.1 local1 notice
    maxconn 4096
    daemon

defaults
    log global
    mode http
    option httplog
    option dontlognull
    retries 3
    option redispatch
    maxconn 2000
    timeout connect 5000
    timeout client 50000
    timeout server 50000

frontend phantomjs_front
    bind *:8910
    stats uri /haproxy?stats
    default_backend phantomjs_back

backend phantomjs_back
    balance roundrobin
    server phantomjs1 phantomjs1:8910 check
    server phantomjs2 phantomjs2:8910 check
    server phantomjs3 phantomjs3:8910 check
    server phantomjs4 phantomjs4:8910 check
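I know I need sticky sessions or something similar in haproxy to make this work, but I don't know how to set that up.

Here is the relevant snippet of the Python application code that connects to this service: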
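from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities


def get_page(url):
    # Every WebDriver request for this session goes through the haproxy service.
    driver = webdriver.Remote(
        command_executor='http://haproxy:8910',
        desired_capabilities=DesiredCapabilities.PHANTOMJS
    )
    driver.get(url)
    source = driver.page_source
    driver.close()
    return source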
Running this code produces the following error:
phantomjs2_1 | [ERROR - 2016-07-12T23:35:25.454Z] RouterReqHand - _handle.error - {"name":"Variable Resource Not Found","message":"{\"headers\":{\"Accept\":\"application/json\",\"Accept-Encoding\":\"identity\",\"Connection\":\"close\",\"Content-Length\":\"96\",\"Content-Type\":\"application/json;charset=UTF-8\",\"Host\":\"172.19.0.7:8910\",\"User-Agent\":\"Python-urllib/3.5\"},\"httpVersion\":\"1.1\",\"method\":\"POST\",\"post\":\"{\\\"url\\\": \\\"\\\\\\\"http://www.REDACTED.com\\\\\\\"\\\", \\\"sessionId\\\": \\\"4eff6a60-4889-11e6-b4ad-095b9e1284ce\\\"}\",\"url\":\"/session/4eff6a60-4889-11e6-b4ad-095b9e1284ce/url\",\"urlParsed\":{\"anchor\":\"\",\"query\":\"\",\"file\":\"url\",\"directory\":\"/session/4eff6a60-4889-11e6-b4ad-095b9e1284ce/\",\"path\":\"/session/4eff6a60-4889-11e6-b4ad-095b9e1284ce/url\",\"relative\":\"/session/4eff6a60-4889-11e6-b4ad-095b9e1284ce/url\",\"port\":\"\",\"host\":\"\",\"password\":\"\",\"user\":\"\",\"userInfo\":\"\",\"authority\":\"\",\"protocol\":\"\",\"source\":\"/session/4eff6a60-4889-11e6-b4ad-095b9e1284ce/url\",\"queryKey\":{},\"chunks\":[\"session\",\"4eff6a60-4889-11e6-b4ad-095b9e1284ce\",\"url\"]}}","line":80,"sourceURL":"phantomjs://code/router_request_handler.js","stack":"_handle@phantomjs://code/router_request_handler.js:80:82"}
phantomjs2_1 |
phantomjs2_1 | phantomjs://platform/console++.js:263 in error
app_1 | Traceback (most recent call last):
app_1 | File "selenium_process.py", line 69, in <module>
app_1 | main()
app_1 | File "selenium_process.py", line 61, in main
app_1 | source = get_page(args.url)
app_1 | File "selenium_process.py", line 52, in get_page
app_1 | driver.get(url)
app_1 | File "/usr/local/lib/python3.5/site-packages/selenium/webdriver/remote/webdriver.py", line 248, in get
app_1 | self.execute(Command.GET, {'url': url})
app_1 | File "/usr/local/lib/python3.5/site-packages/selenium/webdriver/remote/webdriver.py", line 236, in execute
app_1 | self.error_handler.check_response(response)
app_1 | File "/usr/local/lib/python3.5/site-packages/selenium/webdriver/remote/errorhandler.py", line 163, in check_response
app_1 | raise exception_class(value)
app_1 | selenium.common.exceptions.WebDriverException: Message: Variable Resource Not Found - {"headers":{"Accept":"application/json","Accept-Encoding":"identity","Connection":"close","Content-Length":"96","Content-Type":"application/json;charset=UTF-8","Host":"172.19.0.7:8910","User-Agent":"Python-urllib/3.5"},"httpVersion":"1.1","method":"POST","post":"{\"url\": \"\\\"http://www.REDACTED.com\\\"\", \"sessionId\": \"4eff6a60-4889-11e6-b4ad-095b9e1284ce\"}","url":"/session/4eff6a60-4889-11e6-b4ad-095b9e1284ce/url","urlParsed":{"anchor":"","query":"","file":"url","directory":"/session/4eff6a60-4889-11e6-b4ad-095b9e1284ce/","path":"/session/4eff6a60-4889-11e6-b4ad-095b9e1284ce/url","relative":"/session/4eff6a60-4889-11e6-b4ad-095b9e1284ce/url","port":"","host":"","password":"","user":"","userInfo":"","authority":"","protocol":"","source":"/session/4eff6a60-4889-11e6-b4ad-095b9e1284ce/url","queryKey":{},"chunks":["session","4eff6a60-4889-11e6-b4ad-095b9e1284ce","url"]}}
app_1 |
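Then, as the session proceeds, the session id is sent to the server as part of the URI in subsequent requests, e.g. GET /session/5a27f2b0-48a5-11e6-97d7-7f5820fc7aa6/source. How can I pick this up in haproxy and use it for sticky sessions?

You should be able to add cookies in the haproxy configuration itself: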
    cookie SERVERID insert indirect nocache
    server httpd1 10.0.0.19:9443 cookie httpd1 check
    server httpd2 10.0.0.18:9443 cookie httpd2 check
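The session will then be handled by haproxy itself.

FWIW, tying the session to the requesting IP address is no use here, because I have only one instance of my application making multiple connections to the different phantomjs servers.

Did you find a solution?

Unfortunately, I did not.

Unfortunately, the client does not respect cookies in this case. It is not a standard browser but the selenium webdriver library in Python. Since the RPC API does not use cookies, the library has no reason to respect them.

I just filed a ticket against the selenium codebase to support cookies in their RPC code. That would solve this problem easily.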
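To make the routing problem from the question concrete, here is a minimal Python sketch of what URI-based stickiness has to do. It is a toy model, not haproxy configuration: the `StickyRouter` class and its `route` and `remember` methods are hypothetical names introduced for illustration. The point it tries to show is that the session id is generated by whichever PhantomJS instance handles the initial `POST /session`, so the balancer has to learn the id from that response before it can pin later `/session/<id>/...` requests to the same instance.

```python
import itertools
import re

# Matches WebDriver paths such as /session/<id>/url and captures <id>.
SESSION_PATH = re.compile(r"^/session/([^/]+)")


class StickyRouter:
    """Toy model of URI-based stickiness (hypothetical, not a haproxy API)."""

    def __init__(self, servers):
        # New sessions are spread round-robin, as in the haproxy.cfg above.
        self._next_server = itertools.cycle(servers)
        # Maps a session id to the server that created it.
        self._owner = {}

    def route(self, path):
        """Return the server that should receive a request for `path`."""
        match = SESSION_PATH.match(path)
        if match and match.group(1) in self._owner:
            return self._owner[match.group(1)]
        return next(self._next_server)

    def remember(self, session_id, server):
        """Record which server answered the session-creation request."""
        self._owner[session_id] = server


router = StickyRouter(["phantomjs1:8910", "phantomjs2:8910"])
first = router.route("/session")  # session creation: round-robin pick
router.remember("5a27f2b0-48a5-11e6-97d7-7f5820fc7aa6", first)
again = router.route("/session/5a27f2b0-48a5-11e6-97d7-7f5820fc7aa6/source")
```

Without the `remember` step, the second request would simply go to the next server in the rotation, which has never heard of that session id. That matches the "Variable Resource Not Found" error shown in the question.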
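Since the thread ends without a working haproxy setup, one possible workaround is to do the balancing in the client instead: pick a PhantomJS endpoint once per session and send every request for that session to it. The sketch below is only an illustration under assumptions. It assumes the app container can resolve the `phantomjs1` to `phantomjs4` service names directly, and `next_executor_url` and the `make_driver` factory parameter are hypothetical names, not part of selenium.

```python
import itertools

# Service names taken from the docker-compose.yml in the question
# (assumed to be resolvable from the app container).
PHANTOMJS_URLS = [
    "http://phantomjs1:8910",
    "http://phantomjs2:8910",
    "http://phantomjs3:8910",
    "http://phantomjs4:8910",
]

_url_cycle = itertools.cycle(PHANTOMJS_URLS)


def next_executor_url():
    """Return the next PhantomJS endpoint in round-robin order."""
    return next(_url_cycle)


def get_page(url, make_driver):
    """Fetch `url` using a single PhantomJS instance for the whole session.

    `make_driver` is a hypothetical factory that takes a command_executor
    URL and returns a driver, for example a function wrapping
    webdriver.Remote(command_executor=..., desired_capabilities=...).
    """
    driver = make_driver(next_executor_url())
    try:
        driver.get(url)
        return driver.page_source
    finally:
        # End the session even if the page load raised an exception.
        driver.quit()
```

Because each driver object keeps the executor URL it was created with, all requests of one session reach the same instance, and no cookie or sticky-session support is needed in between. The trade-off is that haproxy's health checks are no longer in the path.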