Get data from an ASP WebForms page (form params) with Ruby and Nokogiri/Mechanize
I am currently trying to scrape data from a web page using Ruby with Nokogiri and Mechanize. I want to get a list of tenders from the following link:

Following this procedure:

Open the URL:
Once it is open, go to the field: Número
The value for the Número field is: 2017-1-37-0-15-cm-011063
Press the first green button: Buscar
Go down to the table below to see the filtered tenders
Here is my code:

```ruby
require 'rubygems'
require 'mechanize'

# Configured agent (note: the requests below go through @m, not a)
a = Mechanize.new do |agent|
  agent.user_agent_alias = 'Mac Safari'
  agent.follow_meta_refresh = true
end

@url = 'http://www.panamacompra.gob.pa/ambientepublico/AP_BusquedaAvanzada.aspx'
@m = Mechanize.new
@payload = ''
@body_page = ''
@search_string = '2017-1-37-0-15-cm-011063'
@viewstate = ''

# Form fields posted with the search; names must match the page's form exactly
def set_payload
  {
    'txtGSA' => '',
    'ctl00$ContentPlaceHolder1$txtNumeroAdquisicion' => '',
    'ctl00$ContentPlaceHolder1$txtNombreAdquisicion' => '',
    'ctl00$ContentPlaceHolder1$txtNombreDemandante' => '',
    'ctl00$ContentPlaceHolder1$txtNombreDependencia' => '',
    'ctl00$ContentPlaceHolder1$txtNombreProveedor' => '',
    'ctl00$ContentPlaceHolder1$txtFechaDesde' => '13-02-2017',
    'ctl00$ContentPlaceHolder1$txtFechaHasta' => '13-03-2017',
    'ctl00$ContentPlaceHolder1$txtNombreRubro' => '',
    'ctl00_ContentPlaceHolder1_ASPxPopupControl1WS' => '0:0:-1:0:0:0:0:;0:0:-1:0:0:0:0:',
    'ctl00$ContentPlaceHolder1$ControlPaginacion$hidTotalPaginas' => '0',
    'ctl00$ContentPlaceHolder1$ControlPaginacion$hidNumeroPagina' => '1',
    'ctl00$ContentPlaceHolder1$ControlPaginacion$hidOrigen' => '0',
    'ctl00$ContentPlaceHolder1$ControlPaginacion$hidTotalFilas' => '1',
    'ctl00$ContentPlaceHolder1$ControlPaginacion$hidInicioAnterior' => '1',
    'ctl00$ContentPlaceHolder1$ControlPaginacion$hidFinAnterior' => '1',
    'ctl00$ContentPlaceHolder1$ControlPaginacion$hidBloqueInicio' => '1',
    'ctl00$ContentPlaceHolder1$ControlPaginacion$hidMaxFilasPorPagina' => '20',
    'ctl00$ContentPlaceHolder1$ControlPaginacion$hidMaxPaginasPorListado' => '9',
    'ctl00$ContentPlaceHolder1$ControlPaginacion$hidCambioBloque' => 'False',
    'ctl00$ContentPlaceHolder1$ControlPaginacion$hidMostrarEstado' => 'False',
    'ctl00$ContentPlaceHolder1$ControlPaginacion$hidMostrarMensaje' => 'True',
    'ctl00$ContentPlaceHolder1$ControlPaginacion$hidValoresPorDefecto' => 'True',
    'ctl00$ContentPlaceHolder1$hidIdDependencia' => '-1',
    'ctl00$ContentPlaceHolder1$hidNombreDependencia' => '-1',
    'ctl00$ContentPlaceHolder1$hidIdOrgV' => '-1',
    'ctl00$ContentPlaceHolder1$hidIdEmpresaVenta' => '-1',
    'ctl00$ContentPlaceHolder1$hidIdEmpresaC' => '0',
    'ctl00$ContentPlaceHolder1$hidIdOrgC' => '-1',
    'ctl00$ContentPlaceHolder1$hidNombreDemandante' => '-1',
    'ctl00$ContentPlaceHolder1$hidDependencia' => '-1',
    'ctl00$ContentPlaceHolder1$hidIDRubro' => '-1',
    'ctl00$ContentPlaceHolder1$hidRedir' => '',
    'ctl00$ContentPlaceHolder1$hidRangoMaximoFecha' => '',
    'ctl00$ContentPlaceHolder1$hidIdProducto' => '-1',
    'ctl00$ContentPlaceHolder1$hidIdProductoNoIngresado' => '-1',
    'ctl00$ContentPlaceHolder1$hidNombreProducto' => '-1',
    'ctl00$ContentPlaceHolder1$hidNombreProveedor' => '-1',
    'ctl00$ContentPlaceHolder1$lstUnidadCompra' => '',
    'ctl00$ContentPlaceHolder1$lstEstado' => '0'
  }
end

# Load the search page, post the payload, then fill in the number and submit
@m.get(@url) do |page|
  page.form_with(:name => 'aspnetForm') do |search_form|
    @viewstate = search_form.field_with(:name => '__VIEWSTATE').value
    @payload = set_payload
    @m.post(@url, @payload).form_with(:name => 'aspnetForm') do |search_form_2|
      search_form_2.field_with(:name => 'ctl00$ContentPlaceHolder1$txtNumeroAdquisicion').value = @search_string
      submit_button = search_form_2.button_with(:id => 'ctl00_ContentPlaceHolder1_btnBuscar')
      finish = search_form_2.submit(submit_button)
      @body_page = finish
    end
    puts Nokogiri::HTML(@body_page.body)
  end
end
```
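One thing that may be worth checking in the code above: `@viewstate` is read from the form but is never added to the payload. ASP.NET WebForms pages generally expect their hidden state fields (`__VIEWSTATE`, and `__EVENTVALIDATION` when event validation is enabled) to be sent back with each postback. The following is a minimal sketch, using only the standard library, of merging such hidden fields into a payload before encoding it. The helper name `with_hidden_fields` and the sample values are hypothetical, and whether this alone is enough for this particular site is not confirmed.

```ruby
require 'uri'

# Hypothetical helper: returns a copy of +payload+ with the hidden
# WebForms state fields added. Nil or empty values are skipped.
def with_hidden_fields(payload, hidden)
  extras = hidden.reject { |_name, value| value.nil? || value.empty? }
  payload.merge(extras)
end

# One field from the question's payload, plus made-up hidden values
payload = { 'ctl00$ContentPlaceHolder1$hidIdOrgC' => '-1' }
hidden  = { '__VIEWSTATE' => 'dDwtMTIz', '__EVENTVALIDATION' => nil }

full_payload = with_hidden_fields(payload, hidden)

# Form-encode the merged fields as they would travel in a POST body
encoded_body = URI.encode_www_form(full_payload)
puts encoded_body
```

With Mechanize, the merged hash could be passed directly, for example `@m.post(@url, full_payload)`, since `post` accepts a hash of form fields.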
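Most likely they are using Ajax to update the page and retrieve the data, in which case Mechanize can't help you. Have you checked?

Yes, I tried it and I got the information, but I want to get the data with these gems and Ruby. Is there any other way to do this? Thanks for the tip.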
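On the Ajax point: if the page refreshes its results through an ASP.NET AJAX `UpdatePanel` (an assumption; nothing in this thread confirms which mechanism the site uses), the server answers a partial postback with a pipe-delimited text body of the form `length|type|id|content|` instead of a full HTML page. Such a request is typically marked with the `X-MicrosoftAjax: Delta=true` header. Below is a rough sketch of splitting a response like that into its parts with plain Ruby. The helper name `parse_delta` and the sample body are hypothetical.

```ruby
# Parses an ASP.NET AJAX partial-postback ("delta") body of the form
#   length|type|id|content|length|type|id|content|...
# and returns an array of hashes. The length is needed because the
# content itself may contain '|' characters.
def parse_delta(body)
  parts = []
  pos = 0
  while pos < body.length
    len_end  = body.index('|', pos)
    type_end = len_end && body.index('|', len_end + 1)
    id_end   = type_end && body.index('|', type_end + 1)
    break unless id_end # malformed or truncated input

    length = Integer(body[pos...len_end], 10)
    parts << {
      type:    body[(len_end + 1)...type_end],
      id:      body[(type_end + 1)...id_end],
      content: body[id_end + 1, length]
    }
    pos = id_end + 1 + length + 1 # skip the content and its trailing '|'
  end
  parts
end

# Made-up sample body with one panel and one hidden field
sample = '5|updatePanel|pnl|hello|3|hiddenField|__VIEWSTATE|abc|'
panels = parse_delta(sample).select { |part| part[:type] == 'updatePanel' }
puts panels.first[:content]
```

The `content` of an `updatePanel` part is an HTML fragment, so it could then be handed to Nokogiri (for example with `Nokogiri::HTML.fragment`) to read the results table.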