Python: scraping an HTML table with data via XPath


I can't extract the data from an HTML table (the tbody tag). I'd love to be proven wrong here.

Here is my code:

import sys

import lxml.html as LH
import requests
import pandas as pd
from datetime import datetime

start_time = datetime.now()

def text(elt):
    # Flatten an element to its text, replacing non-breaking spaces
    return elt.text_content().replace(u'\xa0', u' ')

url = 'https://www.byma.com.ar/acciones/panel/general'
try:
    r = requests.get(url)
except requests.exceptions.RequestException as e:
    # Timeout and TooManyRedirects are subclasses of RequestException
    print(e)
    sys.exit(1)


root = LH.fromstring(r.content)

for table in root.xpath('//*[@id="dataStocks"]'):
    header = [text(th) for th in table.xpath('//*[@id="dataStocks"]/thead')]                
    data = [[text(td) for td in tr.xpath('//*[@id="dataStocks"]/tbody/tr')]
         for tr in table.xpath('//tr')]                   
    data = [row for row in data if len(row)==len(header)]     
    data = pd.DataFrame(data, columns=header)                
    print(data)

I only get the header columns :S

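The values you want are dynamic data: they are not in the initial page source, but arrive later through an XHR request. You can get them like this: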

import requests
import json

url = "https://www.byma.com.ar/wp-admin/admin-ajax.php?action=get_panel&panel_id=2"
response = requests.get(url)
data = response.json()

for entry in data["Cotizaciones"]:
    print(entry)

The output for each entry looks like this:

{'Apertura': 8.5, 'Cantidad_Nominal_Compra': 17346,
 'Cantidad_Nominal_Venta': 21569, 'Cantidad_Operaciones': '2409',
 'Cierre_Anterior': 8.65,
 'Denominacion': 'GRUPO FINANCIERO VALORES SOCIEDAD ANONIMA',
 'Estado': '', 'Ex': 'No', 'Hora_Cotizacion': '17:05:53',
 'Maximo': 8.54, 'Minimo': 7.95, 'Monto_Operado_Pesos': 89376607,
 'Precio_Compra': 7.95, 'Precio_Promedio': 8.21,
 'Precio_Promedio_Ponderado': 8.1886, 'Precio_Venta': 7.96,
 'Simbolo': 'VALO', 'Tendencia': 0, 'Tipo_Liquidacion': 'Pesos',
 'Ultimo': 7.96, 'Variacion': -7.98, 'Vencimiento': '48hs',
 'Volumen_Nominal': 10896556}

You can also pull each value out of entry individually, for example:

print(entry['Apertura'])

Output:

8.5
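
Since the question was after a table, here is a minimal sketch of turning those entries into rows with a fixed column order. The helper names entries_to_rows and format_table are made up for illustration, the first sample entry reuses values from the output above, and the second one is hypothetical, included only to show what happens when a field is missing.

```python
def entries_to_rows(entries, columns):
    """Return one list per entry, holding the requested columns in order.

    Missing keys become None instead of raising KeyError.
    """
    return [[entry.get(col) for col in columns] for entry in entries]


def format_table(columns, rows):
    """Render the header and rows as left-aligned, fixed-width text."""
    cells = [list(columns)] + [
        ["" if value is None else str(value) for value in row] for row in rows
    ]
    widths = [max(len(line[i]) for line in cells) for i in range(len(columns))]
    return "\n".join(
        "  ".join(cell.ljust(width) for cell, width in zip(line, widths)).rstrip()
        for line in cells
    )


columns = ["Simbolo", "Apertura", "Ultimo", "Variacion"]
sample_entries = [
    # Values copied from the sample output above.
    {"Simbolo": "VALO", "Apertura": 8.5, "Ultimo": 7.96, "Variacion": -7.98},
    # Hypothetical entry with a missing field, to show the None fallback.
    {"Simbolo": "DEMO", "Apertura": 1.0, "Ultimo": 1.1},
]

rows = entries_to_rows(sample_entries, columns)
print(format_table(columns, rows))
```

With the live response you would pass data["Cotizaciones"] instead of sample_entries. If you are already using pandas, pd.DataFrame(data["Cotizaciones"]) should also work directly, because the DataFrame constructor accepts a list of dicts.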

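How do you find this URL? Open the page in your browser, press F12, switch to the Network tab in the developer console, enable only the XHR sub-tab, and you will see the request being sent.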
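
If you want to confirm from a script that the rows really are missing from the static source, a rough check is to count the tr elements inside the table's tbody. The sketch below uses only the standard library. The class and function names are made up for illustration, and SAMPLE_HTML is a hypothetical stand-in for what the server might return, not the actual page markup.

```python
from html.parser import HTMLParser


class TableRowCounter(HTMLParser):
    """Count <tr> elements inside the <tbody> of one table, selected by id."""

    def __init__(self, table_id):
        super().__init__()
        self.table_id = table_id
        self.in_table = False
        self.in_tbody = False
        self.body_rows = 0

    def handle_starttag(self, tag, attrs):
        if tag == "table" and dict(attrs).get("id") == self.table_id:
            self.in_table = True
        elif self.in_table and tag == "tbody":
            self.in_tbody = True
        elif self.in_tbody and tag == "tr":
            self.body_rows += 1

    def handle_endtag(self, tag):
        if tag == "tbody":
            self.in_tbody = False
        elif tag == "table":
            self.in_table = False
            self.in_tbody = False


def count_body_rows(html, table_id):
    """Return how many body rows the given table has in this HTML string."""
    parser = TableRowCounter(table_id)
    parser.feed(html)
    return parser.body_rows


# Hypothetical static markup: header present, body left empty for JavaScript to fill.
SAMPLE_HTML = """
<table id="dataStocks">
  <thead><tr><th>Simbolo</th><th>Ultimo</th></tr></thead>
  <tbody></tbody>
</table>
"""

print(count_body_rows(SAMPLE_HTML, "dataStocks"))
```

Against a real page you would pass r.text from requests.get instead of SAMPLE_HTML. A count of zero suggests the rows are filled in by JavaScript after the page loads, which is the cue to look for the XHR request as described above.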