Web scraping an interactive chart (javascript, python, html, web-scraping)


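I see there are a few posts about this, but every case is apparently unique. I'm trying to get the data behind the chart on this page:

It's a fairly obscure market index that isn't available through Yahoo, which is where I usually look (specifically web.DataReader in python), and this is one of the few places that has the complete daily prices.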

<script nonce="XL1oARYPz8X2tvqk">
    window.__defaultsOverrides = {
        'mainSeriesProperties.style': 3,
        'mainSeriesProperties.areaStyle.priceSource': 'close',
        'scalesProperties.lineColor': 'rgba( 76, 82, 94, 1)',
        'scalesProperties.showSymbolLabels': false,
        'scalesProperties.textColor': 'rgba( 76, 82, 94, 1)',
        'scalesProperties.seriesLastValueMode': 0,
        'paneProperties.topMargin': 13,
        'paneProperties.legendProperties.showStudyArguments': false,
        'paneProperties.legendProperties.showStudyTitles': false,
        'paneProperties.legendProperties.showStudyValues': false,
        'paneProperties.legendProperties.showSeriesTitle': false,
        'paneProperties.legendProperties.showSeriesOHLC': true,
        'paneProperties.legendProperties.showLegend': false,
    };
</script>

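This is what shows up as the element associated with the chart and, frankly, it's a bit beyond me as far as web development goes, because it's just a script tag (i.e. it isn't merely a child of the chart element, it is the chart element). I tried searching the JS files for the nonce value XL1oARYPz8X2tvqk, but didn't see anything that looked like it would populate the chart.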


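I figured I could find the chart data somewhere on the window object, but I don't see it. Is there a simple way to track this down? I know I could use an interactive scraper, but it seems like it should be simpler than that.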

The data is retrieved over a websocket connection to:

wss://data.tradingview.com/socket.io/websocket?from=symbols%2FNASDAQ-VOLI%2F
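You can get the data by sending commands to this websocket and reading what it sends back. All messages sent and received are visible in the Chrome developer console: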

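The format is a stream of JSON objects (a single response can contain several objects), each preceded by a prefix of the form ~m~<number>~m~, for example ~m~23~m~. The number in the middle varies, so the response has to be split with a regular expression.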
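The framing described above can be sketched roughly as follows. This is a minimal illustration, assuming the number between the ~m~ markers is the character length of the payload that follows (which is consistent with the keep-alive frame ~m~4~m~~h~1, whose payload ~h~1 is four characters long). The helper names encode_frame and decode_frames are made up for this example.

```python
import json
import re

# Matches the ~m~<number>~m~ prefix that precedes every payload
FRAME_PREFIX = re.compile(r"~m~\d+~m~")


def encode_frame(message):
    """Serialize a dict to JSON and prepend the ~m~<length>~m~ prefix.

    Assumption: <length> is the character length of the JSON payload.
    """
    payload = json.dumps(message, separators=(",", ":"))
    return f"~m~{len(payload)}~m~{payload}"


def decode_frames(raw):
    """Split one raw websocket message into its JSON objects.

    Heartbeat payloads (starting with ~h~) are not JSON and are skipped.
    """
    parts = [part for part in FRAME_PREFIX.split(raw) if part]
    return [json.loads(part) for part in parts if not part.startswith("~h~")]


if __name__ == "__main__":
    raw = encode_frame({"m": "set_data_quality", "p": ["low"]}) + "~m~4~m~~h~1"
    print(raw)
    print(decode_frames(raw))
```

Splitting on the prefix is the simplest approach, but it would misbehave if a JSON string ever contained text that looks like a prefix; reading exactly <length> characters after each prefix would be the more robust variant.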

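The screenshot above shows many outgoing messages (the green ones), but we are only interested in those that use the "chart session token", i.e. the commands that control the chart rather than the quotes.

Send the following messages at the start: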

{"m": "set_data_quality", "p": ["low"]},
{"m": "set_auth_token", "p": ["unauthorized_user_token"]},
{"m":"chart_create_session","p":[chartSession,""]},
{"m":"resolve_symbol","p":[chartSession,"symbol_1","={\"symbol\":\"NASDAQ:VOLI\",\"adjustment\":\"splits\",\"session\":\"extended\"}"]},
{"m":"create_series","p":[chartSession,"s1","s1","symbol_1","D",300]},
{"m":"switch_timezone","p":[chartSession,"Etc/UTC"]},
{"m":"resolve_symbol","p":[chartSession,"symbol_2","={\"symbol\":\"NASDAQ:VOLI\",\"adjustment\":\"splits\",\"session\":\"extended\"}"]},
{"m":"modify_series","p":[chartSession,"s1","s2","symbol_2","D,12M"]},
After that, you will receive a response whose "m" value is timescale_update, along with the chart data.
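Since the goal is the data itself rather than a picture, a timescale_update payload can be flattened into rows and written to a CSV file. The sketch below assumes the layout used by the script in this answer: each bar under p[1]["s1"]["s"] has a "v" list whose element 0 is a Unix timestamp and whose element 4 is the charted value (presumably the close, given the priceSource setting in the question). The function names and the sample values are invented for illustration.

```python
import csv
from datetime import datetime, timezone


def extract_bars(payload, series_id="s1"):
    """Return (ISO date, value) rows from a timescale_update payload.

    Assumed layout: payload["p"][1][series_id]["s"] is a list of bars,
    each with a "v" list where v[0] is a Unix timestamp and v[4] is the
    charted value.
    """
    if payload.get("m") != "timescale_update":
        return []
    series = payload["p"][1].get(series_id, {})
    rows = []
    for bar in series.get("s", []):
        values = bar["v"]
        day = datetime.fromtimestamp(values[0], tz=timezone.utc).date()
        rows.append((day.isoformat(), values[4]))
    return rows


def write_csv(rows, path):
    """Write the rows to a CSV file with a header line."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["date", "value"])
        writer.writerows(rows)


if __name__ == "__main__":
    # Made-up payload in the assumed shape, not real market data
    sample = {"m": "timescale_update", "p": ["cs_example", {"s1": {"s": [
        {"i": 0, "v": [1577836800.0, 10.0, 11.0, 9.5, 10.5, 0.0]},
    ]}}]}
    print(extract_bars(sample))
```

Converting the timestamp with an explicit UTC timezone keeps the dates independent of the machine's local timezone, which matches the switch_timezone message that asks for Etc/UTC.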

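The following script opens the websocket connection, sends the initial messages needed to get the chart data, and plots the data to a chart saved as a png: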

import json 
import websockets
import urllib.parse
import asyncio
import re
from datetime import datetime
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

wsParams = {
    "from": "symbols/NASDAQ-VOLI/"
}
websocketUri = f"wss://data.tradingview.com/socket.io/websocket?{urllib.parse.urlencode(wsParams)}"

result = []
chartSession = "cs_Dj1BV8ochLL0"

initMessages = [
    {"m": "set_data_quality", "p": ["low"]},
    {"m": "set_auth_token", "p": ["unauthorized_user_token"]},
    {"m":"chart_create_session","p":[chartSession,""]},
    {"m":"resolve_symbol","p":[chartSession,"symbol_1","={\"symbol\":\"NASDAQ:VOLI\",\"adjustment\":\"splits\",\"session\":\"extended\"}"]},
    {"m":"create_series","p":[chartSession,"s1","s1","symbol_1","D",300]},
    {"m":"switch_timezone","p":[chartSession,"Etc/UTC"]},
    {"m":"resolve_symbol","p":[chartSession,"symbol_2","={\"symbol\":\"NASDAQ:VOLI\",\"adjustment\":\"splits\",\"session\":\"extended\"}"]},
    {"m":"modify_series","p":[chartSession,"s1","s2","symbol_2","D,12M"]},
]

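def strip(text):
    # Heartbeat frames (e.g. ~m~4~m~~h~1) carry no JSON payload
    noDataReg = re.match(r'~m~\d+~m~~h~\d+', text, re.MULTILINE)
    if not noDataReg:
        # Drop the ~m~<number>~m~ prefixes and parse each JSON object
        dataReg = re.split(r'~m~\d+~m~', text)
        return [json.loads(t) for t in dataReg if t]
    return []

def unstrip(message):
    # Prefix the serialized message with its length (assumed meaning of the number)
    payload = json.dumps(message)
    return f"~m~{len(payload)}~m~{payload}"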

async def init(websocket):
    for m in initMessages:
        await websocket.send(unstrip(m))

async def startReceiving(websocket):
    data = await websocket.recv()
    print(strip(data))
    await init(websocket)
    while(True):
        data = await websocket.recv()
        payloads = strip(data)
        for p in payloads:
            if p["m"] == "timescale_update":
                dates = [
                    datetime.fromtimestamp(t["v"][0])
                    for t in p["p"][1]["s1"]["s"]
                ]
                values = [
                    t["v"][4]
                    for t in p["p"][1]["s1"]["s"]
                ]
                plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%d/%m/%Y'))
                plt.gca().xaxis.set_major_locator(mdates.DayLocator(interval=25))
                plt.plot(dates, values)
                plt.gcf().autofmt_xdate()
                plt.ylabel('VOLI Index Chart')
                plt.xlabel('Date')
                plt.savefig("voli.png")
        print(payloads)

async def websocketConnect():
    # The server answers 403 unless the expected Origin header is sent
    async with websockets.connect(websocketUri, origin="https://www.tradingview.com") as websocket:
        print('started websocket')
        await startReceiving(websocket)

asyncio.run(websocketConnect())

And the resulting chart:

Note that:

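  • To connect to the websocket server successfully, you need to send an Origin header with the correct value, otherwise it returns 403

  • The chart session token is hardcoded here, but it can be anything; on the website it appears to be generated randomly (following a regex pattern)

  • I have removed all the websocket messages related to quotes. You need to add such messages to the init messages to receive notifications about "live" value changes. Note that quote_create_session requires a new session token (different from the chart session token). You will then receive the notifications over the websocket

  • If you want to receive notifications, note that there is a keep-alive: the websocket is closed automatically if you don't send anything for some period of time. You just need to send the following command periodically: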

    ~m~4~m~~h~1
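Two of the notes above lend themselves to a short sketch: generating a session token instead of hardcoding it, and sending the keep-alive frame periodically. Both helpers are hypothetical. The token format (a prefix, an underscore and 12 alphanumeric characters) is only modeled on the hardcoded cs_Dj1BV8ochLL0, and the 10 second interval is a guess, since the actual timeout is not known.

```python
import asyncio
import random
import string


def make_session_token(prefix="cs", length=12):
    """Build a random token shaped like the hardcoded cs_Dj1BV8ochLL0."""
    alphabet = string.ascii_letters + string.digits
    suffix = "".join(random.choice(alphabet) for _ in range(length))
    return f"{prefix}_{suffix}"


async def keep_alive(websocket, interval=10.0, frame="~m~4~m~~h~1"):
    """Send the keep-alive frame every `interval` seconds until cancelled.

    `websocket` only needs an async send() method; the interval is an
    assumption, not a documented value.
    """
    while True:
        await asyncio.sleep(interval)
        await websocket.send(frame)
```

Inside the receive loop of the script, the keep-alive could be started with asyncio.create_task(keep_alive(websocket)) right after the init messages are sent, and cancelled once the data has been collected.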
    


I need to do some experimenting, but this