Python: Can't get the correct token for nested tables


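I am trying to fetch the contents of some tables as JSON. The problem is that I can't seem to get the right table. There are two JSON responses: one I can get from the page, but it only contains the non-nested information. The problem seems to be the nested one.

Screenshot of the tables I need:

I need the contents of all of these tables as JSON, but in this case I can't seem to get the right token. The requests always return the login page, as if the session had expired.

Here is the code I am using to scrape the tables: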

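    import datetime
    import json

    import requests

    # `session` is assumed to be a logged-in requests.Session and `headers`
    # a dict of request headers, both created earlier (not shown here).

    # POST request for the last 100 entries; the response body is JSON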
    url = "https://Awebsite.com/virtualaccount/entries"

    querystring = {"userId":"userid","moffset":"0"}

    payload = "sEcho=1&iColumns=4&sColumns=DateCtz%2CReason%2CDescription%2CFormattedAmount&iDisplayStart=0&iDisplayLength=100&mDataProp_0=DateCtz&mDataProp_1=Reason&mDataProp_2=Description&mDataProp_3=FormattedAmount&sSearch=&bRegex=false&sSearch_0=&bRegex_0=false&bSearchable_0=true&sSearch_1=&bRegex_1=false&bSearchable_1=false&sSearch_2=&bRegex_2=false&bSearchable_2=false&sSearch_3=&bRegex_3=false&bSearchable_3=false&sSortCol%5B0%5D=DateCtz&bSortDir%5B0%5D=false&iSortingCols=1&bSortable_0=true&bSortable_1=false&bSortable_2=false&bSortable_3=false"


    pgtos = session.post(url, params=querystring, data=payload, headers=headers)

    # parse the response text into Python objects
    json_data = pgtos.text
    json1_data = json.loads(json_data)

    # cookies held by the session vs. cookies set by this one response
    cooki = session.cookies.get_dict()
    coooookie = pgtos.cookies.get_dict()

    print('cookies')
    print(cooki)
    print(coooookie)


    # This part is generated by Postman. The problem is that I can't seem to get the
    # __RequestVerificationToken in the payload right. If I use the one from Postman,
    # it works for a while before it expires.

    url = "https://Awebsite.com/virtualaccount/transactions"

    querystring = {"entryId":"<each one of the tables has a different entryId>"}

    payload = "__RequestVerificationToken=QMA2UREXdlRfwIagBWIjekZG4D1ykXrFXxtWnzWV3kc55529C26MyKbL4pHNbaiTjBBAvrrbIsZEroUBJPfc0zWam4nig9oOZxQOKJ2khnZlp2YqOgFgNAj8bYxMIiDGtc9sYBIZS6M_1o6jRAl8gQ2"

    # The Postman headers. If I use the headers Postman gives me it works fine too,
    # but I am trying to automate this, and when I use the cookies I get from the
    # site instead, it simply won't work.
    headers = {
        'User-Agent': "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:61.0) Gecko/20100101 Firefox/61.0",
        'Accept': "*/*",
        'Referer': "https://Awebsite.com/virtualaccount/view",
        'Content-Type': "application/x-www-form-urlencoded; charset=UTF-8",
        'X-Requested-With': "XMLHttpRequest",
        'Cookie': "_ga=GA1.0.000000000.0000000000; _gid=GA1.0.000000000.0000000000; __RequestVerificationToken="+cooki['__RequestVerificationToken']+"; e5ps_sid="+cooki['e5ps_sid']+"; .ASPXAUTH="+cooki['.ASPXAUTH']+"; _gat=1",
        'Connection': "keep-alive",
        'Cache-Control': "no-cache",
        'Postman-Token': "..."
        }

    response = requests.request("POST", url, data=payload, headers=headers, params=querystring)

    print(response.text)  # end of the Postman test; with all the Postman tokens it works for a while



    # Check each entry in the top-level JSON response.
    # current_day and previous_day are assumed to be datetime.date values defined earlier.
    for entry in json1_data['VirtualAccountEntries']:

        # DateCtz starts with day, month and year at fixed positions
        data = entry['DateCtz']
        _date = datetime.date(int(data[6:10]), int(data[3:5]), int(data[0:2]))
        if current_day == _date or previous_day == _date:

            url = "https://Awebsite.com/virtualaccount/transactions"
            querystring = {"entryId": entry['Id']}

            payload = "__RequestVerificationToken=<the requestverification token i can't seem to get right>"

            payment = requests.request("POST", url, data=payload, headers=headers, params=querystring)

            print(payment.text)

            # the output path is elided in the original post
            with open("C:\\..." + ".json", "w") as fp:
                # payment.json() raises if the server returned the login page instead of JSON
                json.dump(payment.json(), fp)
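
The date check in the loop above can be isolated into a small helper, which makes it easier to test without touching the network. This is only a minimal sketch, not the poster's code: `parse_entry_date` and `recent_entries` are hypothetical names, and it assumes `DateCtz` begins with a day, month and year at the same fixed positions the original slices use.

```python
import datetime


def parse_entry_date(text):
    """Build a date from a string laid out as DD?MM?YYYY (any separator)."""
    return datetime.date(int(text[6:10]), int(text[3:5]), int(text[0:2]))


def recent_entries(entries, today=None):
    """Return the entries dated today or yesterday.

    `entries` is assumed to be a list of dicts with a 'DateCtz' key,
    like json1_data['VirtualAccountEntries'] in the question.
    """
    today = today or datetime.date.today()
    wanted = {today, today - datetime.timedelta(days=1)}
    return [e for e in entries if parse_entry_date(e["DateCtz"]) in wanted]
```

With this in place, the loop could iterate over `recent_entries(json1_data['VirtualAccountEntries'])` and use each entry's `Id` directly as the `entryId` parameter.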

I managed to get it working.

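Deep in the page's HTML there is a script that contains the __RequestVerificationToken value. Once I found that, it was just a matter of parsing the script with BeautifulSoup, finding the text that holds the token, and passing that token along with the request so the session doesn't expire.

That's it.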
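
The token lookup described above might be sketched roughly as follows. To keep the example runnable without extra installs, it swaps BeautifulSoup for the standard library's `html.parser`; with BeautifulSoup, the script collection step would be `soup.find_all("script")` instead. The names `ScriptCollector`, `extract_token` and `TOKEN_RE` are hypothetical, and the regular expression assumes the script embeds the token as `name="__RequestVerificationToken" ... value="..."`, which may not match the real page.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlencode

TOKEN_NAME = "__RequestVerificationToken"

# Assumed layout inside the script, e.g.
#   var form = '<input name="__RequestVerificationToken" type="hidden" value="abc" />';
TOKEN_RE = re.compile(TOKEN_NAME + r"""[^>]*?value=["']([^"']+)["']""")


class ScriptCollector(HTMLParser):
    """Collects the text content of every <script> element."""

    def __init__(self):
        super().__init__()
        self._in_script = False
        self.scripts = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        if self._in_script:
            self.scripts[-1] += data


def extract_token(html):
    """Return the first token found in any script, or None if there is none."""
    collector = ScriptCollector()
    collector.feed(html)
    for script in collector.scripts:
        match = TOKEN_RE.search(script)
        if match:
            return match.group(1)
    return None


def build_payload(token):
    """URL-encode the token as a form body for the transactions request."""
    return urlencode({TOKEN_NAME: token})
```

A fresh token would be extracted from the page fetched with the logged-in session (for example the `/virtualaccount/view` page used as the Referer in the question, which is a guess), and `build_payload(token)` would replace the hard-coded Postman payload. Sending the request through the same `session` object, rather than `requests.request`, lets the session cookies travel with it automatically.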