Python: Problem scraping Bing Translator

Tags: python, json, python-3.x, web-scraping

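I've run into a problem and, after a few days of research, I've come here to ask for help. I have some code that, legend has it, used to work. It fetches the output of Bing Translator. Now nobody can get it to work. I had never touched web scraping or Python before, and although I spent 5 days studying web scraping with Python, I still can't pin down the problem.

Well, there have been some updates to the website. I tried updating the code according to the top post, but it still doesn't work and returns the same error.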

import re
import json
import os
import argparse
from argparse import RawTextHelpFormatter
import getpass
import sys
import requests
from bs4 import UnicodeDammit


def bing_translator(text, from_, to_, proxy, user, sp):
    url = 'https://www.bing.com/ttranslate?&category=&IG=27062B7B735B438C891A574ACC3E57DE&IID=translator.5034.1'

    if type(user) != type(None):
        user_auth = user
        if type(sp) != type(None):
            user_auth = user_auth + ":" + sp
    #user_auth = "fg824gk:Abc.123456"

    if type(proxy) != type(None):
        if type(user) != type(None):
            proxy_Dict = {"http":"http://"+user_auth+"@"+proxy+":8080","https":"http://"+user_auth+"@"+proxy+":8443"}
        else:
            proxy_Dict = {"http":"http://"+proxy+":8080","https":"http://"+proxy+":8443"}

    post_header = {}

    post_header['Host'] = 'www.bing.com'
    post_header['User-Agent'] = 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0'
    post_header['Accept'] = '*/*'
    post_header['Accept-Language'] = 'en-US,en;q=0.5'
    post_header['Accept-Encoding'] = 'gzip, deflate'
    post_header['Referer'] = 'https://www.bing.com/'
    post_header['Content-Type'] = 'application/x-www-form-urlencoded'
    post_header['Connection'] = 'keep-alive'

    data_payload = {'text' : text, 'from' : from_, 'to' : to_}

    parameters_payload = {'IG' : '839D27F8277F4AA3B0EDB83C255D0D70', 'IID' : 'translator.5033.3'}

    if type(proxy) != type(None):
        page = requests.post(url, headers = post_header, data=data_payload, proxies = proxy_Dict) #, params = payload_paramters)
    else:
        page = requests.post(url, headers = post_header, data=data_payload)

    try:
        j = json.loads(page.content[page.content.find('{'.encode("utf8")):page.content.find('}'.encode("utf8"))+1])
    except ValueError:
        print ('\n\n')
        print (data_payload)
        print ('\n\n')
        print ('\n\n')
        print (len(page.content))
        print (page.content)
        for i in page.content:
            print (ord(i))
        print ('\n\n')
        print ('\n\n')
        sys.exit(1)

    return j
When I run bing_translator("Platypus", "en", "es", None, None, None), the expected output would be "Ornitorrinco", but I get this error:


{'text': 'Platypus', 'from': 'en', 'to': 'es'}






88955

Traceback (most recent call last):
  File "C:/Users/NikolasP/Desktop/adsdsadsaadsadsa.py", line 48, in bing_translator
    j = json.loads(page.content[page.content.find('{'.encode("utf8")):page.content.find('}'.encode("utf8"))+1])
  File "C:\Users\NikolasP\AppData\Local\Programs\Python\Python37-32\lib\json\__init__.py", line 348, in loads
    return _default_decoder.decode(s)
  File "C:\Users\NikolasP\AppData\Local\Programs\Python\Python37-32\lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Users\NikolasP\AppData\Local\Programs\Python\Python37-32\lib\json\decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<pyshell#27>", line 1, in <module>
    bing_translator("Platypus","en","es",None,None,None)
  File "C:/Users/NikolasP/Desktop/adsdsadsaadsadsa.py", line 57, in bing_translator
    print (ord(i))
TypeError: ord() expected string of length 1, but int found


You need to change the URL. Check the request that is actually posted in the browser. Just add v3 after ttranslate.

Output

[{'detectedLanguage': {'language': 'en', 'score': 1.0}, 'translations': [{'text': 'Ornitorrinco', 'to': 'es'}]}]
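
The answer's own code is not shown here, so the following is only a minimal sketch of the change it describes. It assumes the new endpoint is simply the question's URL with "v3" inserted after "ttranslate", and that the response body is a JSON array shaped like the output above. The helper names add_v3 and first_translation are hypothetical, and whether the endpoint still responds this way is not something the sketch can confirm.

```python
import json

# URL taken from the question; the query string is left untouched.
OLD_URL = ('https://www.bing.com/ttranslate?&category='
           '&IG=27062B7B735B438C891A574ACC3E57DE&IID=translator.5034.1')


def add_v3(url):
    """Insert 'v3' right after the first 'ttranslate' in the URL (assumed fix)."""
    return url.replace('ttranslate', 'ttranslatev3', 1)


def first_translation(raw):
    """Return the first translated string from a v3-style JSON body."""
    payload = json.loads(raw)  # accepts str or bytes
    return payload[0]['translations'][0]['text']


# Sample body mirroring the output shown above, written as JSON text.
sample = (b'[{"detectedLanguage": {"language": "en", "score": 1.0}, '
          b'"translations": [{"text": "Ornitorrinco", "to": "es"}]}]')

print(add_v3(OLD_URL))
print(first_translation(sample))
```

With a response of this shape the whole body can be handed to json.loads, so the manual slice between the first '{' and the first '}' from the question is no longer needed.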

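The error is json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1), and the code has data_payload = {'text' : text, 'from' : from_, 'to' : to_}. Have you tried changing it to data_payload = {"text" : text, "from" : from_, "to" : to_}? Also, I don't think this is really an MCVE. You should be able to isolate the error further. For example, you tagged this web-scraping, but the error does not seem to be caused by the scraping itself; it is caused by what you do with the scraped data.

To debug, first print the raw response you get, before doing anything with it, so that you and we know what you are dealing with.

Bing offers a free tier for their translation API.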
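
The debugging advice in the comments can be sketched as follows. This is an illustration rather than the commenters' code: describe_body and parse_json_or_report are hypothetical helpers, and the two sample bodies are made up. If the server returns an HTML page instead of JSON, slicing from the first '{' to the first '}' as the question does might pick up something like a CSS rule, which would explain an error of the "Expecting property name" kind. The last line also shows why the question's ord(i) call fails: iterating over bytes in Python 3 already yields integers.

```python
import json


def describe_body(body, limit=200):
    """Summarize a raw response body (bytes): its length and first characters."""
    head = body[:limit].decode('utf-8', errors='replace')
    return 'length=%d, starts with: %r' % (len(body), head)


def parse_json_or_report(body):
    """Parse the whole body as JSON; on failure, print what was received."""
    try:
        return json.loads(body)
    except ValueError as exc:  # JSONDecodeError is a subclass of ValueError
        print('Not JSON (%s)' % exc)
        print(describe_body(body))
        return None


# Hypothetical bodies: an HTML page and a JSON array.
html_body = b'<!DOCTYPE html><html><head><style>body {margin: 0}</style></head></html>'
json_body = b'[{"translations": [{"text": "Ornitorrinco", "to": "es"}]}]'

print(parse_json_or_report(html_body))
print(parse_json_or_report(json_body))

# Iterating over bytes yields ints in Python 3, so ord() is not needed.
print(list(b'{}'))
```

In the question's function, page.content would be passed to such a helper in place of the sliced json.loads call.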