
How to extract data from JSON using BeautifulSoup in Django

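Hello. I've run into a problem while trying to extract values from JSON. First, my BeautifulSoup code works fine in the shell but not in Django. I'm also trying to extract data from the JSON I receive, without success. This is the class in my view: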

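import json
import re

import requests
from bs4 import BeautifulSoup
from django.views import generic


class FetchWeather(generic.TemplateView):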
    template_name = 'forecastApp/pages/weather.html'

    def get_context_data(self, **kwargs):
        context = super().get_context_data(**kwargs)
        url = 'http://weather.news24.com/sa/cape-town'
        city = 'cape town'
        url_request = requests.get(url)
        soup = BeautifulSoup(url_request.content, 'html.parser')
        city_list = soup.find(id="ctl00_WeatherContentHolder_ddlCity")
        print(soup.head)
        city_as_on_website = city_list.find(text=re.compile(city, re.I)).parent
        cityId = city_as_on_website['value']
        json_url = "http://weather.news24.com/ajaxpro/TwentyFour.Weather.Web.Ajax,App_Code.ashx"

        headers = {
            'Content-Type': 'text/plain; charset=UTF-8',
            'Host': 'weather.news24.com',
            'Origin': 'http://weather.news24.com',
            'Referer': url,
            'User-Agent': 'Mozilla/5.0 (X11; Linux i686) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/48.0.2564.82 Chrome/48.0.2564.82 Safari/537.36',
            'X-AjaxPro-Method': 'GetCurrentOne'}

        payload = {
            "cityId": cityId
        }
        request_post = requests.post(json_url, headers=headers, data=json.dumps(payload))
        print(request_post.content)
        context['Observations'] = request_post.content
        return context
In the JSON there is an array "Observations", from which I'm trying to get the city name and the high and low temperatures.

But when I try this:

cityDict = json.loads(str(html))
I get an error. Here is the traceback:

 Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/json/__init__.py", line 319, in loads
    return _default_decoder.decode(s)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 4067 (char 4066)

Any help is much appreciated.

The JSON data in request_post.content has two problems:

  • It contains JS Date object values, for example:

    "Date":new Date(Date.UTC(2016,1,26,22,0,0,0))
    
  • There are unwanted characters at the end:
    ;/*
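
A quick way to confirm where parsing stops is to catch the JSONDecodeError and look at the text around its position. The sketch below is only illustrative: the sample string and the helper name find_json_error are assumptions, not part of the original code, and the real response is much longer.

```python
import json

# Hypothetical, shortened sample that mimics the two problems listed above.
raw = '{"value":{"Date":new Date(Date.UTC(2016,1,26,22,0,0,0))}};/*'


def find_json_error(text, context=20):
    """Return (position, nearby text) for the first JSON syntax error, or None."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        start = max(err.pos - context, 0)
        return err.pos, text[start:err.pos + context]
    return None


position, snippet = find_json_error(raw)
print(position, repr(snippet))
```

Applied to the real response text, the reported position should land on the first `new Date(...)` value, which matches the "Expecting value" message in the traceback.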

Let's clean up the JSON data so that it can be loaded with json:

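import json
import re
from datetime import datetime

data = request_post.text

def convert_date(match):
    # JavaScript's Date.UTC counts months from 0; datetime counts from 1.
    # The last group is milliseconds, which the output format does not use.
    year, month, day, hour, minute, second, _ms = map(int, match.groups())
    stamp = datetime(year, month + 1, day, hour, minute, second)
    return '"' + stamp.strftime("%Y-%m-%dT%H:%M:%S") + '"'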

data = re.sub(r"new Date\(Date\.UTC\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)\)",
              convert_date,
              data)

data = data.strip(";/*")
data = json.loads(data)

context['Observations'] = data
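
The same cleanup can be tried outside Django on a small sample before wiring it into the view. This is a minimal sketch under stated assumptions: the sample string, the helper names, and the keys CityName, HighTemp and LowTemp are hypothetical, since the real response layout is not shown here; only "Observations" and "Date" come from the question. JavaScript's Date.UTC uses zero-based months, so the sketch adds one to the month when building the Python datetime.

```python
import json
import re
from datetime import datetime, timezone

# Matches values such as: new Date(Date.UTC(2016,1,26,22,0,0,0))
DATE_RE = re.compile(
    r"new Date\(Date\.UTC\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)\)"
)


def js_date_to_iso(match):
    """Replace a JS Date.UTC(...) expression with a quoted ISO 8601 string."""
    year, month, day, hour, minute, second, ms = map(int, match.groups())
    # Date.UTC months are zero-based; its last argument is milliseconds.
    stamp = datetime(year, month + 1, day, hour, minute, second,
                     ms * 1000, tzinfo=timezone.utc)
    return '"' + stamp.isoformat() + '"'


def clean_ajaxpro_json(text):
    """Convert JS dates, drop the trailing ';/*' characters, then parse."""
    text = DATE_RE.sub(js_date_to_iso, text)
    return json.loads(text.strip().rstrip(";/*"))


# Hypothetical sample; the key names inside each observation are assumptions.
sample = (
    '{"Observations":[{"CityName":"Cape Town","HighTemp":"27",'
    '"LowTemp":"17","Date":new Date(Date.UTC(2016,1,26,22,0,0,0))}]};/*'
)

data = clean_ajaxpro_json(sample)
first = data["Observations"][0]
print(first["CityName"], first["HighTemp"], first["LowTemp"], first["Date"])
```

Once the real key names are known, the same loop over data["Observations"] can build a list of plain dictionaries for the template context.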

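In the code you showed us, you never define the variable html. Where does it come from?

Hi, thanks for your reply. The line cityDict = json.loads(str(html)) is from when I was trying to access the data; most of this code was written in the shell. If it works there, I can move it into Django. I'm trying to understand what I'm doing wrong. The link to the expected JSON is [link missing].

@SlangI'mmatalk sure, added an import statement.

Thanks. Good to know about datetime: from django.utils.timezone import datetime.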