
Instagram scraping with Python and JSON


I know that using a Pool together with time.sleep is pointless, but I just wanted to see how it works :)

I have read about the Instagram API, but I am only scraping Instagram for a personal project.

The problem I am running into: I have a list of shortcodes, and when I run this code over about 10 posts I get the JSON and the location just fine. But if I try to run it over the whole list, I only get blank values. Why does it return empty results?

import requests
import time
import json
from multiprocessing import Pool

with open('posts.json', 'r') as f:
    arr = json.load(f)  # load the post list from the previous step

def get_short_codes(arr):
    # collect the shortcode of every post
    return [item['shortcode'] for item in arr]

def get_location(shortcode):
    link = "https://www.instagram.com/p/{0}/?__a=1".format(shortcode)
    r = requests.get(link)

    location_name = ''
    location_city = ''
    try:
        data = r.json()
        location = data['graphql']['shortcode_media']['location']
        location_name = location['name']  # location attached to the post
        location_city = location['cityname'].split(',')[0]
    except (ValueError, KeyError, TypeError):
        # non-JSON response (rate limit / login page) or a post with no location
        pass

    print(location_name)
    time.sleep(3)
    # return the result instead of appending to a global list: a list
    # mutated inside a worker process is never visible in the parent
    return {'shortcode': shortcode, 'location_name': location_name, 'location_city': location_city}

if __name__ == '__main__':
    pool = Pool(processes=2)
    locations = pool.map(get_location, get_short_codes(arr))  # collect worker results
    with open('locations.json', 'w') as outfile:
        json.dump(locations, outfile)  # save to json