Python: How to fix "TypeError: 'int' object is not iterable" error in concurrent threads? [python, python-3.x, multithreading, asynchronous, web-scraping]



My goal is to scrape some links and to use threads so it finishes faster.

When I try to create the threads, it raises

TypeError: 'int' object is not iterable

Here is my script:

import requests
import pandas
import json
import concurrent.futures
from collections import Iterable

# our profiles that we will scrape
profile = ['kaid_329989584305166460858587','kaid_896965538702696832878421','kaid_1016087245179855929335360','kaid_107978685698667673890057','kaid_797178279095652336786972','kaid_1071597544417993409487377','kaid_635504323514339937071278','kaid_415838303653268882671828','kaid_176050803424226087137783']

# lists of the data that we are going to fill up with each profile
total_project_votes=[]

def scraper(kaid):
    data = requests.get('https://www.khanacademy.org/api/internal/user/scratchpads?casing=camel&kaid={}&sort=1&page=0&limit=40000&subject=all&lang=en&_=190425-1456-9243a2c09af3_1556290764747'.format(kaid))
    sum_votes=[]
    try:
        data=data.json()
        for item in data['scratchpads']:
            try :
                sum_votes=item['sumVotesIncremented']
            except KeyError:
                pass
        sum_votes=map(int,sum_votes) # change all items of the list in integers
        print(isinstance(sum_votes, Iterable)) #to check if it is an iterable element
        print(isinstance(sum_votes, int)) # to check if it is a int element
        sum_votes=list(sum_votes) # transform into a list
        sum_votes=map(abs,sum_votes) # change all items in absolute value
        sum_votes=list(sum_votes) # transform into a list
        sum_votes=sum(sum_votes) # sum all items in the list
        sum_votes=str(sum_votes) # transform into a string
        total_project_votes=sum_votes
    except json.decoder.JSONDecodeError:
        total_project_votes='NA'
    return total_project_votes

with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
    future_kaid = {executor.submit(scraper, kaid): kaid for kaid in profile}
    for future in concurrent.futures.as_completed(future_kaid):
        kaid = future_kaid[future]
        results = future.result()
        # print(results) why printing only one of them and then stops?
        total_project_votes.append(results[0])

# write into a dataframe and print it:
d = {'total_project_votes':total_project_votes}
dataframe = pandas.DataFrame(data=d)
print(dataframe)
I expect to get the following output:

total_project_votes
0                   0
1                2353
2                  41
3                   0
4                   0
5                  12
6                5529
7                  NA
8                   2
But instead I get this error:

TypeError: 'int' object is not iterable
I don't really understand what this error means. What is wrong with my script, and how can I fix it?

When I look at the traceback, the problem seems to come from this line:
sum_votes=map(int,sum_votes)
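A minimal reproduction of the error, assuming the loop leaves sum_votes holding a single int (which is what the assignment inside the loop does when the last item has the key): map() requires an iterable second argument, so handing it an int raises exactly this TypeError.

```python
# map() needs an iterable second argument; after the loop in the question,
# sum_votes can end up holding a single int instead of a list.
sum_votes = 42  # what sum_votes = item['sumVotesIncremented'] leaves behind
try:
    list(map(int, sum_votes))
except TypeError as e:
    print(e)  # prints: 'int' object is not iterable

# With a list, the very same call works as intended:
sum_votes = [3, -5, 7]
print(list(map(int, sum_votes)))  # prints: [3, -5, 7]
```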

Here is some additional information.

Traceback:

Traceback (most recent call last):
  File "toz.py", line 91, in <module>
    results = future.result()
  File "C:\Users\*\AppData\Local\Programs\Python\Python37-32\lib\concurrent\futures\_base.py", line 425, in result
    return self.__get_result()
  File "C:\Users\*\AppData\Local\Programs\Python\Python37-32\lib\concurrent\futures\_base.py", line 384, in __get_result
    raise self._exception
  File "C:\Users\*\AppData\Local\Programs\Python\Python37-32\lib\concurrent\futures\thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "my_scrap.py", line 71, in scraper
    sum_votes=map(int,sum_votes) # change all items of the list in integers
TypeError: 'int' object is not iterable
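A side note on why the traceback runs through concurrent\futures\_base.py before reaching scraper(): an exception raised inside a worker function is stored on its Future and only re-raised when .result() is called. A small sketch, with a hypothetical boom() worker standing in for scraper():

```python
import concurrent.futures

def boom(n):
    # raises TypeError when n is a plain int, like the bug in the question
    return list(map(int, n))

with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
    future = executor.submit(boom, 42)
    try:
        future.result()  # the worker's exception is re-raised here
    except TypeError as e:
        print(e)  # prints: 'int' object is not iterable
```

This is why the error surfaces at the `results = future.result()` line in the main thread even though it originates inside the worker.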
I found my mistake:

I should have written:
sum_votes.append(item['sumVotesIncremented'])
instead of:
sum_votes=item['sumVotesIncremented']
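A short sketch of the corrected accumulation, using a made-up payload in place of the real scratchpads API response (the keys mirror the ones used in the question):

```python
# Sample payload standing in for the real API response.
data = {'scratchpads': [
    {'sumVotesIncremented': 3},
    {'sumVotesIncremented': -5},
    {},  # an item without the key, to exercise the KeyError branch
]}

sum_votes = []
for item in data['scratchpads']:
    try:
        sum_votes.append(item['sumVotesIncremented'])  # append, don't overwrite
    except KeyError:
        pass

# Equivalent to the map/abs/sum chain in the question
total = sum(abs(v) for v in sum_votes)
print(total)  # prints: 8
```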

Also, because we only return one item here, total_project_votes, our return value results holds only one thing. This can cause problems: when we do results[0], it does not behave like a list. It does not give back the whole total_project_votes, but only the first character of the string (for example, 'Hello' becomes 'H'). And if total_project_votes were an int object instead of a string, that would raise yet another error. To solve this, I need to return another object along with it so that results is a tuple; then results[0] actually behaves like indexing a list.
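The difference can be seen with plain indexing; scraper_str and scraper_tuple below are hypothetical stand-ins for the two return styles:

```python
# Indexing a bare string returns a single character,
# while indexing a tuple returns a whole element.
def scraper_str():
    return "2353"            # returning just the string

def scraper_tuple(kaid):
    return kaid, "2353"      # returning a tuple keeps the string whole

print(scraper_str()[0])            # prints: 2  (first character only)
print(scraper_tuple("kaid_x")[1])  # prints: 2353
```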

Comments:

If you look at the whole error message, you will find exactly which line it happens on. You may want to look at the values assigned to sum_votes; each code sample you provided is different.

Hmm... do you know what str(sum(list(map(abs, list(map(int, sum_votes)))))) does? Try splitting it into multiple statements and check what happens at each step, and see at which step the value is not what it should be.

@zvone edit: I found that when I run the script, sometimes the iterable check prints True, but sometimes it does not. Any ideas?

When you get an int, check what you actually have in the HTML and open that page in a browser. Some pages may have a different structure, so you would have to use different code to get the information, or the server may be responding with a warning that it does not like bots and scripts.
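The commenter's suggestion of splitting the one-liner can be sketched like this, with hypothetical raw values standing in for the API data, so each intermediate value can be inspected:

```python
# Hypothetical raw values standing in for the API data
sum_votes = ['3', '-5', '7']

as_ints = list(map(int, sum_votes))   # [3, -5, 7]
print(type(as_ints).__name__, as_ints)

as_abs = list(map(abs, as_ints))      # [3, 5, 7]
print(type(as_abs).__name__, as_abs)

total = sum(as_abs)                   # 15
result = str(total)
print(result)  # prints: 15
```

If any intermediate step prints an int where a list was expected, that is the step where the data coming in did not have the shape the code assumed.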