Python performance improvement: looping with the GET method

python django performance rest

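I built a program to populate a database, and so far it works. Basically, the program sends requests (through a REST API) to the application I'm using, which returns the data I want, and then processes that data into a form the database accepts.

The problem: the GET method makes the algorithm too slow. Because I'm accessing the details of specific entries, I have to make one request per entry. I have nearly 15,000 requests to make, and each row in the database takes about 1 second.

Is there any way to make these requests faster? How can I improve the performance of this approach? By the way, are there any tips for measuring the performance of code?

Thanks in advance.

The code is below: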

import json
from datetime import datetime, timedelta

import requests
from django.core import serializers

# ModelExample, Model_Example2 and formaters come from my own project.

# Retrieving all the IDs I want to get the detailed info for
abc_ids = serializers.serialize('json', ModelExample.objects.all(), fields=('id',))
abc_ids = json.loads(abc_ids)
abc_ids_size = len(abc_ids)

# Had to declare these right here because at the end of the code I use them in the calls that create and update the database,
# and Python was complaining that they were referenced before assignment. Picked arbitrary values for them.
age = 0
time_to_won = 0
data = '2016-01-01 00:00:00'

# First loop -> request the detailed info of each ABC
for x in range(0, abc_ids_size):

    id = abc_ids[x]['fields']['id']
    info = requests.get(
        'https://api.example.com/v3/abc/' + str(id) + '?api_token=123123123')

    info = info.json()
    dealx = dict(info)

    # Second loop -> pick the info I want to update and create in the database
    for key, result in dealx['data'].items():
        # Relevant only for ModelExample -> UPDATE
        if key == 'age':
            result = dict(result)
            age = result['total_seconds']
        # Relevant only for ModelExample -> UPDATE
        elif key == 'average_time_to_won':
            result = dict(result)
            time_to_won = result['total_seconds']

        # Relevant for Model_Example2 -> CREATE
        # Storing a date here to use further down in a datetime manipulation
        if key == 'add_time':
            data = str(result)

        elif key == 'time_stage':

            # Each stage has a total number of seconds that the user stayed in it.
            y = result['times_in_stages']
            # The user can be in any stage he wants, there's no rule about the order.
            # But there's a record of the order he chose.
            z = result['order_of_stages']

            # Creating a list to fill with all the stage info and use in the bulk_create.
            data_set = []
            index = 0

            # Setting the number of repetitions based on the number of stages in the list.
            for elemento in range(0, len(z)):
                data_set_i = {}
                # The index defines the order of the stages.
                index = index + 1

                for key_1, result_1 in y.items():
                    if int(key_1) == z[elemento]:
                        data_set_i['stage_id'] = int(z[elemento])
                        data_set_i['index'] = int(index)
                        data_set_i['abc_id'] = id

                        # Datetime manipulation
                        if result_1 == 0 and index == 1:
                            data_set_i['add_date'] = data

                        # I know that I totally repeated the code here, I was trying to get this part shorter
                        # but I could not get it right.
                        elif result_1 > 0 and index == 1:
                            data_t = datetime.strptime(data, "%Y-%m-%d %H:%M:%S")
                            data_sum = data_t + timedelta(seconds=result_1)
                            data_sum += timedelta(seconds=3)
                            data_nova = str(data_sum.year) + '-' + str(formaters.DateNine(
                                data_sum.month)) + '-' + str(formaters.DateNine(data_sum.day)) + ' ' + str(
                                data_sum.hour) + ':' + str(formaters.DateNine(data_sum.minute)) + ':' + str(
                                formaters.DateNine(data_sum.second))
                            data_set_i['add_date'] = str(data_nova)

                        else:
                            data_t = datetime.strptime(data_set[elemento - 1]['add_date'], "%Y-%m-%d %H:%M:%S")
                            data_sum = data_t + timedelta(seconds=result_1)
                            data_sum += timedelta(seconds=3)
                            data_nova = str(data_sum.year) + '-' + str(formaters.DateNine(
                                data_sum.month)) + '-' + str(formaters.DateNine(data_sum.day)) + ' ' + str(
                                data_sum.hour) + ':' + str(formaters.DateNine(data_sum.minute)) + ':' + str(
                                formaters.DateNine(data_sum.second))
                            data_set_i['add_date'] = str(data_nova)

                        data_set.append(data_set_i)

    Model_Example2_List = [Model_Example2(**vals) for vals in data_set]
    Model_Example2.objects.bulk_create(Model_Example2_List)

    ModelExample.objects.filter(abc_id=id).update(age=age, time_to_won=time_to_won)

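If the bottleneck is in your network requests, there is not much you can do about it except perhaps use gzip or deflate. With requests, however:

"The gzip and deflate transfer-encodings are automatically decoded for you."

If you want to be doubly sure, you can add the following header to the GET request: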

{ 'Accept-Encoding': 'gzip,deflate'}
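To check whether compression is actually in play, one option is to look at the `Content-Encoding` header of the response. The sketch below is only an illustration: `is_compressed` and `fetch_detail` are hypothetical helper names, the URL and token are whatever the real API expects, and the 10-second timeout is an arbitrary choice.

```python
ACCEPT_ENCODING = {"Accept-Encoding": "gzip,deflate"}


def is_compressed(response_headers):
    """Return True if the server reports a gzip- or deflate-encoded body."""
    encoding = response_headers.get("Content-Encoding", "")
    return encoding.lower() in ("gzip", "deflate")


def fetch_detail(url, api_token):
    """GET one entry; return (parsed JSON, whether the body was compressed)."""
    # Third-party dependency, imported here so is_compressed works without it.
    import requests

    response = requests.get(
        url,
        params={"api_token": api_token},
        headers=ACCEPT_ENCODING,
        timeout=10,  # arbitrary value chosen for this sketch
    )
    response.raise_for_status()
    return response.json(), is_compressed(response.headers)
```

If `is_compressed` already returns True without the extra header, adding it by hand will not change the timing, and the per-request latency is probably dominated by the round trip rather than by the payload size.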

Another option is to use threads and run many requests in parallel, which is a good choice if you have plenty of bandwidth and multiple cores.
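A minimal sketch of the threaded approach, using the standard library's `concurrent.futures`. The names `fetch_all` and `fake_fetch` are made up for illustration, and `fake_fetch` only sleeps to imitate network latency; in the real script it would be replaced by a function that performs the `requests.get(...)` call and returns `.json()`. The pool size of 10 is an assumption, not a recommendation.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_all(ids, fetch_one, max_workers=10):
    """Call fetch_one(id) for every id on a thread pool; return {id: result}."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, so zip pairs each id
        # with its own response.
        return dict(zip(ids, pool.map(fetch_one, ids)))


def fake_fetch(abc_id):
    """Stand-in for the real GET request: sleeps to imitate latency."""
    time.sleep(0.05)
    return {"data": {"id": abc_id}}


if __name__ == "__main__":
    start = time.perf_counter()
    results = fetch_all(list(range(20)), fake_fetch, max_workers=10)
    elapsed = time.perf_counter() - start
    print(len(results), "responses in", round(elapsed, 2), "seconds")
```

Keeping the database writes in the main thread, after the responses have come back, avoids sharing ORM state between threads. It is also worth checking whether the API enforces a rate limit before raising `max_workers`.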

Finally, there are many different ways to profile Python, including using several tools in combination.
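As one example of profiling, the standard library ships `cProfile` and `pstats`. The sketch below is an assumption about how one might apply them here, not the specific tooling the answer had in mind; `slow_step` and `run_import` are hypothetical placeholders for one iteration of the import loop and for the loop itself.

```python
import cProfile
import io
import pstats
import time


def slow_step():
    """Placeholder for one iteration of the import loop (hypothetical)."""
    time.sleep(0.01)


def run_import(n=5):
    """Placeholder for the whole import job (hypothetical)."""
    for _ in range(n):
        slow_step()


def profile(func, *args, **kwargs):
    """Run func under cProfile; return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time and keep the ten most expensive entries.
    stats.sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()


if __name__ == "__main__":
    _, report = profile(run_import, 5)
    print(report)
```

For a quick measurement of a single block, wrapping it in two `time.perf_counter()` calls and subtracting is often enough to tell whether the time goes to the network call or to the processing.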

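Someone needs to update the API so that it returns the information in bulk instead of one item at a time. That would improve performance significantly.

Thank you very much for the answer =) I tried adding gzip and deflate to the request headers, but I didn't notice any change. I'll try to dig deeper into these two parameters to see whether I'm missing something. Either way, parallel processing looks like the best option for now. I guess I'll have to be a bit patient with this particular code.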