Python: design a should_throttle function that limits requests within a given time window

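I was asked this algorithm question in an interview at a big tech company. I did not solve it well, and it has bothered me ever since. Here is the question, followed by my attempt at solving it.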

Question:
Design a should_throttle function that takes in a stream of requests.
Each request has the following format: {'client_id', 'msg_id', 'timestamp'}.
@return true if client has exceeded 1000 requests in one minute
@return false if client has NOT exceeded 1000 requests in one minute.
My attempted solution:

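import datetime

# key is a timestamp, value is the list of client ids seen at that timestamp
global_dict = {}

def should_throttle(msg, seconds_limit=60, count_limit=1000):
    timestamp = msg['timestamp']  # expected to be a datetime.datetime
    client_id = msg['client_id']
    # msg_id = msg['id']
    # we only care about the last 60 seconds, so get all the keys
    # within the last 60 seconds of the key currently seen
    left_range = timestamp - datetime.timedelta(seconds=seconds_limit)
    right_range = timestamp
    keys_to_delete = []
    client_count = 0
    # note: the loop body and the cleanup are reconstructed from the description below
    for key in global_dict.keys():  # O(60)
        if left_range <= key <= right_range:
            client_count += global_dict[key].count(client_id)
        else:
            keys_to_delete.append(key)
    for key in keys_to_delete:
        del global_dict[key]
    # record the current request after counting the earlier ones
    global_dict.setdefault(timestamp, []).append(client_id)
    if client_count >= count_limit:
        return True  # throttle it
    return False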
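I keep track of how often each client makes requests within a time window, and I clean up the entries in the dict that fall outside the window.

I think the Big-O analysis is: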

n = max seconds window
m = max client limit

==> O(n * m)
Any suggestions or improvements? I really feel like I am missing a better approach.

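You could create one deque per client and store them in a defaultdict.

Then, for each incoming message, popleft all the timestamps that are too old from that client's deque, append the new timestamp, and check whether the limit has been exceeded.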

from collections import defaultdict,deque


def should_throttle(messages, seconds_limit=60, count_limit=1000):  
    recent = defaultdict(deque)

    for message in messages:
        timestamp = message['timestamp']
        client = message['client_id']
        # remove the timestamps that are more than seconds_limit old
        try:
            while recent[client][0] < timestamp - seconds_limit:
                recent[client].popleft()
        except IndexError:
            # deque was empty
            pass

        # append the current one
        recent[client].append(timestamp)

        # just to show the current state of affairs...
        print(client, recent[client])

        # has this client exceeded his limit?
        if len(recent[client]) > count_limit:
            return True

    # we never exceeded the count limit
    return False
Output for a call in which the limit is exceeded:

1 deque([1])
2 deque([20])
1 deque([21])
1 deque([21, 22])
2 deque([20, 23])
1 deque([21, 22, 24])

True
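The call that produced the output ending in True is not shown. As a hedged sketch, assuming the same six messages as in the call shown further down and seconds_limit=4, a count_limit of 2 is consistent with the printed deques (two entries do not trigger the limit, three do). The block repeats the function without the print call so that it runs on its own; the names messages and result are only for illustration.

```python
from collections import defaultdict, deque


def should_throttle(messages, seconds_limit=60, count_limit=1000):
    # same logic as the answer's function, minus the print call
    recent = defaultdict(deque)
    for message in messages:
        timestamp = message['timestamp']
        window = recent[message['client_id']]
        # drop the timestamps that are more than seconds_limit old
        while window and window[0] < timestamp - seconds_limit:
            window.popleft()
        window.append(timestamp)
        if len(window) > count_limit:
            return True
    return False


messages = [
    {'client_id': 1, 'msg_id': '', 'timestamp': 1},
    {'client_id': 2, 'msg_id': '', 'timestamp': 20},
    {'client_id': 1, 'msg_id': '', 'timestamp': 21},
    {'client_id': 1, 'msg_id': '', 'timestamp': 22},
    {'client_id': 2, 'msg_id': '', 'timestamp': 23},
    {'client_id': 1, 'msg_id': '', 'timestamp': 24},
]

# count_limit=2 and seconds_limit=4 are assumptions inferred from the output
result = should_throttle(messages, seconds_limit=4, count_limit=2)
print(result)  # True: client 1 has three requests (21, 22, 24) in the window
```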
With a higher limit:

should_throttle([{'client_id': 1, 'msg_id':'', 'timestamp':1},
                {'client_id': 2, 'msg_id':'', 'timestamp':20},
                {'client_id': 1, 'msg_id':'', 'timestamp':21},
                {'client_id': 1, 'msg_id':'', 'timestamp':22},
                {'client_id': 2, 'msg_id':'', 'timestamp':23},
                {'client_id': 1, 'msg_id':'', 'timestamp':24},
               ],
               seconds_limit=4, count_limit=4)
Output:

1 deque([1])
2 deque([20])
1 deque([21])
1 deque([21, 22])
2 deque([20, 23])
1 deque([21, 22, 24])

False
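The function above consumes a whole list and stops at the first violation, whereas the question talks about a stream of requests. One possible adaptation, offered only as a sketch, keeps the per-client deques in an object and answers for each request as it arrives. The class name Throttler and its method are hypothetical, and the sketch assumes numeric timestamps in seconds that arrive in non-decreasing order for each client.

```python
from collections import defaultdict, deque


class Throttler:
    """Hypothetical per-request wrapper around the deque-per-client idea."""

    def __init__(self, seconds_limit=60, count_limit=1000):
        self.seconds_limit = seconds_limit
        self.count_limit = count_limit
        # client_id -> timestamps of that client's requests inside the window
        self.recent = defaultdict(deque)

    def should_throttle(self, request):
        timestamp = request['timestamp']
        window = self.recent[request['client_id']]
        # drop the timestamps that are more than seconds_limit old
        while window and window[0] < timestamp - self.seconds_limit:
            window.popleft()
        # throttled requests are still recorded (a design choice of this sketch)
        window.append(timestamp)
        return len(window) > self.count_limit


throttler = Throttler(seconds_limit=4, count_limit=2)
requests = [
    {'client_id': 1, 'msg_id': '', 'timestamp': 1},
    {'client_id': 2, 'msg_id': '', 'timestamp': 20},
    {'client_id': 1, 'msg_id': '', 'timestamp': 21},
    {'client_id': 1, 'msg_id': '', 'timestamp': 22},
    {'client_id': 2, 'msg_id': '', 'timestamp': 23},
    {'client_id': 1, 'msg_id': '', 'timestamp': 24},
]
decisions = [throttler.should_throttle(r) for r in requests]
print(decisions)  # [False, False, False, False, False, True]
```

Each timestamp is appended once and popped at most once, so the work per request is constant on average, and the memory per client is bounded by the number of requests that client makes inside one window.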

I wonder whether this question would be better suited elsewhere?

Oh, I did not know about that page. I will keep it in mind next time. Sorry.