Python threads are exiting


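I have an SQS queue of file names, and I want to download those files from S3 so that I can extract their text. However, as soon as I try to open the S3 bucket, the code always seems to exit without any error.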

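import os
import threading

import boto
import boto.sqs

# ACCESS_KEY, SECRET_KEY, QUEUE_NAME and TextExtractor are not shown in the
# question and are assumed to be defined elsewhere.


class MultiThread: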

    def __init__(self):
        conn = boto.sqs.connect_to_region("us-west-2",
                                      aws_access_key_id=ACCESS_KEY,
                                      aws_secret_access_key=SECRET_KEY)
        self.sqs_q = conn.get_queue(QUEUE_NAME)
        self.count = 0

    def start(self, num_threads):

        for i in range(num_threads):
            t = threading.Thread(target=self.run, args=(self.do_work,))
            t.start()


    def run(self, func):
        while self.sqs_q.count() > 0:
            try:
                rs = self.sqs_q.get_messages()
                m = rs[0]
                msg = m.get_body()
                func(msg)
                self.sqs_q.delete_message(m)
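            except IndexError:
                # get_messages() returned an empty list
                print("empty")
            except Exception as e:
                # report the real error instead of hiding it
                print("error: %r" % (e,))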

    def do_work(self, file_name):
        doc = DocScrapper(file_name)  # was DocsScrapper, which does not match the class name below
        text = doc.get_text()


class DocScrapper:

    def __init__(self, file_name):
        self.file_name = file_name
        conn = boto.connect_s3(aws_access_key_id=ACCESS_KEY,
                               aws_secret_access_key=SECRET_KEY)

        bucket = conn.get_bucket('courtspider')
        doc_key = bucket.get_key(file_name)
        doc_key.get_contents_to_filename('doc/' + file_name)

    def get_text(self):
        doc_file = open('doc/' + self.file_name, 'rb')
        txt_doc = TextExtractor(doc_file)
        text = txt_doc.pdf_to_text()
        doc_file.close()
        os.remove('doc/' + self.file_name)
        return text

And yes, the S3 bucket and the files do exist.
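
A catch-all `except` around the work function turns any failure inside it (a typo, a missing key, a missing local directory) into the same "empty" message, which can make a thread look as if it exits without an error. The following is a minimal sketch of how to surface the real exception, not the code from the question. It assumes a plain `queue.Queue` as a stand-in for the SQS queue, and `drain` and `failing_process` are hypothetical names used only for illustration.

```python
import queue
import threading
import traceback


def drain(work_queue, process, errors):
    """Pull items until the queue is empty, recording real failures."""
    while True:
        try:
            item = work_queue.get_nowait()
        except queue.Empty:
            return  # the queue really is empty, so this worker stops
        try:
            process(item)
        except Exception:
            # Keep the full traceback instead of printing a misleading "empty".
            errors.append((item, traceback.format_exc()))


def failing_process(item):
    # Hypothetical worker that fails the way a broken download might.
    raise RuntimeError("download failed for %s" % item)


if __name__ == "__main__":
    q = queue.Queue()
    for name in ("a.pdf", "b.pdf"):
        q.put(name)
    errors = []
    t = threading.Thread(target=drain, args=(q, failing_process, errors))
    t.start()
    t.join()
    for item, tb in errors:
        # Print the item and the last traceback line (exception type and message).
        print(item, tb.splitlines()[-1])
```

Separating the "queue is empty" case from "the work failed" makes it clear which of the two actually ended the loop.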

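How do you know when the threads finish?

@DanielSanchez If I run 5 threads, it behaves like a single-threaded application. When it hits threading.Thread(target=self.run, args=(self.do_work,)), it calls run() without creating n threads, and then it fails when I use a boto.connect_s3 method such as get_bucket(). After that it returns to start() and creates another thread, and it repeats this flow for the number of threads I passed in.

Maybe an infinite loop?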
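
On the question of knowing when the threads finish: one common approach is to keep a reference to each `Thread` object and call `join()` on it. The sketch below is illustrative only and does not touch SQS or S3. `start_workers` and `record_thread` are hypothetical names, and the worker only records the name of the thread it ran in, which makes it easy to check that several threads were really created.

```python
import threading
import time


def start_workers(num_threads, target):
    """Start num_threads threads running target and return them."""
    threads = []
    for i in range(num_threads):
        # Pass the callable itself; writing target() here would run it
        # in the calling thread instead of the new one.
        t = threading.Thread(target=target, name="worker-%d" % i)
        t.start()
        threads.append(t)
    return threads


seen = set()
seen_lock = threading.Lock()


def record_thread():
    time.sleep(0.05)  # simulate some I/O so the threads overlap
    with seen_lock:
        seen.add(threading.current_thread().name)


if __name__ == "__main__":
    workers = start_workers(5, record_thread)
    for t in workers:
        t.join()  # blocks until that thread has finished
    print(sorted(seen))
```

If the printed set contains one name per requested thread, the threads are being created as expected and the problem lies in what each thread does, not in how it is started.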