Implementing a basic queue/thread process in Python
Looking for some eyeballs to verify that the pseudo-Python below makes sense. I'm hoping to spawn a number of threads to get some in-proc functions done as fast as possible. The idea is to spawn the threads in the master loop, so the app runs the threads simultaneously in a parallel/concurrent manner.
Chunk of code:
-get the filenames from a dir
-write each filename to a queue
-spawn a thread for each filename, where each thread
waits/reads value/data from the queue
-the threadParse function then handles the actual processing
based on the file that's included via the "execfile" function...
# System modules
from Queue import Queue
from threading import Thread
import time
import os

# Local modules
#import feedparser

# Set up some global variables
appqueue = Queue()

# more than the app will need --
# this matches the number of files that will ever be in the urldir
num_fetch_threads = 200

def threadParse(q):
    # decompose the packet to get the various elements
    line = q.get()
    college, level, packet = decompose(line)
    # build the name of the included file
    fname = college + "_" + level + "_Parse.py"
    execfile(fname)
    q.task_done()

# set up the master loop
while True:
    time.sleep(2)
    # get the files from the dir, set up the threads
    filelist = os.listdir("/urldir")
    if filelist:
        for file_ in filelist:
            worker = Thread(target=threadParse, args=(appqueue,))
            worker.start()
    # again, get the files from the dir, set up the queue
    filelist = os.listdir("/urldir")
    for file_ in filelist:
        # stuff the filename in the queue
        appqueue.put(file_)
    # Now wait for the queue to be empty, indicating that we have
    # processed all of the downloads.
    # don't care about this part
    #print '*** Main thread waiting'
    #appqueue.join()
    #print '*** Done'
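For comparison, the same pattern can be sketched as a self-contained, runnable Python 3 worker pool: a fixed number of long-lived worker threads loop over a shared queue instead of spawning one short-lived thread per file. The filenames, worker count, and the `.upper()` "processing" step are placeholders standing in for the real directory listing and parse step.

```python
# Sketch: a fixed pool of worker threads draining a shared queue (Python 3).
# Filenames and the processing step are stand-ins, not the asker's real data.
import queue
import threading

NUM_WORKERS = 4
appqueue = queue.Queue()
results = []  # list.append is atomic in CPython, so no lock is needed here

def worker(q):
    # Each thread loops, pulling filenames until it sees the sentinel.
    while True:
        fname = q.get()
        if fname is None:      # sentinel: no more work for this thread
            q.task_done()
            break
        results.append(fname.upper())  # stand-in for the real parse step
        q.task_done()

threads = [threading.Thread(target=worker, args=(appqueue,))
           for _ in range(NUM_WORKERS)]
for t in threads:
    t.start()

for fname in ["a.txt", "b.txt", "c.txt"]:  # stand-in for os.listdir("/urldir")
    appqueue.put(fname)

for _ in threads:
    appqueue.put(None)  # one sentinel per worker so every thread exits

appqueue.join()  # blocks until task_done() has been called for every item
for t in threads:
    t.join()
```

Starting the workers before filling the queue is fine here because each worker blocks in `q.get()` until an item (or sentinel) arrives.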
Thanks for any thoughts/comments/suggestions.
If I'm not mistaken: to get things done faster, you spawn lots of threads.

This only works if the major part of the work done in each thread happens without holding the GIL. So if there is a lot of waiting for data from the network, disk, or something similar, it might be a good idea. If each task uses a lot of CPU, it will run much like on a single-core, single-CPU machine, and you might as well run the tasks sequentially.

I should add that what I wrote applies to CPython, but not necessarily to Jython/IronPython. Also, if you need to use more CPUs/cores, there is a module that might help.
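The answer doesn't name the module, but for CPU-bound work one candidate is the standard-library `multiprocessing` module (my assumption, not the answerer's statement), which sidesteps the GIL by running tasks in separate processes. A minimal sketch:

```python
# Assumed example: a process pool for CPU-bound work; the answer above
# does not name a module, multiprocessing is just one candidate.
from multiprocessing import Pool

def cpu_bound(n):
    # Stand-in for a CPU-heavy task: sum of squares below n.
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    # Each call to cpu_bound runs in its own worker process,
    # so the work spreads across cores despite the GIL.
    with Pool(processes=4) as pool:
        print(pool.map(cpu_bound, [10, 100, 1000]))
```

The `if __name__ == "__main__"` guard matters: on platforms that spawn workers by re-importing the module, omitting it can fork processes recursively.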