Printing a message for the user while a long process runs in Python
I have a pandas operation that takes a long time, because the xlsx files being imported into a DataFrame are very large. I want to tell the user to wait while the task is running, but I can't get it to work. Here is my function:
def create_list_of_data():
    list_data_all = []
    list_files_xlsx_f = create_list_of_xlsx()
    for xls_files in list_files_xlsx_f:
        df = pandas.read_excel(xls_files)
        df = df[["COL1", "COL2"]]
        list_data = df.values.tolist()
        list_data_all.extend(list_data)
    return list_data_all
What I tried was using a thread:
import itertools
import threading
import time
import sys

# here is the animation
def animate():
    for c in itertools.cycle(['|', '/', '-', '\\']):
        if done:
            break
        sys.stdout.write('\rloading ' + c)
        sys.stdout.flush()
        time.sleep(0.1)
    sys.stdout.write('\rDone!     ')

def create_list_of_data():
    list_data_all = []
    list_files_xlsx_f = create_list_of_xlsx()
    for xls_files in list_files_xlsx_f:
        done = False
        t = threading.Thread(target=animate)
        t.start()
        df = pandas.read_excel(xls_files)
        done = True
        df = df[["COL1", "COL2"]]
        list_data = df.values.tolist()
        list_data_all.extend(list_data)
    return list_data_all
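My problem is that the "done" variable is not known inside the animate function. Maybe this isn't the right approach. Any ideas?
The function create_list_of_data() is launched from a PySide button in another file.
You can use multiprocessing instead of threading, and start the animation function and the data-list function in separate processes. Something like this should work: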
import time
import multiprocessing as mp

def countdown(seconds, message):
    # print a mm:ss countdown on a single line, once per second
    while seconds:
        mins, secs = divmod(int(seconds), 60)
        timeformat = '{:02d}:{:02d} '.format(mins, secs)
        print(timeformat + message, end='\r')
        time.sleep(1)
        seconds -= 1

def slowFunction(seconds):
    # stand-in for the long-running job
    print('Begin')
    time.sleep(seconds)
    print('Done')

if __name__ == '__main__':
    # the guard is needed on platforms that spawn child processes (Windows, macOS)
    countdownProcess = mp.Process(target=countdown, args=(10, 'loading...'))
    slowProcess = mp.Process(target=slowFunction, args=(10, ))
    countdownProcess.start()
    slowProcess.start()
    countdownProcess.join()
    slowProcess.join()
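If you wrap the boolean in an object, both threads share that one object, so a change made by one of them is visible to the other: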
import itertools
import threading
import time
import sys

# here is the animation
def animate(holder):
    for c in itertools.cycle(['|', '/', '-', '\\']):
        if holder.done:
            break
        sys.stdout.write('\rloading ' + c)
        sys.stdout.flush()
        time.sleep(0.1)
    sys.stdout.write('\rDone!     ')

def create_list_of_data():
    list_data_all = []

    class Holder(object):
        done = False

    holder = Holder()
    t = threading.Thread(target=animate, args=(holder,))
    t.start()
    time.sleep(10)  # Simulating long job
    holder.done = True
    t.join()  # let the animation thread print "Done!" before returning
    return list_data_all
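The standard library already has an object meant for this kind of shared flag: threading.Event. Below is a minimal sketch of the same idea using it. The helper name run_with_spinner and the out parameter are made up for illustration and are not part of the original answer.

```python
import itertools
import sys
import threading
import time


def animate(stop_event, out=sys.stdout):
    """Draw a spinner until stop_event is set."""
    for c in itertools.cycle(['|', '/', '-', '\\']):
        if stop_event.is_set():
            break
        out.write('\rloading ' + c)
        out.flush()
        # wait() acts as the sleep and returns early once the event is set
        stop_event.wait(0.1)
    out.write('\rDone!     \n')
    out.flush()


def run_with_spinner(func, *args, **kwargs):
    """Hypothetical helper: run func here while a spinner runs in a second thread."""
    stop_event = threading.Event()
    t = threading.Thread(target=animate, args=(stop_event,))
    t.start()
    try:
        return func(*args, **kwargs)
    finally:
        stop_event.set()  # always stop the spinner, even if func raises
        t.join()
```

With this in place, the slow line in the question could be written as df = run_with_spinner(pandas.read_excel, xls_files). The return value comes back directly because the slow call stays in the calling thread.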
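I modified the example slightly so that it runs without the extra functions.
You should define "done" outside the functions so that it is a global variable. That way both functions can access it. Try this: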
import itertools
import threading
import time
import sys

done = False

# here is the animation
def animate():
    for c in itertools.cycle(['|', '/', '-', '\\']):
        if done:
            break
        sys.stdout.write('\rloading ' + c)
        sys.stdout.flush()
        time.sleep(0.1)
    sys.stdout.write('\rDone!     ')

def create_list_of_data():
    list_data_all = []
    list_files_xlsx_f = create_list_of_xlsx()
    for xls_files in list_files_xlsx_f:
        done = False
        t = threading.Thread(target=animate)
        t.start()
        df = pandas.read_excel(xls_files)
        done = True
        df = df[["COL1", "COL2"]]
        list_data = df.values.tolist()
        list_data_all.extend(list_data)
    return list_data_all
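One caveat with the snippet above: assigning to done inside create_list_of_data() creates a new local variable unless the function declares global done, so animate() keeps seeing the module-level value. A small sketch of the difference follows; the three function names are invented for the demonstration.

```python
done = False


def set_done_local():
    # Plain assignment binds a new local name; the module-level flag is untouched.
    done = True
    return done


def set_done_global():
    # The global statement makes the assignment rebind the module-level flag.
    global done
    done = True


def read_done():
    # Reading without assigning looks the name up at module level,
    # which is what animate() does.
    return done
```

Calling set_done_local() leaves read_done() returning False, while set_done_global() flips it to True. Adding global done as the first line of create_list_of_data() should therefore be enough for the animation thread to see the change.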
Basically, you need to show the loading animation for as long as the other thread is alive:
import sys
import time
import itertools
import threading

def long_process():
    # stand-in for the long-running job
    time.sleep(5)

thread = threading.Thread(target=long_process)
thread.start()
for c in itertools.cycle(['|', '/', '-', '\\']):
    sys.stdout.write('\rloading ' + c)
    sys.stdout.flush()
    time.sleep(0.1)
    # is_alive() replaces isAlive(), which was removed in Python 3.9
    if not thread.is_alive():
        break
sys.stdout.write('\rDone!     ')
You can use a queue to send the done value to the thread:
import itertools
import threading
from queue import Queue, Empty
import time
import sys
import pandas

# here is the animation
def animate(q):
    for c in itertools.cycle(['|', '/', '-', '\\']):
        try:
            # non-blocking check: a plain q.get() would freeze the spinner
            # until the next value arrives
            if q.get_nowait():
                break
        except Empty:
            pass
        sys.stdout.write('\rloading ' + c)
        sys.stdout.flush()
        time.sleep(0.1)
    sys.stdout.write('\rDone!     ')

def create_list_of_data():
    list_data_all = []
    list_files_xlsx_f = create_list_of_xlsx()
    for xls_files in list_files_xlsx_f:
        queue = Queue()  # one queue per animation thread
        t = threading.Thread(target=animate, args=(queue,))
        t.start()
        df = pandas.read_excel(xls_files)
        queue.put(True)  # tell the animation thread to stop
        t.join()
        df = df[["COL1", "COL2"]]
        list_data = df.values.tolist()
        list_data_all.extend(list_data)
    return list_data_all
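It doesn't work, because I think the "done" variable inside the function and the global "done" variable are not the same. It doesn't catch "done = True".
It should work, but I can't get the return value from long_process().
Thanks willy1994. It might work, but I'm not good enough yet to adapt your code to mine.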
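On the return-value comment: one possible way to keep the "animate while the worker is alive" structure and still get the result back is concurrent.futures from the standard library. This is only a sketch. The helper run_and_animate is a hypothetical name, and long_process here returns dummy data in place of the real pandas call.

```python
import itertools
import sys
import time
from concurrent.futures import ThreadPoolExecutor


def long_process():
    """Stand-in for the slow call (for example pandas.read_excel)."""
    time.sleep(0.5)
    return [["a", 1], ["b", 2]]


def run_and_animate(func, *args):
    """Hypothetical helper: show a spinner until func finishes, then return its result."""
    with ThreadPoolExecutor(max_workers=1) as executor:
        future = executor.submit(func, *args)
        for c in itertools.cycle(['|', '/', '-', '\\']):
            if future.done():
                break
            sys.stdout.write('\rloading ' + c)
            sys.stdout.flush()
            time.sleep(0.1)
        sys.stdout.write('\rDone!     \n')
        # result() returns what func returned, or re-raises its exception
        return future.result()
```

In the question's loop this would read df = run_and_animate(pandas.read_excel, xls_files), with the rest of the function unchanged.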