How to exit the process if any job fails in Python?
I am running jobs in parallel, driven by a sequence number. I track the status of every job — success or failure — and, after collecting the status of all jobs, I send a mail reporting each job's status. Right now the mail is only generated after the whole process has finished. Instead, I want the process to stop and send the mail as soon as any job fails. How can I do that? The code I am running:
import datetime
from collections import OrderedDict

import pandas as pd

df_mail_final = pd.DataFrame()
df_mail_final1 = pd.DataFrame()

'''Getting the status of every job'''
for m_job in df_main4.master_job.unique():
    list_df = []
    dict_mail = OrderedDict()
    temp_df1 = df_main4[df_main4['master_job'] == m_job].copy()
    temp_df1['duration'] = pd.to_datetime(temp_df1['end_time'].unique()[-1]) - pd.to_datetime(temp_df1['start_time'].unique()[0])
    temp_df1['duration'] = temp_df1['duration'].replace('0 days', '')
    status_list = temp_df1.status.unique()
    if 0 in status_list:
        dict_mail['Master Job Name'] = m_job
        idx = temp_df1['status'] == 0
        dict_mail['Execution_Seq'] = temp_df1.loc[idx]["exec_seq"].unique()[0]
        dict_mail['Start_time'] = temp_df1.loc[idx]["start_time"].unique()[0]
        dict_mail['End_time'] = temp_df1.loc[idx]["end_time"].unique()[-1]
        dict_mail['Status'] = 'Failed'
        dict_mail['Duration'] = temp_df1.loc[idx]["duration"].unique()[-1]
        dict_mail['Reason'] = temp_df1.loc[idx]["error_msg"].unique()[0]
        dict_mail['Function_Name'] = temp_df1.loc[idx]["error_func"].unique()[0]
        list_df.append(dict_mail)
        df_mail = pd.DataFrame(list_df)
    else:
        print(m_job)
        dict_mail['Master Job Name'] = m_job
        dict_mail['Execution_Seq'] = temp_df1.exec_seq.unique()[0]
        dict_mail['Start_time'] = temp_df1.start_time.unique()[0]
        dict_mail['End_time'] = temp_df1.end_time.unique()[-1]
        dict_mail['Status'] = 'Success'
        dict_mail['Duration'] = temp_df1.duration.unique()[-1]
        dict_mail['Reason'] = ''
        dict_mail['Function_Name'] = ''
        list_df.append(dict_mail)
        df_mail = pd.DataFrame(list_df)
    df_mail_final = pd.concat([df_mail_final, df_mail], axis=0, ignore_index=True)
    # if df_mail_final['Status'].iloc[-1] == 'Failed':
    #     break

'''Printing the final DataFrame with the status of all the jobs'''
print(df_mail_final)
df_mail_final = df_mail_final[['Master Job Name', 'Execution_Seq', 'Start_time', 'End_time', 'Status', 'Duration', 'Reason', 'Function_Name']]
exec_end_dt = datetime.datetime.now().strftime("%H:%M:%S")
# total_duration = pd.to_datetime(exec_end_dt) - pd.to_datetime(exec_start_dt)
total_duration = pd.to_datetime(df_mail_final['End_time']).max() - pd.to_datetime(df_mail_final['Start_time']).min()
total_duration = str(total_duration)
total_duration = total_duration.replace('0 days', '')
send_mail(df_mail_final, LOG_FILE, total_duration)
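A minimal, self-contained sketch of the early-exit idea (the commented-out `break` above): build the mail rows one job at a time and stop as soon as a failed job is recorded, so the mail can go out immediately. The `job_statuses` dict and this `send_mail` are hypothetical stand-ins for the real values collected from `df_main4` and the real mailer.

```python
# Hypothetical job statuses keyed by job name (0 = failed), standing in
# for the real status values collected from df_main4.
job_statuses = {"job_a": 1, "job_b": 0, "job_c": 1}

def send_mail(rows):
    # stand-in for the real send_mail(): just report what would be mailed
    print(f"mailing status for {len(rows)} job(s)")

rows = []
failed = False
for name, status in job_statuses.items():
    rows.append({"Master Job Name": name,
                 "Status": "Failed" if status == 0 else "Success"})
    if status == 0:
        failed = True
        break  # stop processing further jobs on the first failure

# the mail now goes out right after the first failure,
# instead of after the whole sequence has finished
send_mail(rows)
```

With this shape, `job_c` is never touched: the loop ends at `job_b` and the mail carries only the rows gathered so far.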
Here is the gist / system-design logic of how to implement this:
def my_parallel_job(*args, **kwargs):
    # do your stuff here
    pass

def parallel_job_wrapper(*args, **kwargs):
    try:
        my_parallel_job(*args, **kwargs)
        # if an error is raised above, the next line will not run
        return "success"
    except Exception:
        # an error occurred in the job
        return "fail"

def main(*args, **kwargs):
    # call your parallel jobs from here, preferably via something
    # like python's multiprocessing pool methods
    p1 = parallel_job_wrapper(*args, **kwargs)
In the code above, the second function acts as a buffer in case the first one fails. This ensures that your `main` does not stop even when any of the parallel jobs fails. The mail-triggering logic should go inside the for loop, so the mail is sent as soon as a failure is detected.
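One way to sketch that fail-fast pattern end to end, assuming `concurrent.futures` in place of the question's exact scheduler (the job bodies here are hypothetical): submit every job through the wrapper, watch results as they complete, and on the first "fail" cancel anything not yet started and trigger the mail.

```python
import concurrent.futures

def my_parallel_job(n):
    # hypothetical job body: even inputs are made to fail
    if n % 2 == 0:
        raise RuntimeError(f"job {n} failed")
    return n

def parallel_job_wrapper(n):
    # the buffer function from the answer: never raises,
    # always reports "success" or "fail"
    try:
        result = my_parallel_job(n)
        return ("success", n, result)
    except Exception as exc:
        return ("fail", n, str(exc))

results = []
with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(parallel_job_wrapper, n) for n in (1, 3, 4, 7)]
    for fut in concurrent.futures.as_completed(futures):
        outcome = fut.result()
        results.append(outcome)
        if outcome[0] == "fail":
            # first failure: cancel anything not yet started, stop waiting;
            # this is the point where the mail would be triggered
            for f in futures:
                f.cancel()
            break
```

Because the wrapper converts exceptions into a "fail" value, the main loop stays in control of when to stop and when to mail, which is exactly the separation the answer describes.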