Python: how to generate an MD5 hash of columns in a DataFrame
I have a dataframe df with these columns:
Index(['learner_assignment_xid', 'assignment_xid', 'assignment_attempt_xid',
'learner_xid', 'section_xid', 'final_score_unweighted',
'attempt_score_unweighted', 'points_possible_unweighted',
'scored_datetime', 'gradebook_category_weight', 'status', 'is_deleted',
'is_scorable', 'drop_state', 'is_manual', 'created_datetime',
'updated_datetime'],
dtype='object')
I want to add a new column called checksum to this df, which will concatenate some of these columns and MD5-hash the result.
This is what I'm trying:
df_gradebook['updated_checksum'] = (
    df_gradebook['final_score_unweighted'].astype(str)
    + df_gradebook['attempt_score_unweighted'].astype(str)
    + df_gradebook['points_possible_unweighted'].astype(str)
    + df_gradebook['scored_datetime'].astype(str)
    + df_gradebook['status'].astype(str)
    + df_gradebook['is_deleted'].astype(str)
    + df_gradebook['is_scorable'].astype(str)
    + df_gradebook['drop_state'].astype(str)
    + df_gradebook['updated_datetime'].astype(str)
)
The part I'm struggling with is the hashing. How do I apply MD5 once the concatenation is done?
I can do this in Spark Scala like so:
.withColumn("update_checksum",md5(concat(
$"final_score_unweighted",
$"attempt_score_unweighted",
$"points_possible_unweighted",
$"scored_datetime",
$"status",
$"is_deleted",
$"is_scorable",
$"drop_state",
$"updated_datetime"
)))
Wondering how to implement the md5 part in Python.
Use hashlib from the standard library. Build the concatenated string column first, then apply md5 row by row:

import hashlib

df_gradebook['concat'] = (
    df_gradebook['final_score_unweighted'].astype(str)
    + df_gradebook['attempt_score_unweighted'].astype(str)
    + df_gradebook['points_possible_unweighted'].astype(str)
    + df_gradebook['scored_datetime'].astype(str)
    + df_gradebook['status'].astype(str)
    + df_gradebook['is_deleted'].astype(str)
    + df_gradebook['is_scorable'].astype(str)
    + df_gradebook['drop_state'].astype(str)
    + df_gradebook['updated_datetime'].astype(str)
)
df_gradebook['digest'] = df_gradebook['concat'].apply(
    lambda x: hashlib.md5(x.encode()).hexdigest()
)
And don't write everything on a single line; it makes the code hard to read.

Thank you very much! This works.
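As a self-contained illustration of the same idea, here is a minimal sketch using made-up data (the column names and values below are toy examples, not the real gradebook frame). It selects the columns to hash, concatenates them as strings in one pass with `agg`, and then applies `hashlib.md5`:

```python
import hashlib

import pandas as pd

# Toy frame standing in for df_gradebook (values are invented)
df = pd.DataFrame({
    "final_score_unweighted": [95.0, 80.5],
    "status": ["graded", "pending"],
    "is_deleted": [False, True],
})

# Columns to include in the checksum
cols = ["final_score_unweighted", "status", "is_deleted"]

# Cast every selected column to string and join them row-wise,
# instead of chaining many '+' expressions by hand
concat = df[cols].astype(str).agg("".join, axis=1)

# Hash each concatenated row value
df["checksum"] = concat.apply(lambda s: hashlib.md5(s.encode()).hexdigest())

print(df["checksum"])
```

Listing the columns once in `cols` keeps the code readable and makes it easy to change which columns feed the checksum, compared to repeating `df[...] .astype(str) + ...` nine times.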