
Error when extracting Outlook email data with Python


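I have a Python script that uses os.walk and win32com.client to extract information from Outlook email files (.msg) in a folder and its subfolders on my C:/ drive. It appears to work, but when I try to do anything with the returned dataframe (such as emailData.head()), Python crashes. I also cannot write the dataframe to a .csv file because of a permission error.

I wonder whether my code is failing to close Outlook or each message properly, and whether that is what causes the problem. Any help would be appreciated.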

import os
import win32com.client
import pandas as pd

# initialize Outlook client
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")

# set input directory (where the emails are) and output directory (where you
# would like the email data saved)
inputDir = 'C:/Users/.../myFolderPath'
outputDir = 'C:/Users/.../myOutputPath'


def emailDataCollection(inputDir,outputDir):
    """ This function loops through an input directory to find
    all '.msg' email files in all folders and subfolders in the
    directory, extracting information from the email into lists,
    then converting the lists to a Pandas dataframe before exporting
    to a '.csv' file in the output directory
    """
    # Initialize lists
    msg_Path = []
    msg_SenderName = []
    msg_SenderEmailAddress = []
    msg_SentOn = []
    msg_To = []
    msg_CC = []
    msg_BCC = []
    msg_Subject = []
    msg_Body = []
    msg_AttachmentCount = []

    # Loop through the directory
    for root, dirnames, filenames in os.walk(inputDir):
        for filename in filenames:
            if filename.endswith('.msg'): # check to see if the file is an email
                filepath = os.path.join(root,filename) # save the full filepath
                # Extract email data into lists
                msg = outlook.OpenSharedItem(filepath)
                msg_Path.append(filepath)
                msg_SenderName.append(msg.SenderName)
                msg_SenderEmailAddress.append(msg.SenderEmailAddress)
                msg_SentOn.append(msg.SentOn)
                msg_To.append(msg.To)
                msg_CC.append(msg.CC)
                msg_BCC.append(msg.BCC)
                msg_Subject.append(msg.Subject)
                msg_Body.append(msg.Body)
                msg_AttachmentCount.append(msg.Attachments.Count)
                del msg

    # Convert lists to Pandas dataframe
    emailData = pd.DataFrame({'Path' : msg_Path,
                          'SenderName' : msg_SenderName,
                          'SenderEmailAddress' : msg_SenderEmailAddress,
                          'SentOn' : msg_SentOn,
                          'To' : msg_To,
                          'CC' : msg_CC,
                          'BCC' : msg_BCC,
                          'Subject' : msg_Subject,
                          'Body' : msg_Body,
                          'AttachmentCount' : msg_AttachmentCount
    }, columns=['Path','SenderName','SenderEmailAddress','SentOn','To','CC',
            'BCC','Subject','Body','AttachmentCount'])


    return(emailData)


# Call the function
emailData = emailDataCollection(inputDir,outputDir)

# Causes Python to crash
emailData.head()
# Fails due to permission error
emailData.to_csv(outputDir,header=True,index=False)
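
On the permission error: a likely cause is that `to_csv` above receives `outputDir`, which is a folder rather than a file. On Windows, trying to open a folder for writing typically fails with `PermissionError`. A minimal sketch, assuming the CSV should be written to a file inside that folder; the helper `csv_target` and the file name `emailData.csv` are hypothetical and not part of the original post:

```python
import os


def csv_target(output_dir, filename="emailData.csv"):
    """Return a file path inside output_dir, creating the folder if needed.

    Both this helper and the default file name are illustrative assumptions.
    """
    os.makedirs(output_dir, exist_ok=True)
    return os.path.join(output_dir, filename)
```

With this in place, the last line of the script could become `emailData.to_csv(csv_target(outputDir), header=True, index=False)`.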

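Hopefully this isn't too late, but I managed to track down the root of the problem:

The kernel crashes because of the datetime data in msg_SentOn. If you check the type of the values in msg_SentOn, they are pywintypes.datetime objects, which pandas does not handle.

You need to convert the elements of msg_SentOn to plain datetime.datetime objects.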
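
The conversion can be sketched as follows. This is a minimal sketch, not the answerer's own code: it assumes each `SentOn` value exposes the usual `year` through `microsecond` attributes (as subclasses of `datetime.datetime` do), and it drops any time zone information for simplicity. The helper name `to_plain_datetime` and the stand-in class `FakeComTime` are hypothetical; the stand-in exists only so the example runs without Outlook installed.

```python
import datetime


def to_plain_datetime(value):
    """Copy the date and time fields of a datetime-like value into a
    plain datetime.datetime. Time zone information is dropped, which is
    an assumption made here to keep the sketch simple."""
    return datetime.datetime(
        value.year, value.month, value.day,
        value.hour, value.minute, value.second, value.microsecond,
    )


# Stand-in for the COM time type so the sketch runs without Outlook.
class FakeComTime(datetime.datetime):
    pass


sent_on = FakeComTime(2018, 5, 4, 13, 30, 15, tzinfo=datetime.timezone.utc)
plain = to_plain_datetime(sent_on)
print(type(plain).__name__, plain.isoformat())
```

In the question's loop, this would mean appending `to_plain_datetime(msg.SentOn)` instead of `msg.SentOn`.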


The source here was very helpful:

When I run this on a large batch of emails, I get AttributeError: OpenSharedItem.SenderName. The code works fine on small batches of 5 or 10 emails.
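
The cause of this error is not established in the thread. One way to narrow it down, sketched here under assumptions, is to wrap the per-file extraction so that a failing item is recorded together with its path instead of stopping the whole run. The names `collect_rows`, `read_item`, and `fake_read` are hypothetical; in the script above, `read_item` would wrap `outlook.OpenSharedItem(filepath)` and the attribute reads.

```python
def collect_rows(paths, read_item):
    """Call read_item(path) for each path; keep results and record failures.

    read_item is any callable returning a dict of fields. Paths whose read
    raises an exception are collected separately for later inspection.
    """
    rows, failures = [], []
    for path in paths:
        try:
            rows.append(read_item(path))
        except Exception as exc:  # broad on purpose: this is a diagnostic aid
            failures.append((path, repr(exc)))
    return rows, failures


# Demo with a fake reader standing in for the Outlook calls.
def fake_read(path):
    if "bad" in path:
        raise AttributeError("OpenSharedItem.SenderName")
    return {"Path": path}


rows, failures = collect_rows(["a.msg", "bad.msg", "c.msg"], fake_read)
```

Inspecting `failures` afterwards shows which files trigger the error, which may help tell a problem with specific messages apart from a problem that only appears after many items have been opened.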