Converting an uploaded text file to a DataFrame in Python Dash

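I am exploring Dash to build a dashboard for log analysis. I did the analysis in a Jupyter notebook, but I am struggling to reproduce it in Dash. After hours of research, I still cannot work out how to convert a text file to a DataFrame in Dash.

  • I need to do the following in Dash
I found some examples in the Dash documentation where an xlsx or csv file is imported and converted to a data table, but I did not find them very helpful for plain text files.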

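# Imports assumed here (Dash 2.x style); the original snippet omitted them
import base64
import datetime
import io

import pandas as pd
from dash import Dash, Input, Output, State, dash_table, dcc, html

app = Dash(__name__)

app.layout = html.Div([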
    dcc.Upload(
        id='upload-data',
        children=html.Div([
            'Drag and Drop or ',
            html.A('Select Files')
        ]),
        style={
            'width': '100%',
            'height': '60px',
            'lineHeight': '60px',
            'borderWidth': '1px',
            'borderStyle': 'dashed',
            'borderRadius': '5px',
            'textAlign': 'center',
            'margin': '10px'
        },
        # Allow multiple files to be uploaded
        multiple=True
    ),
    html.Div(id='output-data-upload'),
])


def parse_contents(contents, filename, date):
    content_type, content_string = contents.split(',')

    decoded = base64.b64decode(content_string)
    
    try:
        if 'csv' in filename:
            # Assume that the user uploaded a CSV file
            
            df = pd.read_csv(
                io.StringIO(decoded.decode('utf-8')))
        elif 'xls' in filename:
            # Assume that the user uploaded an excel file
            df = pd.read_excel(io.BytesIO(decoded))
    
    except Exception as e:
        print(e)
        return html.Div([
            'There was an error processing this file.'
        ])

    return html.Div([
        html.H5(filename),
        html.H6(datetime.datetime.fromtimestamp(date)),

        dash_table.DataTable(
            data=df.to_dict('records'),
            columns=[{'name': i, 'id': i} for i in df.columns]
        ),

        html.Hr(),  # horizontal line

        # For debugging, display the raw contents provided by the web browser
        html.Div('Raw Content'),
        html.Pre(contents[0:200] + '...', style={
            'whiteSpace': 'pre-wrap',
            'wordBreak': 'break-all'
        })
    ])


@app.callback(Output('output-data-upload', 'children'),
              [Input('upload-data', 'contents')],
              [State('upload-data', 'filename'),
               State('upload-data', 'last_modified')])
def update_output(list_of_contents, list_of_names, list_of_dates):
    if list_of_contents is not None:
        children = [
            parse_contents(c, n, d) for c, n, d in
            zip(list_of_contents, list_of_names, list_of_dates)]
        return children

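Can anyone help me, or point me in the right direction, on how to import a text file, find the required patterns in it, and convert it to a DataFrame as I did in Jupyter?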

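If the uploaded file is already in a standard format such as CSV, the example you gave is all you need. If you are dealing with an unstructured file, you can start from the same place, but you have to replace the following line:

df = pd.read_csv(io.StringIO(decoded.decode('utf-8')))

with your own code that builds a formatted DataFrame from the decoded data. It looks like you may already have some code for this, so find the point where the uploaded file is decoded, for example
decoded = base64.b64decode(content_string)
which is the most convenient place to pass the content into your own code. You may be able to do this directly, though I have not tried it to confirm: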

Log=io.StringIO(decoded.decode('utf-8')).readlines()
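
To make the idea above concrete, here is a minimal sketch of turning the decoded bytes into a DataFrame by matching each line against a pattern. The log format, the regular expression, the column names, and the helper name log_bytes_to_df are all hypothetical, since the question does not show the actual log layout; swap in whatever pattern you used in Jupyter.

```python
import io
import re

import pandas as pd

# Hypothetical line format, e.g. "2021-03-04 12:00:01 ERROR disk full".
# Replace this pattern with the one that matches your own logs.
LOG_PATTERN = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) "
    r"(?P<message>.*)$"
)
COLUMNS = ["timestamp", "level", "message"]


def log_bytes_to_df(decoded: bytes) -> pd.DataFrame:
    """Build a DataFrame from the bytes returned by base64.b64decode."""
    lines = io.StringIO(decoded.decode("utf-8")).readlines()
    rows = []
    for line in lines:
        match = LOG_PATTERN.match(line.strip())
        if match:  # lines that do not fit the pattern are skipped
            rows.append(match.groupdict())
    # Passing columns keeps the frame well-formed even when nothing matched
    return pd.DataFrame(rows, columns=COLUMNS)
```

Inside parse_contents, this could be called from an extra branch next to the CSV and Excel ones, for instance an elif that checks for 'txt' in filename and sets df = log_bytes_to_df(decoded). Without such a branch, a .txt upload leaves df undefined and the DataTable call fails.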
