Python: merging list elements into a single list in a pandas DataFrame


I have a DataFrame, read from a CSV file, that looks like this:

                    LIST-1  LIST-2        LIST-3              ... LIST-N
TIME                                           
2017-06-21 00:17:00 NaN     [99.221]       [42.357, 102.665]
2017-06-21 00:18:00 NaN     [50.89]        [42.357, 43.125,...]
2017-06-21 00:19:00 NaN     [61.50, 76.1]  [70.163, 121.486] 
2017-06-21 00:20:00 [70.16] NaN            NaN
2017-06-21 00:21:00 NaN     [102.665]      [57.9, 63.66, 68.7... 

Each row holds one minute of data, and the list columns have dtype object. I want to do the following:

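  • Merge all the lists in each row into a single list of floats and store that list in a new column, ALL_LIST.
  • Then merge 30 minutes of data (that is, the ALL_LIST values of 30 rows) into one new list.
  • Finally, I want to end up with a DataFrame like this: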

    TIME                 LIST
    2017-06-21 00:00:00  [99.221,42.357, 42.357, ...]
    2017-06-21 00:30:00  [52.328,42.357, 49.169, ...]
    2017-06-21 01:00:00  [61.484,42.357, 76.52, ...]
    2017-06-21 01:30:00  [76.523,42.357, 121.486, ...]

  • I found a solution to my problem. I am posting it here in the hope of finding out whether its performance can be improved.

        import itertools

        import pandas as pd

        # Join the string cells of each row, strip the brackets, then split
        # the result into one flat list of number strings per row.
        all_tt_list['ALL_LIST'] = all_tt_list.apply(lambda x: ','.join(x.dropna()), axis=1)
        all_tt_list['ALL_LIST'] = all_tt_list['ALL_LIST'].str.replace('[', '', regex=False)
        all_tt_list['ALL_LIST'] = all_tt_list['ALL_LIST'].str.replace(']', '', regex=False)
        all_tt_list['ALL_LIST'] = all_tt_list['ALL_LIST'].str.split(',')

        WAIT_TIME_INTERVAL = 30 * 60  # window length in seconds
        # `date` is the day being processed; it is defined elsewhere.
        # Integer division keeps `periods` an int on Python 3.
        rng = pd.date_range(date, periods=(24 * 60 * 60 // WAIT_TIME_INTERVAL) + 1,
                            freq=str(WAIT_TIME_INTERVAL) + 'S', tz='Asia/Shanghai')
        for k in range(len(rng) - 1):
            period_start = rng[k]
            period_end = rng[k + 1]
            # Select period_start <= t < period_end; the original strict `>`
            # dropped rows that fall exactly on a window boundary.
            period_df = all_tt_list[all_tt_list.index >= period_start]
            period_df = period_df[period_df.index < period_end]

            # Flatten the per-row lists of this window into one list.
            period_tt_list = period_df['ALL_LIST'].tolist()
            period_merged = list(itertools.chain.from_iterable(period_tt_list))

            period_merged_s = pd.DataFrame(period_merged, columns=['TT_NUM']).astype(float).astype(int)
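
On the performance question, one possible alternative is to parse each cell once and then group the rows by their timestamp floored to 30 minutes, rather than filtering the whole frame once per window. The following is only a sketch and has not been benchmarked against the code above. The helper names `parse_cell` and `merge_lists` and the sample rows are made up for illustration, and it assumes the cells are strings such as "[42.357, 102.665]" (as they would be after reading the CSV) with NaN for missing values.

```python
import ast
from itertools import chain

import pandas as pd


def parse_cell(cell):
    """Hypothetical helper: '[42.357, 102.665]' -> [42.357, 102.665]; NaN -> []."""
    if isinstance(cell, str):
        return [float(v) for v in ast.literal_eval(cell)]
    return []  # NaN (or any other non-string) contributes nothing


def merge_lists(df, freq="30min"):
    """Hypothetical helper: one merged list of floats per `freq` window."""
    # Step 1: flatten the list cells of every row into one list of floats.
    all_list = pd.Series(
        [[v for cell in row for v in parse_cell(cell)] for row in df.to_numpy()],
        index=df.index,
    )
    # Step 2: group rows by the start of their window and chain their lists.
    window_start = all_list.index.floor(freq)
    merged = all_list.groupby(window_start).agg(
        lambda lists: list(chain.from_iterable(lists))
    )
    return merged.to_frame("LIST")


# Illustrative rows modelled on the table in the question; the timestamps
# and values are invented for this demo.
sample = pd.DataFrame(
    {
        "LIST-1": [float("nan"), float("nan"), "[70.16]"],
        "LIST-2": ["[99.221]", "[50.89]", float("nan")],
        "LIST-3": ["[42.357, 102.665]", "[42.357, 43.125]", float("nan")],
    },
    index=pd.to_datetime(
        ["2017-06-21 00:17:00", "2017-06-21 00:18:00", "2017-06-21 00:40:00"]
    ),
)
sample.index.name = "TIME"

result = merge_lists(sample)
print(result)
```

Only windows that contain at least one row appear in the result. If empty windows are needed as well, the result could be reindexed against a full `pd.date_range` for the day.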
    