Python: load a list with date values into a DataFrame and plot activity over time

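I have some Twitter data, and I'd like to plot activity over time by tweet type (tweet/mention/retweet).

The data is currently loaded into a list of tuples, each containing a date and a type: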

time = [('2014-04-13', 'tweet'),
        ('2014-04-13', 'tweet'),
        ('2014-04-13', 'mention'),
        ('2014-04-13', 'retweet'),
        ('2014-04-13', 'mention'),
        ('2014-04-13', 'tweet'),
        ('2014-04-13', 'retweet'),
        ('2014-04-13', 'mention'),
        ('2014-04-13', 'tweet'),
        ('2014-04-13', 'retweet'),
        ('2014-04-13', 'retweet'),
        ('2014-04-13', 'mention'),
        ('2014-04-13', 'tweet'),
        ('2014-04-13', 'tweet'),
        ('2014-04-13', 'tweet'),
        ('2014-04-13', 'tweet'),
        ('2014-04-13', 'mention'),
        ('2014-04-13', 'retweet'),
        ('2014-04-13', 'mention'),
        ('2014-04-13', 'tweet')]
I've loaded the data into a pandas DataFrame:

time_df = pd.DataFrame(time, columns=['date','time'])

The data now looks like this:

         date     time
0  2014-04-13    tweet
1  2014-04-13    tweet
2  2014-04-13  mention
3  2014-04-13  retweet
4  2014-04-13  mention
...
...
...
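However, when it comes to plotting this data over time, I'm lost. I'd also like each type (tweet/mention/retweet) drawn as a separate colored line. I should also note that I may sometimes need to aggregate the data by day/week/month.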

Ideally, I'd like my plot to look similar to an example plot (image not reproduced here), but with separate lines for tweets, mentions, and retweets.

So, I think I understand what you need to do, even though it isn't stated explicitly in your question.

Allow me to simulate some data:

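import random

import pandas

tweet_types = ['tweet', 'retweet', 'mention']
# One timestamp every 5 minutes for a month; date_range replaces the
# DatetimeIndex(start=..., end=..., freq=...) constructor removed in pandas 1.0
index = pandas.date_range(start='2014-04-13', end='2014-05-13', freq='5min')
# Assign a random type to each timestamp
tweets = [random.choice(tweet_types) for _ in range(len(index))]
time_df = pandas.DataFrame(index=index, data=tweets, columns=['tweet type'])
time_df['day'] = time_df.index.date  # calendar day, used for grouping
time_df['count'] = 1                 # helper column to aggregate
print(time_df.head())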
So the first few rows now look like this:

                     tweet type         day  count
2014-04-13 00:00:00     mention  2014-04-13      1
2014-04-13 00:05:00     mention  2014-04-13      1
2014-04-13 00:10:00       tweet  2014-04-13      1
2014-04-13 00:15:00       tweet  2014-04-13      1
2014-04-13 00:20:00     retweet  2014-04-13      1
I added the count column because we need something to aggregate for the daily summaries, which is done here:

daily_counts = time_df.groupby(by=['tweet type', 'day']).count()
daily_counts_xtab = daily_counts.unstack(level='tweet type')['count']
print(daily_counts_xtab.head())
This gives us:

tweet type  mention  retweet  tweet
day                                
2014-04-13       89      101     98
2014-04-14       98      113     77
2014-04-15       87      103     98
2014-04-16       81      107    100
2014-04-17       96       92    100
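Then

daily_counts_xtab.plot()
gives me a line chart of the daily counts with one line per tweet type (the original image is not preserved here).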
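The question also asks about starting from a list of (date, type) tuples and aggregating by week or month, which the steps above do not cover. The following is a minimal sketch of one way to do that, not the answerer's method: the shortened list is made up, and the column name `kind` is a hypothetical replacement for the question's `time` column. It relies on `pandas.to_datetime`, `pandas.crosstab`, and `DataFrame.resample`.

```python
import pandas as pd

# Shortened, made-up sample in the same (date, type) shape as the question's list.
time = [
    ("2014-04-13", "tweet"), ("2014-04-13", "mention"),
    ("2014-04-14", "retweet"), ("2014-04-21", "tweet"),
    ("2014-05-02", "tweet"), ("2014-05-02", "retweet"),
]

# 'kind' is a hypothetical column name; the question calls this column 'time'.
df = pd.DataFrame(time, columns=["date", "kind"])
df["date"] = pd.to_datetime(df["date"])  # strings -> real timestamps

# One row per day that has data, one column per type.
daily = pd.crosstab(df["date"], df["kind"])

# The datetime index makes coarser aggregation a one-liner.
weekly = daily.resample("W").sum()    # weekly bins ending on Sunday
monthly = daily.resample("MS").sum()  # monthly bins labelled by first day

print(weekly)
print(monthly)
```

With matplotlib installed, calling `weekly.plot()` or `monthly.plot()` should draw one colored line per type, in the same way as `daily_counts_xtab.plot()` above.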