Python: How do I iterate over the rows of a specific DataFrame column, given a condition on another column? (python, pandas, loops, dataframe, tweepy)


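So, what I basically want to do is the following, based on a DataFrame that contains the columns 'date' and 'polarity', with seven distinct values in 'date' (days) and values between -1 and 1 in 'polarity':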

For each of the seven days:
i) count all values in the 'polarity' column that are positive
ii) count all values in the 'polarity' column that are negative
iii) count all values in the 'polarity' column for a given day (neg, neutral, pos)
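Edit: The output of i)-iii) for each day should be an integer, stored in a list.

Edit 2: I tried to implement it with the following code (only for values > 0):

However, this returned 0, which is wrong when I check against Excel.

Any help is much appreciated.

Cheers,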
IG

If I understand correctly, you need a count of the polarity values for each day. Possibly something like this:

positive = df_tweets[df_tweets['polarity'] > 0].groupby('date').count().reset_index()
negative = df_tweets[df_tweets['polarity'] < 0].groupby('date').count().reset_index()
neutral = df_tweets[df_tweets['polarity'] == 0].groupby('date').count().reset_index() 
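The output of this code is three DataFrames, each with two columns: one with the unique date values, and one with the count of polarity values greater than, less than, or equal to 0.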
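Since the question asks for integers stored in lists, here is a minimal sketch of one way to get them in a single groupby. It sums boolean masks per day instead of filtering first. The sample rows and the names `counts`, `positive_counts`, `negative_counts` and `total_counts` are made up for illustration; only `df_tweets`, 'date' and 'polarity' come from the question.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's df_tweets
df_tweets = pd.DataFrame({
    'date': ['2020-02-01', '2020-02-01', '2020-02-01', '2020-02-02', '2020-02-02'],
    'polarity': [0.5, -0.2, 0.0, 0.8, 0.1],
})

# A boolean column sums to the number of True values,
# so grouping by date gives one count per day
counts = (df_tweets.assign(positive=df_tweets['polarity'] > 0,
                           negative=df_tweets['polarity'] < 0)
                   .groupby('date')
                   .agg(positive=('positive', 'sum'),
                        negative=('negative', 'sum'),
                        total=('polarity', 'count')))

# One integer per day, ordered by date
positive_counts = counts['positive'].astype(int).tolist()
negative_counts = counts['negative'].astype(int).tolist()
total_counts = counts['total'].tolist()

print(positive_counts)  # [1, 2]
print(negative_counts)  # [1, 0]
print(total_counts)     # [3, 2]
```

One difference from filtering before grouping: a day without any positive (or negative) rows still shows up here with a count of 0, whereas it would be missing from the filtered result.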

Consider a pivot table with margins. Demonstrated below with randomly seeded data:

Data

import numpy as np
import pandas as pd

np.random.seed(2112020)
random_df = pd.DataFrame({'date': np.random.choice(pd.date_range('2020-02-01', '2020-02-11'), 500),
                          'polarity': np.random.randint(-1, 2, 500)})

print(random_df.head(10))
#         date  polarity
# 0 2020-02-08        -1
# 1 2020-02-08         1
# 2 2020-02-06         0
# 3 2020-02-10        -1
# 4 2020-02-04        -1
# 5 2020-02-02         1
# 6 2020-02-05        -1
# 7 2020-02-04         0
# 8 2020-02-10         1
# 9 2020-02-09         0
Aggregation

pvt_df = (random_df.assign(day_date = lambda x: x['date'].dt.normalize(),
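                           # label each row; the string default avoids mixing str choices with np.select's int default 0,
                           # which recent NumPy versions reject
                           polarity_indicator = lambda x: np.select([x['polarity'] > 0, x['polarity'] < 0, x['polarity'] == 0],
                                                                    ['positive', 'negative', 'neutral'],
                                                                    default = 'unknown'))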
                   .pivot_table(index = 'day_date',
                                columns = 'polarity_indicator',
                                values = 'polarity',
                                aggfunc = 'count',
                                margins = True)
         )

print(pvt_df)

#  polarity_indicator   negative  neutral  positive  All
#  day_date
#  2020-02-01 00:00:00        17       14        16   47
#  2020-02-02 00:00:00        19       14        12   45
#  2020-02-03 00:00:00        11       16        12   39
#  2020-02-04 00:00:00        17       18        13   48
#  2020-02-05 00:00:00        11       15        22   48
#  2020-02-06 00:00:00        12       12        16   40
#  2020-02-07 00:00:00        16       15        21   52
#  2020-02-08 00:00:00        15       10        13   38
#  2020-02-09 00:00:00        17       15        19   51
#  2020-02-10 00:00:00        13       16        19   48
#  2020-02-11 00:00:00        13       12        19   44
#  All                       161      157       182  500
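As a possible alternative to the pivot table above, the same kind of table can be sketched with `pd.crosstab`. This is only an illustration under assumptions: the sample rows and the names `df`, `label` and `table` are hypothetical, and `np.sign` is used here to classify the polarity values in place of `np.select`.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not the asker's
df = pd.DataFrame({
    'date': pd.to_datetime(['2020-02-01', '2020-02-01', '2020-02-02',
                            '2020-02-02', '2020-02-02']),
    'polarity': [0.4, -0.3, 0.0, 0.9, -0.1],
})

# np.sign maps each polarity to -1, 0 or 1; map turns that into readable labels
label = np.sign(df['polarity']).map({-1.0: 'negative', 0.0: 'neutral', 1.0: 'positive'})

# crosstab counts rows per (day, label); margins=True adds the 'All' row and column
table = pd.crosstab(df['date'].dt.normalize(), label, margins=True)

print(table)
```

The 'All' row and column hold the totals per label and per day, so `table.loc['All', 'All']` is the number of rows in `df`.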
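Can you provide a sample data set along with the expected output?
Added the expected output. The data set is an Excel sheet with the columns 'date' (in YYYY-MM-DD format) and 'polarity' (with a value between -1 and 1 in each row).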