Python: applying TimeGrouper to a column

Let's consider the following DataFrame:

import pandas as pd
from pandas import Timestamp

data = [
    {'close': 1.16155, 'datetime': Timestamp('2017-11-01 22:29:40'), 'high': 1.16155, 'low': 1.16155, 'open': 1.16155, 'symbol': 'European Monetary Union Euro - United States dollar', 'volume': -1.0},
    {'close': 1.00325, 'datetime': Timestamp('2017-11-01 22:29:40'), 'high': 1.00325, 'low': 1.00325, 'open': 1.00325, 'symbol': 'United States dollar - Swiss franc', 'volume': -1.0},
    {'close': 1.324475, 'datetime': Timestamp('2017-11-01 22:29:40'), 'high': 1.324475, 'low': 1.324475, 'open': 1.324475, 'symbol': 'British pound - United States dollar', 'volume': -1.0},
    {'close': 1.324475, 'datetime': Timestamp('2017-11-01 22:29:45'), 'high': 1.324475, 'low': 1.324475, 'open': 1.324475, 'symbol': 'British pound - United States dollar', 'volume': -1.0},
    {'close': 1.16155, 'datetime': Timestamp('2017-11-01 22:29:45'), 'high': 1.16155, 'low': 1.16155, 'open': 1.16155, 'symbol': 'European Monetary Union Euro - United States dollar', 'volume': -1.0},
]
df = pd.DataFrame(data)


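I want to use groupby to group by symbol and datetime without setting the index to symbol or datetime.

Ideally, it would look like this:
df.groupby(["symbol", pd.TimeGrouper("30T", "datetime")]).count()

I know this can be done with:
df.set_index("datetime").groupby(["symbol", pd.TimeGrouper("30T")]).count()

However, I would still prefer not to set the index to datetime or symbol.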

Thx!

Is this what you want?

In [198]: df.groupby(["symbol",pd.TimeGrouper("30T", key="datetime")]).count()
Out[198]:
                                                                        close  high  low  open  volume
symbol                                             datetime
British pound - United States dollar               2017-11-01 22:00:00      2     2    2     2       2
European Monetary Union Euro - United States do... 2017-11-01 22:00:00      2     2    2     2       2
United States dollar - Swiss franc                 2017-11-01 22:00:00      1     1    1     1       1

Or use:
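The code for this alternative did not survive on this page. Judging from the rest of the answer, it is presumably the pd.Grouper form, which is the public interface. The following is a minimal sketch under that assumption, not a recovery of the original snippet. It keeps only the symbol, datetime and close columns from the question's data, and it spells the frequency as "30min" rather than "30T" because newer pandas releases warn about the "T" alias.

```python
import pandas as pd

# Reduced copy of the question's data: symbol, timestamp, close price.
rows = [
    ("European Monetary Union Euro - United States dollar", "2017-11-01 22:29:40", 1.16155),
    ("United States dollar - Swiss franc", "2017-11-01 22:29:40", 1.00325),
    ("British pound - United States dollar", "2017-11-01 22:29:40", 1.324475),
    ("British pound - United States dollar", "2017-11-01 22:29:45", 1.324475),
    ("European Monetary Union Euro - United States dollar", "2017-11-01 22:29:45", 1.16155),
]
df = pd.DataFrame(rows, columns=["symbol", "datetime", "close"])
df["datetime"] = pd.to_datetime(df["datetime"])

# Group by the symbol column and by 30-minute bins of the datetime column.
# key= points the Grouper at a column, so df keeps its default integer index.
counts = df.groupby(["symbol", pd.Grouper(key="datetime", freq="30min")]).count()
print(counts)
```

The result is indexed by (symbol, datetime), while df itself is left untouched.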

PS: the docstring for TimeGrouper could be more detailed:

In [204]: pd.TimeGrouper?
Init signature: pd.TimeGrouper(*args, **kwargs)
Docstring:
Custom groupby class for time-interval grouping

Parameters
----------
freq : pandas date offset or offset alias for identifying bin edges
closed : closed end of interval; left or right
label : interval boundary to use for labeling; left or right
nperiods : optional, integer
convention : {'start', 'end', 'e', 's'}
    If axis is PeriodIndex

Compare with pd.Grouper:

In [205]: pd.Grouper?
Init signature: pd.Grouper(*args, **kwargs)
Docstring:
A Grouper allows the user to specify a groupby instruction for a target
object

This specification will select a column via the key parameter, or if the
level and/or axis parameters are given, a level of the index of the target
object.

These are local specifications and will override 'global' settings,
that is the parameters axis and level which are passed to the groupby
itself.

Parameters
----------
key : string, defaults to None
    groupby key, which selects the grouping column of the target
level : name/number, defaults to None
    the level for the target index
freq : string / frequency object, defaults to None
    This will groupby the specified frequency if the target selection
    (via key or level) is a datetime-like object. For full specification
    of available frequencies, please see `here
    <http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases>`_.
axis : number/name of the axis, defaults to 0
sort : boolean, default to False
    whether to sort the resulting labels

additional kwargs to control time-like groupers (when freq is passed)

closed : closed end of interval; left or right
label : interval boundary to use for labeling; left or right
convention : {'start', 'end', 'e', 's'}
    If grouper is PeriodIndex

Returns
-------
A specification for a groupby instruction

Examples
--------

Syntactic sugar for ``df.groupby('A')``

>>> df.groupby(Grouper(key='A'))

Specify a resample operation on the column 'date'

>>> df.groupby(Grouper(key='date', freq='60s'))

Specify a resample operation on the level 'date' on the columns axis
with a frequency of 60s
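
The label keyword listed in the docstring above can be illustrated with a small sketch. The timestamps mirror the question's data; the variable names and the "30min" spelling of the frequency are assumptions made for this example only.

```python
import pandas as pd

df = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2017-11-01 22:29:40",
        "2017-11-01 22:29:40",
        "2017-11-01 22:29:45",
    ]),
    "close": [1.16155, 1.00325, 1.324475],
})

# Default for a 30-minute frequency: each bin is labeled by its left edge.
left_labeled = df.groupby(pd.Grouper(key="datetime", freq="30min")).size()

# label="right" keeps the same bins but labels each by its right edge.
right_labeled = df.groupby(
    pd.Grouper(key="datetime", freq="30min", label="right")
).size()

print(left_labeled)
print(right_labeled)
```

Both calls place all three rows in the same bin; only the timestamp used to name that bin differs.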

Awesome, thx! A real thx! There is no documentation or example for TimeGrouper in the docs.
@jimbasquiat pandas turned the grouper into pandas.Grouper.
@MaxU :-) TimeGrouper is not a public interface.