Python: splitting a dataset by day with pandas
I have a dataset like this:
"2018-05-30 21:26:43",20.61129150,-100.40933971
"2018-05-30 21:26:43",20.61127415,-100.41146822
"2018-06-02 21:56:12",21.15633228,-100.93766080
"2018-06-05 22:57:40",20.59734201,-100.38091286
"2018-06-05 22:57:40",20.59875096,-100.37821426
"2018-06-06 20:56:22",20.61278120,-100.38446619
"2018-06-06 20:56:22",20.59865452,-100.37827264
"2018-06-06 21:57:15",20.59862012,-100.37817348
"2018-06-06 21:57:15",20.59864713,-100.37821263
"2018-06-06 21:57:15",20.59862915,-100.37825902
"2018-06-07 15:54:29",20.61280757,-100.39768857
"2018-06-07 15:54:29",20.61276216,-100.39769379
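I want to split my data into groups so that I can compute distances and get the average distance travelled per day.
Right now I am separating it by my date column, like this: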
col_names = ['date', 'latitude', 'longitude']
df = pd.read_csv('marco.csv', names=col_names, sep=',', skiprows=1)
# merge
m = df.reset_index().merge(df.reset_index(), on='date')
But I want to separate it by day, so that I get
2018-05-30, 2018-06-05, 2018-06-06, 2018-06-07
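How would I go about this?

As Yuca mentioned, groupby should do the trick. I would create a new column named 'day' that holds only the date part of the timestamp, sort by date, group by 'day', and then compute the trips within each group.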
import pandas as pd
a = pd.DataFrame(
[["2018-05-30 21:26:43",20.61129150,-100.40933971],
["2018-05-30 21:26:43",20.61127415,-100.41146822],
["2018-06-02 21:56:12",21.15633228,-100.93766080],
["2018-06-05 22:57:40",20.59734201,-100.38091286]],
columns=['date', 'lat', 'lng'])
a['date'] = pd.to_datetime(a['date'])
a['day'] = a['date'].dt.date
b = a.groupby('day')
# Loop over the groups and do whatever calculation you need
for day, group_df in b:
    # day is the group key (a datetime.date); group_df holds that day's rows
    print(group_df['lat'].sum())
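To go from the grouping above to a distance per day, one option is to add up the distances between consecutive points within each day. The following is only a sketch under stated assumptions: the question does not say which distance measure is wanted, so the haversine (great-circle) formula with a mean Earth radius of 6371 km is assumed here, and the names haversine_km and distance_per_day are hypothetical helpers, not part of pandas.

```python
import numpy as np
import pandas as pd


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in km between points given in degrees.

    Assumes a spherical Earth with a mean radius of 6371 km.
    """
    lat1, lng1, lat2, lng2 = map(np.radians, (lat1, lng1, lat2, lng2))
    h = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lng2 - lng1) / 2) ** 2)
    return 2 * 6371.0 * np.arcsin(np.sqrt(h))


def distance_per_day(df):
    """Sum the distances between consecutive points within each calendar day."""
    # A stable sort keeps the original order of rows that share a timestamp.
    df = df.sort_values('date', kind='mergesort').copy()
    df['day'] = df['date'].dt.date
    # Previous point within the same day; NaN for the first point of each day.
    prev = df.groupby('day')[['lat', 'lng']].shift()
    df['step_km'] = haversine_km(prev['lat'], prev['lng'], df['lat'], df['lng'])
    # sum() skips NaN, so a day with a single point gets a distance of 0.
    return df.groupby('day')['step_km'].sum()


a = pd.DataFrame(
    [["2018-05-30 21:26:43", 20.61129150, -100.40933971],
     ["2018-05-30 21:26:43", 20.61127415, -100.41146822],
     ["2018-06-02 21:56:12", 21.15633228, -100.93766080],
     ["2018-06-05 22:57:40", 20.59734201, -100.38091286]],
    columns=['date', 'lat', 'lng'])
a['date'] = pd.to_datetime(a['date'])

per_day = distance_per_day(a)
print(per_day)          # one distance (km) per day
print(per_day.mean())   # average distance per day
```

Days with only one recorded point contribute 0 km to the average; if those days should be left out instead, filter them before calling mean().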
Have you tried groupby? Take a look at pd.to_datetime.