Python: transforming a DataFrame of category, start time, and end time for plotting in pandas


I have a DataFrame:

df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 32656 entries, 94418 to 2
Data columns (total 8 columns):
customer_id             32656 non-null object
session_id              32656 non-null int64
start                   32656 non-null datetime64[ns, America/Los_Angeles]
end                     32656 non-null datetime64[ns, America/Los_Angeles]
length                  32656 non-null timedelta64[ns]
category                32656 non-null object
rounded_start           32656 non-null datetime64[ns, America/Los_Angeles]
rounded_end             32656 non-null datetime64[ns, America/Los_Angeles]
dtypes: datetime64[ns, America/Los_Angeles](4), int64(1), object(2), timedelta64[ns](1)
memory usage: 2.2+ MB

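How can I join these two datasets so that I can plot every point of the range on the x-axis and show the number of categories that were active at that point in time?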

I suspect this will work, although I have not verified it:

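import pandas as pd

# rng is assumed to be an existing DatetimeIndex of x-axis points
# in the same time zone as df.start and df.end

def count_cats(date, df):
    # Sessions that are active at this point in time
    condition1 = df.start <= date
    condition2 = df.end >= date
    df_slice = df.loc[condition1 & condition2, 'category']
    # Number of distinct categories among the active sessions
    return df_slice.nunique()

# A frame with an index but no columns gives DataFrame.apply no rows to pass in,
# so map over the timestamps themselves instead
df_count = rng.to_series().apply(lambda date: count_cats(date, df)).to_frame('CountCats')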
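
Since the question does not include sample data, here is a minimal self-contained sketch of the same idea that can be run as-is. The sessions, the hourly range `rng`, and the column name `CountCats` are all made up for illustration; only the `start`, `end`, and `category` column names come from the question.

```python
import pandas as pd

tz = "America/Los_Angeles"

# Hypothetical sample sessions (not the asker's data)
df = pd.DataFrame({
    "category": ["a", "b", "a", "c"],
    "start": pd.to_datetime(["2016-01-01 00:00", "2016-01-01 00:30",
                             "2016-01-01 01:00", "2016-01-01 02:30"]).tz_localize(tz),
    "end": pd.to_datetime(["2016-01-01 01:00", "2016-01-01 02:00",
                           "2016-01-01 01:30", "2016-01-01 03:00"]).tz_localize(tz),
})

# Hypothetical x-axis points: one per hour, same time zone as the sessions
rng = pd.date_range("2016-01-01 00:00", "2016-01-01 03:00",
                    freq=pd.Timedelta(hours=1), tz=tz)

def count_cats(ts, df):
    # A session is active at ts if it started at or before ts and ends at or after ts
    active = (df["start"] <= ts) & (df["end"] >= ts)
    # Count distinct categories among the active sessions
    return df.loc[active, "category"].nunique()

# One row per point in rng, holding the number of active categories
df_count = pd.DataFrame({"CountCats": [count_cats(ts, df) for ts in rng]},
                        index=rng)
print(df_count)
```

With matplotlib installed, `df_count["CountCats"].plot()` should then draw the count against the range on the x-axis. This loops over every timestamp and scans the whole frame each time, which is simple but may be slow for a long range.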

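For future reference, a question is much easier to answer if the asker provides code that reproduces some sample data to work with. As your question stands, anyone providing an answer has no way to verify the output you expect.

Thanks :) First week with pandas, still learning the ropes!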