Pivot table of value ranges in a Python DataFrame


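Hello, I would like to build a pivot table from a DataFrame that lists companies according to the number of uploads they have made on a website. Here is what I have:

df

Expected output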

Range            Company        Uploads
>10              Tesla             3
11-50            Adidas            26
                 Tiffany           19
                 Target            18
                 Nike              11
51-100           Amazon            97
                 Google            81
                 Walmart           77
                 Apple             55
                 Ralph Lauren      54
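I am thinking of using np.where to add a "Range" column to df, then using pd.pivot_table or .groupby to create the pivot table, and finally sorting the values in the pivot table by descending number of uploads.

I am not sure whether this would work. Can anyone help me? I appreciate any help. Thanks in advance.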

What you need is a MultiIndex, not groupby().

First create a column that categorizes the uploads, as you suggested:

df = df.sort_values('Uploads', ascending=False)
# np.digitize returns the bin number (1, 2, 3) for the ranges 0-10, 11-50 and 51-99
df['Range'] = np.digitize(df['Uploads'], [0, 11, 51, 100])
# Values of 100 or more land in bin 4; extend the list of edges if you need more ranges
Output:

                  Uploads
Range  Company           
<10    Tesla            3
11-50  Facebook        48
       Adidas          26
       Tiffany         19
       Target          18
       Nike            11
51-100 Amazon          97
       Google          81
       Walmart         77
       Apple           55
       Ralph           54
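
The output above shows text labels in the Range level, while np.digitize on its own returns integer bin numbers. The following is a minimal sketch of one way to turn those numbers into labels and build the MultiIndex. The sample data is taken from the question; the names edges, labels and result, the label text, and the upper edge of 101 are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Sample data taken from the question
df = pd.DataFrame({
    'Company': ['Nike', 'Adidas', 'Apple', 'Tesla', 'Amazon', 'Ralph Lauren',
                'Tiffany', 'Walmart', 'Target', 'Facebook', 'Google'],
    'Uploads': [11, 26, 55, 3, 97, 54, 19, 77, 18, 48, 81],
})

edges = [0, 11, 51, 101]                        # assumed edges: 0-10, 11-50, 51-100
labels = np.array(['<=10', '11-50', '51-100'])  # assumed label text

df = df.sort_values('Uploads', ascending=False)
# np.digitize returns 1-based bin numbers, so subtract 1 to index into labels.
# Values outside 0-100 would fall outside the labels and raise an IndexError.
bin_numbers = np.digitize(df['Uploads'].to_numpy(), edges)
# An ordered categorical makes '<=10' sort before '11-50' and '51-100'
df['Range'] = pd.Categorical(labels[bin_numbers - 1], categories=labels, ordered=True)
# mergesort is stable, so the descending Uploads order survives within each range
result = df.sort_values('Range', kind='mergesort').set_index(['Range', 'Company'])
print(result)
```

Sorting by Uploads first and then applying a stable sort on Range keeps each range's companies in descending upload order, which is the ordering the question asks for.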
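You can use pd.cut(), which bins the values, to categorize each upload count into a range, and use its labels argument to name the ranges in the output.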

import pandas as pd
import numpy as np
import io

data = '''
Company Uploads
Nike 11
Adidas 26
Apple 55
Tesla 3
Amazon 97
"Ralph Lauren" 54
Tiffany 19
Walmart 77
Target 18
Facebook 48
Google 81
'''

# Raw string so that \s is passed to pandas as a regex, not treated as a string escape
df = pd.read_csv(io.StringIO(data), sep=r'\s+')
# Bins are (0, 10], (10, 50], (50, 100]; the lowest bin is labelled '<=10'
# (the question writes '>10', but that bin holds the values up to 10)
df['category'] = pd.cut(df['Uploads'], [0,10,50,100], labels=['<=10','11-50','51-100'])
# Sort by range, then by uploads in ascending order within each range
df.sort_values(['category','Uploads'], ascending=[True, True], inplace=True)
df.set_index(['category','Company'],inplace=True)
df

                       Uploads
category Company
<=10     Tesla               3
11-50    Nike               11
         Target             18
         Tiffany            19
         Adidas             26
         Facebook           48
51-100   Ralph Lauren       54
         Apple              55
         Walmart            77
         Google             81
         Amazon             97
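
The question asks for uploads in descending order within each range, whereas the pd.cut() answer sorts them ascending. A minimal sketch of that variation follows, assuming the same sample data and bin edges; the label '<=10' and the variable name out are illustrative choices, not part of the original answer.

```python
import pandas as pd

# Sample data taken from the question
df = pd.DataFrame({
    'Company': ['Nike', 'Adidas', 'Apple', 'Tesla', 'Amazon', 'Ralph Lauren',
                'Tiffany', 'Walmart', 'Target', 'Facebook', 'Google'],
    'Uploads': [11, 26, 55, 3, 97, 54, 19, 77, 18, 48, 81],
})

# pd.cut uses right-closed bins by default: (0, 10], (10, 50], (50, 100]
df['category'] = pd.cut(df['Uploads'], [0, 10, 50, 100],
                        labels=['<=10', '11-50', '51-100'])

# Ranges ascending, uploads descending within each range
out = (df.sort_values(['category', 'Uploads'], ascending=[True, False])
         .set_index(['category', 'Company']))
print(out)
```

Because pd.cut returns an ordered categorical, sorting on category follows the bin order rather than the alphabetical order of the labels.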
For the np.digitize approach, then use df.set_index(['Range','Company']) to get the desired output:
df.set_index(['Range','Company'])