Pivot table by value ranges in a Python DataFrame
Hello, I would like to build a pivot table from a DataFrame that lists companies by the number of uploads they have made to a website. Here is what I have:
df
Expected output:
Range   Company       Uploads
<=10    Tesla               3
11-50   Adidas             26
        Tiffany            19
        Target             18
        Nike               11
51-100  Amazon             97
        Google             81
        Walmart            77
        Apple              55
        Ralph Lauren       54
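I was thinking of using np.where to add a "Range" column to df, then using pd.pivot_table or .groupby to create the pivot table, and finally sorting the values in the pivot table by upload count in descending order.
I'm not sure whether this would work. Could anyone help me? I appreciate any help. Thanks in advance.

What you need is a MultiIndex, not groupby().
First create a column that classifies the uploads, as you suggested: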
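bins = [0, 11, 51, 101]  # left-closed edges for <=10, 11-50, 51-100
labels = np.array(['<=10', '11-50', '51-100'])
df['Bin'] = np.digitize(df['Uploads'], bins)  # np.digitize returns bin numbers 1, 2, 3, not labels
df['Range'] = labels[df['Bin'].to_numpy() - 1]  # map each bin number to its label
df = df.sort_values(['Bin', 'Uploads'], ascending=[True, False]).drop(columns='Bin')
# only handles 0 to 100; if there are values above 100 you need to extend both bins and labels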
Output:
                 Uploads
Range  Company
<=10   Tesla           3
11-50  Facebook       48
       Adidas         26
       Tiffany        19
       Target         18
       Nike           11
51-100 Amazon         97
       Google         81
       Walmart        77
       Apple          55
       Ralph          54
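Since the question's own df is not shown, here is a minimal, self-contained sketch of the np.digitize approach. The sample data is an assumption (it reuses the company list that appears elsewhere on this page), and the names edges, labels, and bin_no are mine rather than anything from the answer.

```python
import numpy as np
import pandas as pd

# Assumed sample data; the DataFrame from the question is not available.
df = pd.DataFrame({
    'Company': ['Nike', 'Adidas', 'Apple', 'Tesla', 'Amazon', 'Ralph Lauren',
                'Tiffany', 'Walmart', 'Target', 'Facebook', 'Google'],
    'Uploads': [11, 26, 55, 3, 97, 54, 19, 77, 18, 48, 81],
})

edges = [0, 11, 51, 101]                        # left-closed bins: [0, 11), [11, 51), [51, 101)
labels = np.array(['<=10', '11-50', '51-100'])  # one label per bin

# np.digitize returns 1-based bin numbers, so subtract 1 to index into labels.
bin_no = np.digitize(df['Uploads'].to_numpy(), edges)
df['Range'] = labels[bin_no - 1]

# Sort by bin (lowest first), then by uploads (highest first),
# and move Range and Company into a MultiIndex.
result = (
    df.assign(bin_no=bin_no)
      .sort_values(['bin_no', 'Uploads'], ascending=[True, False])
      .drop(columns='bin_no')
      .set_index(['Range', 'Company'])
)
print(result)
```

Sorting on the numeric bin rather than on the label avoids the alphabetical ordering of strings, which would put '11-50' ahead of '<=10'.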
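You can use pd.cut(), which does the binning for you: it sorts the values into segments, and the labels argument names the resulting bins.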
import io
import pandas as pd

data = '''
Company Uploads
Nike 11
Adidas 26
Apple 55
Tesla 3
Amazon 97
"Ralph Lauren" 54
Tiffany 19
Walmart 77
Target 18
Facebook 48
Google 81
'''
# a raw string keeps \s from being read as an invalid escape sequence
df = pd.read_csv(io.StringIO(data), sep=r'\s+')
# bins are right-closed: (0, 10], (10, 50], (50, 100]
df['category'] = pd.cut(df['Uploads'], [0, 10, 50, 100], labels=['<=10', '11-50', '51-100'])
# sort by range (ascending), then by uploads (descending), as the question asks
df = df.sort_values(['category', 'Uploads'], ascending=[True, False])
df = df.set_index(['category', 'Company'])
df
Output:
                       Uploads
category Company
<=10     Tesla               3
11-50    Facebook           48
         Adidas             26
         Tiffany            19
         Target             18
         Nike               11
51-100   Amazon             97
         Google             81
         Walmart            77
         Apple              55
         Ralph Lauren       54
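One thing to watch with pd.cut is what happens at the bin edges. By default the bins are closed on the right, so with edges [0, 10, 50, 100] a value of 0 or anything above 100 falls outside every bin and becomes NaN. The sketch below illustrates that and one possible way around it; the '>100' label and the test values are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

uploads = pd.Series([0, 10, 11, 50, 51, 100, 250])

# Default bins are right-closed: (0, 10], (10, 50], (50, 100].
# 0 and 250 are outside all of them and come back as NaN.
narrow = pd.cut(uploads, [0, 10, 50, 100], labels=['<=10', '11-50', '51-100'])

# include_lowest=True closes the first bin on the left so 0 is kept;
# np.inf as the last edge adds an open-ended top bin ('>100' is an assumed label).
wide = pd.cut(
    uploads,
    [0, 10, 50, 100, np.inf],
    labels=['<=10', '11-50', '51-100', '>100'],
    include_lowest=True,
)

print(pd.DataFrame({'Uploads': uploads, 'narrow': narrow, 'wide': wide}))
```

If every upload count is known to lie between 1 and 100, the shorter edge list is enough.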
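To get the desired output, set the range column and Company as a MultiIndex (the column is named Range in the np.digitize code and category in the pd.cut code):
df.set_index(['Range', 'Company'])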
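Once Range and Company form a MultiIndex, pulling out a single range or summarising per range is straightforward. This is only a sketch on a small assumed sample (five rows that are already labelled); the variable names indexed, mid, and totals are hypothetical.

```python
import pandas as pd

# Assumed sample: a small frame that already has a Range column.
df = pd.DataFrame({
    'Range': ['<=10', '11-50', '11-50', '51-100', '51-100'],
    'Company': ['Tesla', 'Adidas', 'Nike', 'Amazon', 'Apple'],
    'Uploads': [3, 26, 11, 97, 55],
})
indexed = df.set_index(['Range', 'Company'])

# xs selects every row whose Range level equals the given label
# and drops that level from the result.
mid = indexed.xs('11-50', level='Range')
print(mid)

# Total uploads per range; sort=False keeps the ranges in
# first-appearance order instead of sorting the labels alphabetically.
totals = indexed.groupby(level='Range', sort=False)['Uploads'].sum()
print(totals)
```

So groupby is still useful for aggregates, while the MultiIndex alone is enough for the grouped listing the question asks for.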