R: Sorting by flexible criteria
I am building a product recommendation engine in R using collaborative filtering. To keep the more profitable items at the top of the recommendations, we have developed a flexible business rule, shown in figure 1. The recommender output should be sorted according to this business rule.
+---------------+----------+-----------------+
| Sort Priority | Level 1 | Level 2 |
+---------------+----------+-----------------+
| 1 | Brand | Versatile Foods |
+---------------+----------+-----------------+
| | | Agro |
+---------------+----------+-----------------+
| | | Specialty Foods |
+---------------+----------+-----------------+
| | | |
+---------------+----------+-----------------+
| 2 | Category | Dairy |
+---------------+----------+-----------------+
| | | Produce |
+---------------+----------+-----------------+
| | | Seafood |
+---------------+----------+-----------------+
| | | |
+---------------+----------+-----------------+
| 3 | Seasonal | Y |
+---------------+----------+-----------------+
| | | N |
+---------------+----------+-----------------+
figure 1
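Business rules:
When sorting the table, the Brand column takes precedence over Category, which in turn takes precedence over Seasonal. This is determined by the value in the Sort Priority column.
When sorting within the Brand column, Versatile Foods takes precedence over Agro, and Agro takes precedence over Specialty Foods.
If a value in the Brand column does not appear in the rule, those values must be sorted alphabetically. The same sorting logic applies to every entry in the rule definition.
As the recommendation algorithm evolves, the business rule may be changed or edited to have fewer or more levels. For example, an additional Level 1 entry such as Type (Kosher, Vegan, Halal) may be added in the future. The rule would then look like this: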
+---------------+----------+-----------------+
| Sort Priority | Level 1 | Level 2 |
+---------------+----------+-----------------+
| 1 | Brand | Versatile Foods |
+---------------+----------+-----------------+
| | | Agro |
+---------------+----------+-----------------+
| | | Specialty Foods |
+---------------+----------+-----------------+
| | | |
+---------------+----------+-----------------+
| 2 | Category | Dairy |
+---------------+----------+-----------------+
| | | Produce |
+---------------+----------+-----------------+
| | | Seafood |
+---------------+----------+-----------------+
| | | |
+---------------+----------+-----------------+
| 3 | Type | Kosher |
+---------------+----------+-----------------+
| | | Halal |
+---------------+----------+-----------------+
| | | Vegan |
+---------------+----------+-----------------+
| | | |
+---------------+----------+-----------------+
| 4 | Seasonal | Y |
+---------------+----------+-----------------+
| | | N |
+---------------+----------+-----------------+
figure 2
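One way to keep the rule out of the sorting code is to hold it as plain data. The snippet below is only a sketch of that idea: the names `rule_v1` and `rule_v2` are made up for illustration, and the convention (list order is the sort priority, vector order is the Level 2 priority) is an assumption, not something the question prescribes.

```r
# Hypothetical encoding of figure 1: list order = Sort Priority,
# vector order = Level 2 priority within each column.
rule_v1 <- list(
  Brand    = c("Versatile Foods", "Agro", "Specialty Foods"),
  Category = c("Dairy", "Produce", "Seafood"),
  Seasonal = c("Y", "N")
)

# Figure 2 adds a Type entry between Category and Seasonal.
# Only the data changes; no sorting code has to be touched.
rule_v2 <- append(rule_v1,
                  list(Type = c("Kosher", "Halal", "Vegan")),
                  after = 2)

names(rule_v2)  # "Brand" "Category" "Type" "Seasonal"
```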
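I need help building a script in R that sorts the recommender output (loaded into a data frame) according to the business rule described above.
The real problem I want to solve is that I do not want to change the code every time a new entry is added to the rule.
The input data (the output of the recommendation engine) will be data of this kind (figure 3).
Using the rule definition shown in figure 1, the output of the script should look like the table below.
Note how Brand = USA bread, which does not appear in the business rule, is placed at the bottom of the sorted list.
Also, for SKUs 4 and 6, the record with Category 'Produce' is placed above the record with Category 'Meat', because 'Meat' is not found in the business rule while 'Produce' is.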
+-----+-----------------+----------+----------+
| SKU | Brand | Category | Seasonal |
+-----+-----------------+----------+----------+
| 1 | Versatile Foods | Dairy | Y |
+-----+-----------------+----------+----------+
| 7 | Versatile Foods | Seafood | N |
+-----+-----------------+----------+----------+
| 10 | Versatile Foods | Seafood | N |
+-----+-----------------+----------+----------+
| 2 | Agro | Produce | Y |
+-----+-----------------+----------+----------+
| 4 | Agro | Produce | N |
+-----+-----------------+----------+----------+
| 6 | Agro | Meat | N |
+-----+-----------------+----------+----------+
| 3 | Specialty Foods | Seafood | N |
+-----+-----------------+----------+----------+
| 9 | Specialty Foods | Seafood | N |
+-----+-----------------+----------+----------+
| 5 | Specialty Foods | Organic | Y |
+-----+-----------------+----------+----------+
| 8 | USA bread | Bakery | Y |
+-----+-----------------+----------+----------+
figure 4
You can use factor encoding to sort in whatever order you like. For example:
> lvl <- c('Versatile Foods', 'Agro', 'Specialty Foods')
> lvl <- append(lvl, sort(setdiff(unique(df$Brand), lvl)))
>
> df$Brand <- factor(df$Brand, levels=lvl)
>
> lvl <- c("Dairy", "Produce", "Seafood")
> lvl <- append(lvl, sort(setdiff(unique(df$Category), lvl)))
>
> df$Category <- factor(df$Category, levels=lvl)
>
> df$Seasonal <- factor(df$Seasonal, levels=c('Y', 'N'))
>
>
> df[order(df$Brand, df$Category, df$Seasonal), ]
SKU Brand Category Seasonal
1 1 Versatile Foods Dairy Y
7 7 Versatile Foods Seafood N
10 10 Versatile Foods Seafood N
2 2 Agro Produce Y
4 4 Agro Produce N
6 6 Agro Produce N
3 3 Specialty Foods Seafood N
9 9 Specialty Foods Seafood N
5 5 Specialty Foods Organic Y
8 8 USA Bread Bakery Y
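The factor approach can be generalised so that adding a rule entry does not require a code change. What follows is a minimal sketch, not the answerer's code: `sort_by_rule` and the list-shaped `rule` are hypothetical names, and the sample data frame is reconstructed from figure 4 purely for illustration. Columns named in the rule but absent from the data (here `Type`) are skipped.

```r
# Illustrative input, reconstructed from figure 4 (not the real recommender output)
df <- data.frame(
  SKU      = 1:10,
  Brand    = c("Versatile Foods", "Agro", "Specialty Foods", "Agro",
               "Specialty Foods", "Agro", "Versatile Foods", "USA bread",
               "Specialty Foods", "Versatile Foods"),
  Category = c("Dairy", "Produce", "Seafood", "Produce", "Organic",
               "Meat", "Seafood", "Bakery", "Seafood", "Seafood"),
  Seasonal = c("Y", "Y", "N", "N", "Y", "N", "N", "Y", "N", "N"),
  stringsAsFactors = FALSE
)

# Assumed rule format: list order = column priority, vector order = value priority
rule <- list(
  Brand    = c("Versatile Foods", "Agro", "Specialty Foods"),
  Category = c("Dairy", "Produce", "Seafood"),
  Type     = c("Kosher", "Halal", "Vegan"),
  Seasonal = c("Y", "N")
)

sort_by_rule <- function(df, rule) {
  # Use only the rule columns that actually exist in the data
  cols <- intersect(names(rule), names(df))
  if (length(cols) == 0) return(df)
  keys <- lapply(cols, function(col) {
    values <- as.character(df[[col]])
    listed <- rule[[col]]
    # Values missing from the rule go after the listed ones, alphabetically
    extras <- sort(setdiff(unique(values), listed))
    factor(values, levels = c(listed, extras))
  })
  # order() sorts factors by level position; ties keep their original order
  df[do.call(order, unname(keys)), , drop = FALSE]
}

sorted <- sort_by_rule(df, rule)
sorted$SKU  # 1 7 10 2 4 6 3 9 5 8, the order shown in figure 4
```

Because the column list is taken from `names(rule)`, the `order()` call never has to be edited when the rule grows or shrinks.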
This approach involves defining a sort-rank table, joining it to the main table, and then sorting on the new rank column.
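library(dplyr)
# Rank table: explicit sort position for each brand named in the business rule
rank <- tibble(Brand = c('Versatile Foods','Agro','Specialty Foods'),
               Brand_rank = c(1,2,3))
# Brands missing from the rank table get NA, which arrange() places last
df <- left_join(df, rank, by = "Brand") %>%
  arrange(Brand_rank, Brand, Category, Seasonal) %>%
  select(-Brand_rank)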
df
# A tibble: 10 × 4
# SKU Brand Category Seasonal
# <dbl> <chr> <chr> <chr>
#1 1 Versatile Foods Dairy Y
#2 7 Versatile Foods Seafood N
#3 10 Versatile Foods Seafood N
#4 4 Agro Produce N
#5 6 Agro Produce N
#6 2 Agro Produce Y
#7 5 Specialty Foods Organic Y
#8 3 Specialty Foods Seafood N
#9 9 Specialty Foods Seafood N
#10 8 USA Bread Bakery Y
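The same rank idea can be applied to every column named in the rule rather than to Brand alone. The sketch below uses base R's `match()` in place of a join; `sort_by_rank`, the list-shaped `rule`, and the four-row sample are hypothetical and only meant to illustrate the technique.

```r
# Small illustrative input (made up for this sketch)
df <- data.frame(
  SKU      = 1:4,
  Brand    = c("USA bread", "Agro", "Agro", "Versatile Foods"),
  Category = c("Bakery", "Meat", "Produce", "Seafood"),
  stringsAsFactors = FALSE
)

# Assumed rule format: list order = column priority, vector order = value priority
rule <- list(
  Brand    = c("Versatile Foods", "Agro", "Specialty Foods"),
  Category = c("Dairy", "Produce", "Seafood")
)

sort_by_rank <- function(df, rule) {
  keys <- list()
  for (col in intersect(names(rule), names(df))) {
    values <- as.character(df[[col]])
    # Rank = position in the rule; unlisted values get NA, which order() puts last
    keys[[length(keys) + 1]] <- match(values, rule[[col]])
    # Tie-breaker so that unlisted values end up in alphabetical order
    keys[[length(keys) + 1]] <- values
  }
  if (length(keys) == 0) return(df)
  df[do.call(order, keys), , drop = FALSE]
}

sorted <- sort_by_rank(df, rule)
sorted$SKU  # 4 3 2 1
```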
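Thank you. This partially solves my problem. However, the business rule is dynamic and may have more levels, e.g. Type (Kosher, Vegan, Halal), etc.
I thought you wanted the extra values sorted alphabetically at the end?
Sorry if my comment was unclear. What I mean is that we may change the business rule and add Type (Kosher, Vegan, Halal) to it. I do not want to hard-code the factors in df[order(df$Brand, df$Category, df$Seasonal, df$Type), ], because the list may change at any time. Thanks for your help.
I think I can read the columns dynamically and get around this problem.
I edited my question to make it clearer.
I edited my question. Can someone take it off hold?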
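Reading the rule dynamically could look roughly like the sketch below, which turns a rule table shaped like figure 1 into a named list of priorities. The column names `priority`, `level1`, and `level2`, and the helper `rule_from_table`, are assumptions made for illustration; in practice the table might come from a CSV or a database.

```r
# Hypothetical rule table mirroring figure 1, including one blank spacer row
rule_table <- data.frame(
  priority = c(1, NA, NA, NA, 2, NA, NA, 3, NA),
  level1   = c("Brand", "", "", "", "Category", "", "", "Seasonal", ""),
  level2   = c("Versatile Foods", "Agro", "Specialty Foods", "",
               "Dairy", "Produce", "Seafood", "Y", "N"),
  stringsAsFactors = FALSE
)

rule_from_table <- function(tbl) {
  # Carry each Level 1 name down over the blank cells below it
  filled <- tbl$level1
  for (i in seq_along(filled)) {
    if (i > 1 && (is.na(filled[i]) || filled[i] == "")) {
      filled[i] <- filled[i - 1]
    }
  }
  # Drop spacer rows that carry no Level 2 value
  keep <- !is.na(tbl$level2) & tbl$level2 != ""
  # Sort Priority of each Level 1 column = its smallest non-missing priority
  prio <- vapply(split(tbl$priority, filled), min, numeric(1), na.rm = TRUE)
  cols <- names(prio)[order(prio)]
  # One character vector per column, kept in the row order of the table
  split(tbl$level2[keep], factor(filled[keep], levels = cols))
}

rule <- rule_from_table(rule_table)
names(rule)  # "Brand" "Category" "Seasonal"
```

The resulting list can then drive the sort, so editing the rule table is the only change needed when a level is added or removed.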