Generate combinations of 8 names from one column of an R data frame, based on conditions on other columns of the same data frame
I have a data frame with 20 players from 4 different teams (5 players per team), each of whom has been assigned a salary from a fantasy draft. I would like to create all combinations of 8 players whose combined salary is at or below 10000 and whose total points are greater than x, excluding any combination that has 4 or more players from the same team. This is what my data frame looks like:
Team Player K D A LH Points Salary PPS
4 ATN ExoticDeer 6.1 3.3 6.4 306.9 22.209 1622 1.3692
2 ATN Supreme 6.8 5.3 7.1 229.4 21.954 1578 1.3913
1 ATN sasu 3.6 6.4 11.0 95.7 19.357 1244 1.5560
3 ATN eL lisasH 2 2.6 6.1 7.9 29.7 12.037 998 1.2061
5 ATN Nisha 2.7 5.6 7.5 48.2 12.282 955 1.2861
11 CL Swiftending 6.0 5.8 7.8 360.5 22.285 1606 1.3876
13 CL Pajkatt 13.3 7.5 9.3 326.8 37.248 1489 2.5015
15 CL SexyBamboe 6.3 8.5 9.3 168.0 20.660 1256 1.6449
14 CL EGM 2.8 6.0 13.5 78.8 21.988 989 2.2233
12 CL Saksa 2.5 6.5 10.5 59.8 15.898 967 1.6441
51 DBEARS Ace 7.0 3.4 6.9 195.6 23.596 1578 1.4953
31 DBEARS HesteJoe 5.4 5.4 6.1 176.7 16.927 1512 1.1195
61 DBEARS Miggel 2.8 6.8 11.0 141.8 17.818 1212 1.4701
21 DBEARS Noia 3.0 6.0 8.0 36.1 13.161 970 1.3568
41 DBEARS Ryze 2.7 4.7 6.7 74.6 12.166 937 1.2984
8 GB Keyser Soze 6.0 5.0 5.6 316.0 19.120 1602 1.1935
9 GB Madara 5.4 5.3 6.6 334.5 19.405 1577 1.2305
10 GB SkyLark 1.8 5.3 7.0 71.8 10.218 1266 0.8071
7 GB MNT 2.3 5.9 6.1 85.6 9.316 1007 0.9251
6 GB SKANKS224 1.4 7.6 7.4 52.5 7.565 954 0.7930
I followed the general concept described in another post, adapting the code to my needs. This is what I have so far:
## make a list of all combinations of 8 of Player, Points and Salary
xx <- with(FantasyPlayers, lapply(list(as.character(Player), Points, Salary), combn, 8))
## convert the names to a string,
## find the column sums of the others,
## set the names
yy <- setNames(
lapply(xx, function(x) {
if(typeof(x) == "character") apply(x, 2, toString) else colSums(x)
}),
names(FantasyPlayers)[c(2, 7, 8)]
)
## coerce to data.frame
newdf <- as.data.frame(yy)
Here is one approach:
splt.names <- strsplit(as.character(newdf$Player), ", ")
indices <- lapply(splt.names, function(x) match(x, FantasyPlayers$Player))
exclude <- lapply(indices, function(x) any(table(FantasyPlayers$Team[x]) > 3))
newdf2 <- newdf[!unlist(exclude), ]
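The code above filters only on team membership. As a minimal sketch of applying all three conditions at once, the example below enumerates combinations by row index, so team, salary and points can be checked without splitting name strings. The table `toy`, the combination size of 3, the limit of one player per team, the cap of 380 and the threshold of 100 are all made up for illustration and are not from the question.

```r
# Toy data (hypothetical): 6 players on 3 teams
toy <- data.frame(
  Team   = c("A", "A", "B", "B", "C", "C"),
  Player = c("p1", "p2", "p3", "p4", "p5", "p6"),
  Points = c(10, 20, 30, 40, 50, 60),
  Salary = c(100, 110, 120, 130, 140, 150),
  stringsAsFactors = FALSE
)

# Each column of idx holds the row numbers of one 3-player combination
idx <- combn(nrow(toy), 3)

res <- data.frame(
  Player   = apply(idx, 2, function(i) toString(toy$Player[i])),
  Points   = apply(idx, 2, function(i) sum(toy$Points[i])),
  Salary   = apply(idx, 2, function(i) sum(toy$Salary[i])),
  # largest number of players drawn from any single team
  max_same = apply(idx, 2, function(i) max(table(toy$Team[i])))
)

# Keep combinations with at most 1 player per team,
# combined salary <= 380 and total points > 100
keep <- subset(res, max_same <= 1 & Salary <= 380 & Points > 100)
```

For the real data the same pattern would use `combn(nrow(FantasyPlayers), 8)`, `max_same <= 3`, `Salary <= 10000` and `Points > x`.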
I think it is best to build this in long format:
Building the teams
library(data.table)
setDT(FantasyPlayers)
xx <- combn(as.character(FantasyPlayers$Player), 8)
mxx <- setDT(melt(xx, varnames=c("jersey_no", "team_no"), value.name="Player"))
head(mxx,10)
# jersey_no team_no Player
# 1: 1 1 ExoticDeer
# 2: 2 1 Supreme
# 3: 3 1 sasu
# 4: 4 1 eL lisasH 2
# 5: 5 1 Nisha
# 6: 6 1 Swiftending
# 7: 7 1 Pajkatt
# 8: 8 1 SexyBamboe
# 9: 1 2 ExoticDeer
# 10: 2 2 Supreme
FantasyTeams <- FantasyPlayers[mxx, on="Player"]
# Team Player K D A LH Points Salary PPS jersey_no team_no
# 1: ATN ExoticDeer 6.1 3.3 6.4 306.9 22.209 1622 1.3692 1 1
# 2: ATN Supreme 6.8 5.3 7.1 229.4 21.954 1578 1.3913 2 1
# 3: ATN sasu 3.6 6.4 11.0 95.7 19.357 1244 1.5560 3 1
# 4: ATN eL lisasH 2 2.6 6.1 7.9 29.7 12.037 998 1.2061 4 1
# 5: ATN Nisha 2.7 5.6 7.5 48.2 12.282 955 1.2861 5 1
# ---
# 1007756: GB Keyser Soze 6.0 5.0 5.6 316.0 19.120 1602 1.1935 4 125970
# 1007757: GB Madara 5.4 5.3 6.6 334.5 19.405 1577 1.2305 5 125970
# 1007758: GB SkyLark 1.8 5.3 7.0 71.8 10.218 1266 0.8071 6 125970
# 1007759: GB MNT 2.3 5.9 6.1 85.6 9.316 1007 0.9251 7 125970
# 1007760: GB SKANKS224 1.4 7.6 7.4 52.5 7.565 954 0.7930 8 125970
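By default, only the first and last few rows of a data.table are printed. To inspect the whole table, try View(FantasyTeams) or see the arguments of print.data.table.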
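As a small illustration of those print arguments, here is a sketch on a made-up 200-row table (`DT` is hypothetical). It relies on the `topn` and `nrows` arguments of print.data.table and assumes the default print options are in effect.

```r
library(data.table)

# Hypothetical table with more rows than the default print limit
DT <- data.table(id = 1:200, value = (1:200)^2)

print(DT)               # abbreviated: first and last rows only
print(DT, topn = 10)    # abbreviated: 10 rows from each end
print(DT, nrows = Inf)  # never abbreviated: all 200 rows are printed
```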
Filtering to a set of teams with selected features
To filter to teams with no more than three players from the same team:
my_teams <- FantasyTeams[, max(table(Team)) <= 3, by=team_no][V1==TRUE]$team_no
To save a few keystrokes and microseconds, replace V1==TRUE with (V1). That is the idiomatic way.
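The roster code later in this answer refers to my_new_teams, which is not defined anywhere in the text as it appears here. One plausible reading, shown as a sketch on made-up data and not as the answer's own code, is that my_new_teams is my_teams further restricted by the salary cap and a points threshold. The table `toy_teams`, the limit of 2 per team, the cap of 250 and the threshold `x` are assumptions.

```r
library(data.table)

# Hypothetical long-format table: one row per (team_no, player)
toy_teams <- data.table(
  team_no = rep(1:3, each = 3),
  Team    = c("A", "A", "A",  "A", "B", "C",  "A", "B", "B"),
  Salary  = c(100, 100, 100,   80,  90,  70,  100, 100, 100),
  Points  = c( 10,  10,  10,   20,  20,  20,    5,   5,   5)
)

# Step 1: at most 2 players from any one team (the answer uses 3),
# written with the (V1) idiom instead of V1==TRUE
my_teams <- toy_teams[, max(table(Team)) <= 2, by = team_no][(V1)]$team_no

# Step 2 (assumed): salary cap and points threshold on the surviving teams
x <- 30
my_new_teams <- toy_teams[
  .(team_no = my_teams),
  .(ok = sum(Salary) <= 250 && sum(Points) > x),
  on = "team_no", by = .EACHI
][(ok)]$team_no
```

Here team 1 fails the per-team limit, team 3 fails the salary cap, and only team 2 survives both steps.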
Recovering the rosters from a set of teams
To get the roster associated with each team, join/merge with mxx:
mxx[.(team_no = my_new_teams), on="team_no"]
If you want the players listed on a single row, as in the OP:
mxx[.(team_no = my_new_teams), .(roster = toString(Player)), on="team_no", by=.EACHI]
If you want summary statistics for each team, you need to join with FantasyTeams:
FantasyTeams[.(team_no = my_new_teams), .(
roster = toString(Player),
tot_salary = sum(Salary),
tot_points = sum(Points)
), on="team_no", by=.EACHI]
# team_no roster tot_salary tot_points
# 1: 3716 ExoticDeer, Supreme, sasu, Swiftending, EGM, Saksa, Noia, Ryze 9913 149.018
# 2: 3720 ExoticDeer, Supreme, sasu, Swiftending, EGM, Saksa, Noia, MNT 9983 146.168
# 3: 3721 ExoticDeer, Supreme, sasu, Swiftending, EGM, Saksa, Noia, SKANKS224 9930 144.417
# 4: 3725 ExoticDeer, Supreme, sasu, Swiftending, EGM, Saksa, Ryze, MNT 9950 145.173
# 5: 3726 ExoticDeer, Supreme, sasu, Swiftending, EGM, Saksa, Ryze, SKANKS224 9897 143.422
# ---
# 40202: 125663 EGM, Saksa, Miggel, Noia, Ryze, Keyser Soze, MNT, SKANKS224 8638 117.032
# 40203: 125664 EGM, Saksa, Miggel, Noia, Ryze, Madara, SkyLark, MNT 8925 119.970
# 40204: 125665 EGM, Saksa, Miggel, Noia, Ryze, Madara, SkyLark, SKANKS224 8872 118.219
# 40205: 125666 EGM, Saksa, Miggel, Noia, Ryze, Madara, MNT, SKANKS224 8613 117.317
# 40206: 125667 EGM, Saksa, Miggel, Noia, Ryze, SkyLark, MNT, SKANKS224 8302 108.130
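Understanding what by=.EACHI is doing takes a little background. The merge syntax here is DT[i, j, on=cols, by=.EACHI]:
- If by and j are omitted, it just does the merge, as in the construction of FantasyTeams.
- If by is omitted but j is included, j is computed after the merge.
- If by=.EACHI, j is computed separately for each value in i.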
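The three cases can be seen side by side on a tiny example. This is only a sketch: `DT` and `i_tbl` are made-up tables, not the ones from the question.

```r
library(data.table)

# Hypothetical data: three small "teams" in long format
DT <- data.table(
  team_no = c(1L, 1L, 2L, 2L, 3L, 3L),
  Salary  = c(10, 20, 30, 40, 50, 60)
)
i_tbl <- data.table(team_no = c(1L, 3L))

# 1. No j, no by: just the merge (rows of DT matching i_tbl)
merged <- DT[i_tbl, on = "team_no"]

# 2. j without by: j is computed once, over all merged rows
total <- DT[i_tbl, sum(Salary), on = "team_no"]

# 3. by = .EACHI: j is computed separately for each row of i_tbl
per_team <- DT[i_tbl, .(tot_salary = sum(Salary)), on = "team_no", by = .EACHI]
```

In this sketch `merged` has four rows, `total` is the single number 140, and `per_team` has one row per requested team with totals 30 and 110.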
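See choose(40,8) for the number of combinations. The ratio choose(40,8)/choose(20,8) shows that you would need roughly 600 times the space.
@GodlikeRoy If this answer worked for you, you can accept it, as described in the help center. If you have a question that differs substantially from the one originally asked, feel free to post it as a new question.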
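The combination counts quoted in the comment about choose(40,8) can be checked directly in base R. The variable names `n20` and `n40` are only for illustration.

```r
n20 <- choose(20, 8)  # 8-player combinations from a pool of 20
n40 <- choose(40, 8)  # 8-player combinations from a pool of 40

n20        # 125970, the highest team_no in the FantasyTeams output
n40        # 76904685
n40 / n20  # 610.5, i.e. roughly 600 times as many combinations
```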