How to count the letters in a string and return the most frequent letter for each row of a data frame in R
I have a column in a data frame made up of letters describing wind direction. For each row I need to find the most common direction, which means counting how many times each letter occurs and then picking the most frequent one. Here is a sample of the data frame:
df <- structure(list(Day = c("15", "16", "17", "18", "19", "20"), Month = structure(c(4L,
4L, 4L, 4L, 4L, 4L), .Label = c("Dec", "Nov", "Oct", "Sep"), class = "factor"),
Year = structure(c(2L, 2L, 2L, 2L, 2L, 2L), .Label = c("2012",
"2013", "2014", "2015", "2018", "2019", "2020"), class = "factor"),
Time = structure(c(10L, 10L, 10L, 10L, 10L, 10L), .Label = c("1-2pm",
"10-11am", "11-12am", "12-1pm", "2-3pm", "3-4pm", "4-5pm",
"5-6pm", "7-8am", "8-9am", "9-10am"), class = "factor"),
Direction_Abrev = c("S-SE", "S-SE", "SW-S", "W-SE", "W-SW",
"SW-S")), row.names = c(NA, 6L), class = "data.frame")
I would like the resulting data frame to look like this:
  Day Month Year  Time Direction_Abrev
1  15   Sep 2013 8-9am               S
2  16   Sep 2013 8-9am               S
3  17   Sep 2013 8-9am               S
4  18   Sep 2013 8-9am            W-SE
5  19   Sep 2013 8-9am               W
6  20   Sep 2013 8-9am               S
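That is, each row should hold its most common letter. There is one catch (as in row 4): sometimes every letter occurs the same number of times. In those cases I would like to return the original value, if possible.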
Thanks in advance.

Here is a base R option using strsplit + intersect:
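transform(
  df,
  Direction_Abrev = unlist(
    ifelse(
      # v holds, per row, the letters shared by both parts of the direction;
      # lengths() is 0 (treated as FALSE) when the parts share no letter
      lengths(
        v <- sapply(
          strsplit(Direction_Abrev, "-"),
          function(x) do.call(intersect, strsplit(x, ""))
        )
      ),
      v,               # shared letter found: use it
      Direction_Abrev  # no shared letter: keep the original value
    )
  )
)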
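
Note that the intersect approach above does not count anything: it looks for letters shared by the two hyphen-separated parts, so it assumes every value has exactly two parts. If a literal count is wanted, as the question describes, something along these lines might work. This is only a minimal sketch: `most_common_letter` is a hypothetical helper name, and it assumes the values are non-empty strings whose non-letter characters (such as the hyphen) can be ignored.

```r
# Hypothetical helper (not from the original answer): return the single most
# frequent letter of a string, or the original string when the top count is tied.
most_common_letter <- function(x) {
  # Drop everything that is not a letter (e.g. "-"), then split into characters
  chars <- strsplit(gsub("[^A-Za-z]", "", x), "")[[1]]
  counts <- table(chars)
  # Letters whose count equals the maximum count
  top <- names(counts)[as.vector(counts) == max(counts)]
  if (length(top) == 1) top else x
}

# Sample values taken from the Direction_Abrev column in the question
dirs <- c("S-SE", "S-SE", "SW-S", "W-SE", "W-SW", "SW-S")
result <- vapply(dirs, most_common_letter, character(1), USE.NAMES = FALSE)
print(result)
```

For the sample values this gives "S", "S", "S", "W-SE", "W", "S", matching the desired output. To update the data frame, the same call could be assigned back, for example `df$Direction_Abrev <- vapply(df$Direction_Abrev, most_common_letter, character(1), USE.NAMES = FALSE)`.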