Is it possible to select columns based on variable labels in R?
For very wide datasets, is it possible to use variable labels to select columns?
library(expss)
data(mtcars)
mtcars = apply_labels(mtcars,
                      mpg = "Miles/(US) gallon",
                      cyl = "Number of cylinders",
                      disp = "Displacement (cu.in.)",
                      hp = "Gross horsepower",
                      drat = "Rear axle ratio",
                      wt = "Weight (1000 lbs)",
                      qsec = "1/4 mile time",
                      vs = "Engine",
                      vs = c("V-engine" = 0,
                             "Straight engine" = 1),
                      am = "Transmission",
                      am = c("Automatic" = 0,
                             "Manual" = 1),
                      gear = "Number of forward gears",
                      carb = "Number of carburetors"
)
mtcars %>%
  select(contains("Miles"))
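This does not work because contains() searches the column names. Can it look at the labels instead?
Edit: I should add, other than by converting the labels into column names.

We can get the "label" attribute of each column and check whether it contains 'Miles':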
library(dplyr)
library(stringr)
mtcars %>%
  select(where(~ str_detect(attributes(.)$label, 'Miles')))
Output:
# mpg
#Mazda RX4 21.0
#Mazda RX4 Wag 21.0
#Datsun 710 22.8
#Hornet 4 Drive 21.4
#Hornet Sportabout 18.7
#Valiant 18.1
#Duster 360 14.3
#Merc 240D 24.4
#Merc 230 22.8
#Merc 280 19.2
#Merc 280C 17.8
#Merc 450SE 16.4
# ..
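
The predicate above relies on every column carrying a label. For a column without one, attributes(.)$label is NULL, so str_detect() returns a zero-length result and where() would likely reject it. As a minimal sketch of one way around that, the block below uses a small stand-in data frame and a helper named has_label; both are assumptions made for illustration and are not part of the original answer. Labels are set with base R's attr() so the sketch does not need expss.

```r
library(dplyr)

# Stand-in data frame (illustrative only); labels are set with base R.
df <- data.frame(mpg = c(21.0, 22.8), cyl = c(6, 4), wt = c(2.62, 2.32))
attr(df$mpg, "label") <- "Miles/(US) gallon"
attr(df$cyl, "label") <- "Number of cylinders"
# df$wt is deliberately left without a label.

# Hypothetical helper: TRUE only when x has a "label" attribute that
# matches `pattern`; always returns a single TRUE/FALSE.
has_label <- function(x, pattern) {
  lab <- attr(x, "label", exact = TRUE)  # exact = TRUE avoids matching "labels"
  !is.null(lab) && any(grepl(pattern, lab))
}

by_label <- df %>%
  select(where(function(x) has_label(x, "Miles")))

names(by_label)
```

Using attr(x, "label", exact = TRUE) rather than attributes(x)$label also avoids partial matching against the "labels" attribute that holds value labels for columns such as vs and am.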
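Or, using base R (R 4.1.0 or later, for the native pipe and lambda syntax), loop over the columns with lapply, extract the label attribute, use grep to return the elements that match the pattern 'Miles', get their names, and pass those to the select argument of subset: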
mtcars |>
  lapply(\(x) attributes(x)$label) |>
  grep(pattern = 'Miles', value = TRUE) |>
  names() |>
  {\(x) subset(mtcars, select = x)}()
Output:
# mpg
#Mazda RX4 21.0
#Mazda RX4 Wag 21.0
#Datsun 710 22.8
#Hornet 4 Drive 21.4
#Hornet Sportabout 18.7
#Valiant 18.1
#Duster 360 14.3
#Merc 240D 24.4
#Merc 230 22.8
#Merc 280 19.2
#Merc 280C 17.8
#Merc 450SE 16.4
# ..
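
For a very wide dataset it can help to build the label lookup once and reuse it for several searches. The following is a sketch of the same base R steps (extract labels, match a pattern, take the names, subset), written so that unlabelled columns are handled explicitly. The stand-in data frame and the helper name get_labels are assumptions for illustration, not something from the original answer.

```r
# Stand-in data frame (illustrative only); labels are set with base R.
df <- data.frame(mpg = c(21.0, 22.8), cyl = c(6, 4), wt = c(2.62, 2.32))
attr(df$mpg, "label") <- "Miles/(US) gallon"
attr(df$cyl, "label") <- "Number of cylinders"
# df$wt has no label.

# Hypothetical helper: named character vector with one label per column,
# NA for columns that have no "label" attribute.
get_labels <- function(data) {
  vapply(data, function(x) {
    lab <- attr(x, "label", exact = TRUE)
    if (is.null(lab)) NA_character_ else as.character(lab)[1]
  }, character(1))
}

labs <- get_labels(df)

# grepl() returns FALSE for NA entries, so unlabelled columns are skipped.
keep <- names(labs)[grepl("Miles", labs, fixed = TRUE)]

# drop = FALSE keeps the result a data frame even for a single column.
df[, keep, drop = FALSE]
```

The labs vector can be searched repeatedly with different patterns without looping over the columns again.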