R: create a variable by rule: the first number before "/"

I want to scrape some data. This is what I have so far:

library(XML)
library(dplyr)
theurl <- "http://www.iie.org/Research-and-Publications/Open-Doors/Data/International-Students/Enrollment-Trends/1948-2012"
tables <- readHTMLTable(theurl)
trends <- tables[[1]][3:67,] %>% rename("International Students"=V2, "Annual % Change"=V3, "Total Enrollment"=V4, "% Int'l"=V5) %>% 
  mutate(Year = strsplit(x = as.character(V1), "/"))

I'm not sure whether you want to work with column V1 or Year, but here are two approaches that can be applied to either column:

# Using a regular expression: match the first run of four consecutive digits,
# keep it, and drop everything that follows it.
trends$Year = gsub("([0-9]{4}).*", "\\1", trends$Year)

# Using the substr function: Subset the first four characters in the string.
trends$Year = substr(trends$Year, 1, 4)
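To see both approaches side by side without scraping the page, here is a minimal sketch on a small made-up vector. The name `year_range` and its values are assumptions for illustration; they only mimic the "1948/49" style that the question's `strsplit(..., "/")` call suggests. A third variant based on `strsplit` is included because the question started from that function; it is an alternative, not part of the answer above.

```r
# Hypothetical sample values in the assumed "YYYY/YY" format
year_range <- c("1948/49", "1949/50", "2011/12")

# Approach 1: regular expression. Capture the first four digits in a row
# and replace the match (those digits plus everything after) with the capture.
by_regex <- gsub("([0-9]{4}).*", "\\1", year_range)

# Approach 2: take the first four characters of each string.
by_substr <- substr(year_range, 1, 4)

# Alternative: split on "/" and keep the first piece of each element.
# strsplit() returns a list, so vapply() pulls out one string per element.
by_split <- vapply(strsplit(year_range, "/", fixed = TRUE), `[`, character(1), 1)

# All three yield character vectors; convert if a numeric year is needed.
year_int <- as.integer(by_regex)

print(by_regex)
print(year_int)
```

The `substr` version is the simplest but assumes the year always occupies the first four characters; the regular expression and `strsplit` versions rely on the digits or the "/" separator instead.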