R: How to remove certain characters from every row of a column


I have this data table:

Year    GDP
1998–99 <U+20B9>1,668,739
1999–00 <U+20B9>1,858,205
2000–01 <U+20B9>2,000,743
2001–02 <U+20B9>2,175,260
2002–03 <U+20B9>2,343,864
2003–04 <U+20B9>2,625,819
2004–05 <U+20B9>2,971,464
2005–06 <U+20B9>3,390,503
2006–07 <U+20B9>3,953,276
2007–08 <U+20B9>4,582,086
2008–09 <U+20B9>5,303,567
2009–10 <U+20B9>6,108,903
2010–11 <U+20B9>7,248,860
2011–12 <U+20B9>8,391,691
2012–13 <U+20B9>9,388,876
These did not work for me.

What I want is something like this:

Year    GDP
1998–99 1,668,739
1999–00 1,858,205
2000–01 2,000,743
2001–02 2,175,260
2002–03 2,343,864
2003–04 2,625,819
2004–05 2,971,464
2005–06 3,390,503
2006–07 3,953,276
2007–08 4,582,086
2008–09 5,303,567
2009–10 6,108,903
2010–11 7,248,860
2011–12 8,391,691
2012–13 9,388,876
I also tried the solution below, but it did not work for me either...


A data.table approach, tried on some sample data:

library(data.table)  # provides setDT() and the := operator

data <- setDT(data.frame(
 Year=c('1998–99', 
     '1999–00', 
     '2000–01', 
     '2001–02', 
     '2002–03', 
     '2003–04', 
     '2004–05', 
     '2005–06', 
     '2006–07', 
     '2007–08'),
 GDP=c('<U+20B9>1,668,739',
    '<U+20B9>1,858,205',
    '<U+20B9>2,000,743',
    '<U+20B9>2,175,260',
    '<U+20B9>2,343,864',
    '<U+20B9>2,625,819',
    '<U+20B9>2,971,464',
    '<U+20B9>3,390,503',
    '<U+20B9>3,953,276',
    '<U+20B9>4,582,086')))

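# Strip the leading "<U+XXXX>" marker from GDP, updating the column by reference
data[, GDP := sub("^\\s*<U\\+\\w+>\\s*", "", GDP)]

The pattern matches optional leading whitespace, then the literal text <U+ followed by word characters and a closing >, at the start of each value; the trailing \\s* then removes any whitespace that follows.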
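To see what each part of that pattern does, here is a minimal base R sketch. The vector gdp_literal is a hypothetical example, not the asker's data, and it assumes the values really contain the literal text <U+20B9> rather than the rupee sign itself.

```r
# Hypothetical sample values holding the literal text "<U+20B9>"
gdp_literal <- c("<U+20B9>1,668,739", "  <U+20B9> 1,858,205")

# ^\\s*       optional whitespace at the start of the value
# <U\\+\\w+>  the literal text "<U+", one or more word characters, then ">"
# \\s*        optional whitespace after the marker
gdp_clean <- sub("^\\s*<U\\+\\w+>\\s*", "", gdp_literal)

print(gdp_clean)
```

Both values come back without the marker or the surrounding whitespace, while the thousands separators are left untouched.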


The minimal answer to the question above is:

df$GDP <- substring(df$GDP, 2)

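The sub() approach does not work on the real data. The sample data uses the literal text string <U+20B9>, which is how R displays (but does not store) Unicode characters (for example, type "\u20b9" at the console). Therefore, applying sub to the literal text <U+20B9> does not work.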
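The difference between the displayed form and the stored character can be checked directly. The sketch below is an illustration under assumptions: df is a hypothetical two-row data frame with only the GDP column, built with the "\u20b9" escape so that it holds the real rupee sign, and the column names GDP_sub and GDP_fixed are made up for the comparison.

```r
# Hypothetical data holding the actual rupee sign (U+20B9), written as an escape
df <- data.frame(
  GDP = c("\u20b91,668,739", "\u20b91,858,205"),
  stringsAsFactors = FALSE
)

# The rupee sign is stored as a single character, not as the 8-character text "<U+20B9>"
n_chars <- nchar(df$GDP[1])

# Option 1: drop the first character of every value
df$GDP_sub <- substring(df$GDP, 2)

# Option 2: remove the rupee sign itself, matched as fixed text rather than a regex
df$GDP_fixed <- gsub("\u20b9", "", df$GDP, fixed = TRUE)

print(df)
```

Option 1 relies on the symbol always being the first character; option 2 does not depend on position, which may be safer if some rows lack the symbol.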