R: reading a CSV file with too many commas


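Maybe this is simple, but I have a csv file that contains a lot of commas, and R cannot read it correctly: it puts all the data into the first column instead of showing it as a table.

Do you know how I can read this kind of csv file correctly?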


Delete rows 1-3 and 238-241 in the csv file itself.
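If editing the file by hand is not convenient, the same kind of trimming can be done in R before parsing. The following is only a minimal sketch on a made-up miniature file (the title line, the note line, and the line positions are assumptions for illustration, not the real layout of the asker's file): read the raw lines, drop the non-data lines, and pass the rest to read.csv through its text argument.

```r
# Toy file: title lines on top, notes at the bottom (layout is hypothetical).
raw <- c(
  "Gross domestic product 2012,,,",
  ",,,",
  "Code,Ranking,Economy,USD",
  "USA,1,United States,\"16,244,600\"",
  "CHN,2,China,\"8,227,103\"",
  ",,,",
  "Note: toy footer line,,,"
)
path <- tempfile(fileext = ".csv")
writeLines(raw, path)

# Read the file as plain text so that nothing is parsed yet.
lines <- readLines(path)

# Drop the two leading and two trailing lines (positions fit this toy file only).
keep <- lines[-c(1:2, 6:7)]

# Parse the remaining lines; the first kept line becomes the header.
dat <- read.csv(text = keep, stringsAsFactors = FALSE)

# The quoted amounts contain thousands separators, so strip them before converting.
dat$USD <- as.numeric(gsub(",", "", dat$USD))
print(dat)
```

For a real file, the line positions would have to be found first, for example by printing head(lines) and tail(lines).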

Use the rio library:

library(rio)
data = import("filename")

The data contains a lot of blank rows, so you can keep only the rows without any NA using
data[rowSums(is.na(data))==0,]
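Note that this expression removes every row that has at least one NA, not only the rows that are blank throughout. A small sketch on a made-up data frame (the column names and values are assumptions for illustration) shows the difference between the two filters:

```r
# Toy data: row 2 is entirely blank, row 3 is only partly blank.
data <- data.frame(
  code = c("USA", NA, "CHN"),
  rank = c(1, NA, NA),
  stringsAsFactors = FALSE
)

# Keep rows with no NA at all (the expression from the answer).
no_na <- data[rowSums(is.na(data)) == 0, ]

# Keep rows with at least one non-NA value, i.e. drop only fully blank rows.
not_blank <- data[rowSums(is.na(data)) < ncol(data), ]

print(no_na)
print(not_blank)
```

Which filter fits depends on whether partly filled rows should survive. Also keep in mind that read.csv reads empty character fields as "" rather than NA unless na.strings includes "", so blank rows may not show up as NA at all.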

Well, cleaning up the file takes a while; luckily I have the time:

gdp2012 <- read.csv("getdata_data_GDP.csv", stringsAsFactors = FALSE)
cnames <- gdp2012[3, ]
names(cnames) <- NULL
cnames[1] <- "Abbrev"
cnames[5] <- "Millions.USD"
names(gdp2012) <- cnames
names(gdp2012)
[1] "Abbrev"       "Ranking"      "NA"           "Economy"          "Millions.USD"
[6] ""             "NA"           "NA"           "NA"           "NA"

gdp2012 <- gdp2012[, -grep("NA", names(gdp2012))]
gdp2012 <- gdp2012[, -ncol(gdp2012)]
gdp2012 <- gdp2012[-c(1:4, 237:nrow(gdp2012)), ]

dim(gdp2012)
[1] 232   4

str(gdp2012)
'data.frame':   232 obs. of  4 variables:
 $ Abbrev      : chr  "USA" "CHN" "JPN" "DEU" ...
 $ Ranking     : chr  "1" "2" "3" "4" ...
 $ Economy     : chr  "United States" "China" "Japan" "Germany" ...
 $ Millions.USD: chr  " 16,244,600 " " 8,227,103 " " 5,959,718 " "  3,428,131 " ...

gdp2012[[4]] <- as.numeric(gsub(",", "", gdp2012[[4]]))
Warning message:
NAs introduced by coercion

gdp2012[[2]] <- as.numeric(gdp2012[[2]])

head(gdp2012)
   Abbrev Ranking        Economy Millions.USD
5     USA       1  United States     16244600
6     CHN       2          China      8227103
7     JPN       3          Japan      5959718
8     DEU       4        Germany      3428131
9     FRA       5         France      2612878
10    GBR       6 United Kingdom      2471784
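The "NAs introduced by coercion" warning in the transcript above means that some entries were still not numeric after the commas were removed. A minimal sketch for finding such entries, using a made-up vector (the ".." and empty entries are assumptions about what such a column might contain):

```r
# Toy column: two real amounts plus two placeholders that are not numbers.
x <- c(" 16,244,600 ", " 8,227,103 ", "..", "")

# Remove thousands separators and surrounding whitespace.
cleaned <- trimws(gsub(",", "", x))

# Convert; suppressWarnings hides the coercion warning that ".." would trigger.
values <- suppressWarnings(as.numeric(cleaned))

# Entries that failed to convert, shown in their original form.
bad <- is.na(values)
print(x[bad])
print(values)
```

Inspecting x[bad] before discarding the NAs makes it easier to tell placeholders apart from genuinely malformed numbers.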
Use the fread function from data.table; it has many import arguments and is faster:

library(data.table)

res <- fread("myFile.csv",
             sep = ",", # separator is comma
             skip = 5,  # skip first 5 rows
             select = c(1, 2, 4, 5), # select columns by index
             na.strings = c("", ".."), # convert blanks to NA
             # set column names
             col.names = c("Country", "Ranking", "Economy", "USD_Mln"))

# remove blank rows
res <- res[ !is.na(Country), ]

# convert character numbers to numbers
res[ , USD_Mln := as.numeric(gsub(",", "", USD_Mln))]

head(res)
#    Country Ranking        Economy  USD_Mln
# 1:     USA       1  United States 16244600
# 2:     CHN       2          China  8227103
# 3:     JPN       3          Japan  5959718
# 4:     DEU       4        Germany  3428131
# 5:     FRA       5         France  2612878
# 6:     GBR       6 United Kingdom  2471784
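Welcome to Stack Overflow. Please take a look. Please share your code and part of the data.

Thank you very much!! I am just starting with R, and this kind of file is not easy for me...

For me it is still Friday, during working hours... haha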