MySQL: problems reading data in R with read.csv and read.table
I have data that was exported from MySQL using the following command:
SELECT
id_code,info_text INTO OUTFILE '/tmp/company-desc.csv'
FIELDS TERMINATED BY ';'
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
FROM
dx_company WHERE LENGTH(id_code) = 8 AND
id_code REGEXP '^[0-9]+$';
But when I try to load the CSV in R with the following command:
dt.companydesc <- read.csv("company-desc.csv",sep=';',fill=T, encoding = "UTF-8",quote="\n",header=FALSE)
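some of the IDs get mixed up with the descriptions.
The read basically has trouble with the quotes and with \n. If I specify both, the whole table gets scrambled.
I also tried gsub and readLines.
Any help would be appreciated.
Snapshot of the CSV file: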
"1000004";"general"
"1000000";"licensed version, and products"
"1000007";""
"1000003";""
"1000002";""
"1000006";""
"1000002";"automobiles; well organised"
Expected output:
Id_code Description
1000004 general
1000000 licensed version, and products
1000007 NA
1000003 NA
1000002 NA
1000006 NA
1000002 automobiles and industry; well organised
Here is an approach using data.table::fread, which is also faster:
require(data.table) # v1.9.6+
fread(' "1000004";"general"
"1000000";"licensed version, and products"
"1000007";""
"1000003";""
"1000002";""
"1000006";""
"1000002";"automobiles; well organised"', na.strings="",
header=FALSE, col.names=c("Id_code", "Description"))
# Id_code Description
# 1: 1000004 general
# 2: 1000000 licensed version, and products
# 3: 1000007 NA
# 4: 1000003 NA
# 5: 1000002 NA
# 6: 1000006 NA
# 7: 1000002 automobiles; well organised
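Comments:
Please post a sample along with the expected output. My guess is that your quote argument is wrong, but I can't be sure without seeing a sample of the CSV file.
quote="\n" looks a bit odd.
You told MySQL that the delimiter is a comma, and then used ; when calling read.csv. Are you sure about that?
Sorry, I have corrected the MySQL statement. I had actually tried several different delimiters.
I can read the snapshot easily with read.csv2.
They should probably skip the CSV step altogether and query the database from R.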
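To illustrate the comments about the quote argument, here is a minimal base R sketch that reads the snapshot with quote set to the double-quote character rather than "\n". It uses the sample rows from the question passed through text=; the na.strings="" and col.names settings are assumptions chosen to match the expected output, not something the original poster confirmed. Descriptions that contain escaped quotes or embedded newlines in the real export may still need extra handling.

```r
# Sample rows copied from the CSV snapshot in the question.
csv_text <- '"1000004";"general"
"1000000";"licensed version, and products"
"1000007";""
"1000003";""
"1000002";""
"1000006";""
"1000002";"automobiles; well organised"'

dt.companydesc <- read.csv(
  text = csv_text,        # for the real export, pass the file path instead of text=
  sep = ";",              # matches FIELDS TERMINATED BY ';'
  quote = "\"",           # matches OPTIONALLY ENCLOSED BY '"'
  header = FALSE,
  na.strings = "",        # assumption: empty descriptions should become NA
  col.names = c("Id_code", "Description"),
  stringsAsFactors = FALSE
)

print(dt.companydesc)
```

Because the fields are quoted, the ; inside "automobiles; well organised" stays part of the description instead of starting a new column.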