R: Problem reading a tab-delimited file

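Apologies in advance for the simple question. I'm having trouble reading a tab-delimited file. R thinks line 164 is missing elements, but I can't see why. When I copy and paste the data into Excel, it separates fine.

Data: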

  temp <- tempfile()
  download.file("https://www.fda.gov/downloads/Drugs/InformationOnDrugs/UCM527389.zip",temp)

Consider read.delim which, like read.csv, is one of the wrappers around the more general read.table function in the built-in utils package.
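To see what the wrapper changes, the default arguments of the two functions can be compared directly. This is only a quick inspection sketch; the variable names are illustrative and not part of the original answer.

```r
# Arguments that matter for tab-delimited text with free-form fields.
args_of_interest <- c("header", "sep", "quote", "fill", "comment.char")

# formals() returns the default argument values of a function.
defaults_table <- formals(read.table)[args_of_interest]
defaults_delim <- formals(read.delim)[args_of_interest]

# Print the two sets of defaults for comparison.
str(defaults_table)
str(defaults_delim)
```

read.delim presets a tab separator, treats only the double quote as a quoting character, fills short rows, and disables comment characters, whereas read.table treats both quote characters and "#" as special.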

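The longer fields, DrugName and ActiveIngredient, appear to have problems with quotes and blank lines, so the fill, quote, and comment.char arguments need to be adjusted from the read.table defaults.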
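The effect of these three arguments can be reproduced on a tiny in-memory sample. The rows below are made up for illustration and are not taken from Products.txt; the sketch assumes the troublesome fields contain an apostrophe or a "#", which is a guess about the real file.

```r
# Hypothetical three-row sample; the values are invented for illustration.
txt <- paste(
  "ApplNo\tDrugName\tStrength",
  "1\tBAYER'S ASPIRIN\t325MG",   # apostrophe inside a field
  "2\tDRUG #2\t10MG",            # '#' inside a field
  "3\tPLAIN\t",                  # empty trailing field
  sep = "\n"
)

# read.table() defaults: quote = "\"'", comment.char = "#", fill = FALSE.
# Wrapped in tryCatch() because these defaults may stop with an error here;
# in that case the error message is kept instead of a data frame.
default_read <- tryCatch(
  read.table(text = txt, sep = "\t", header = TRUE, stringsAsFactors = FALSE),
  error = function(e) conditionMessage(e)
)

# Same call with the read.delim()-style settings.
adjusted_read <- read.table(text = txt, sep = "\t", header = TRUE,
                            quote = "\"", fill = TRUE, comment.char = "",
                            stringsAsFactors = FALSE)

print(default_read)
print(adjusted_read)
```

With the adjusted settings the apostrophe and the "#" are kept as ordinary characters, so all three rows come through intact.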

df <- read.delim(unz(temp, "Products.txt"), sep="\t", header= TRUE) 
The read.table equivalent, with the relevant argument defaults adjusted:

df <- read.table(unz(temp, "Products.txt"), sep="\t", quote = "\"", fill = TRUE,
                 comment.char = "", header= TRUE) 
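When R reports that a particular line is missing elements, count.fields from utils can help locate the lines whose field count differs from the header. The sketch below runs on a made-up in-memory sample rather than the real file, and the variable names are illustrative.

```r
# Hypothetical sample; the second data row contains a '#' inside a field.
txt <- paste(
  "ApplNo\tDrugName\tStrength",
  "1\tASPIRIN\t325MG",
  "2\tDRUG #2\t10MG",
  sep = "\n"
)

# Field counts per line under the read.table() defaults
# (quote = "\"'" and comment.char = "#").
con <- textConnection(txt)
n_default <- count.fields(con, sep = "\t")
close(con)

# Field counts per line with the adjusted settings.
con <- textConnection(txt)
n_adjusted <- count.fields(con, sep = "\t", quote = "\"", comment.char = "")
close(con)

print(n_default)
print(n_adjusted)

# Lines whose count differs from the header's count are the ones to inspect.
print(which(n_default != n_default[1]))
```

For a file on disk, the path or a connection such as unz(temp, "Products.txt") would take the place of textConnection(txt).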

Thanks, I just figured it out. Thank you.
df <- read.delim(unz(temp, "Products.txt"), sep="\t", header= TRUE) 
str(df)
# 'data.frame': 37850 obs. of  8 variables:
#  $ ApplNo           : int  4 159 552 552 552 552 552 552 552 552 ...
#  $ ProductNo        : num  4 1 1 2 3 4 5 7 8 9 ...
#  $ Form             : Factor w/ 348 levels "AEROSOL, FOAM;RECTAL",..: 203 331 121 121 121 121 121 121 121 121 ...
#  $ Strength         : Factor w/ 4065 levels ""," EQ 5MG BASE/ML",..: 525 2491 1453 2240 2447 538 654 670 538 2447 ...
#  $ ReferenceDrug    : int  0 0 0 0 0 0 0 0 0 0 ...
#  $ DrugName         : Factor w/ 7161 levels "8-HOUR BAYER",..: 4773 6039 3547 3547 3547 3547 3547 3546 2796 2796 ...
#  $ ActiveIngredient : Factor w/ 2735 levels "ABACAVIR SULFATE",..: 1372 2446 1305 1305 1305 1305 1305 1305 1305 1305 ...
#  $ ReferenceStandard: int  0 0 0 0 0 0 0 0 0 0 ...
df1 <- read.table(unz(temp, "Products.txt"), sep="\t", quote = "\"", fill = TRUE, 
                  comment.char = "", header= TRUE) 
df2 <- read.delim(unz(temp, "Products.txt"), sep="\t", header= TRUE) 

all.equal(df1, df2)
# [1] TRUE

identical(df1, df2)
# [1] TRUE