Special characters in the file path in haven::read_sav()


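This happens with the haven package (1.1.1) whenever the file path contains any kind of special character (including in the file name alone).

Assuming this is a genuine problem, I am looking for some kind of neat hack/workaround to get around it.

One (non-ideal) example would be to have R put a copy of the file in a friendlier path, give it a "nicer" file name, and then load it with haven. For example: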

setwd("c:/temp")
fn <- "randóóm.sav"
file.copy(paste0("./äglæpath/", fn), fn)
file.rename(fn, gsub("[^-\\./a-zA-Z0-9[:space:]]", "", fn))
# now apply read_sav() to the copy
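
The copy-then-read idea above can be wrapped in a small helper. This is only a sketch: `read_via_ascii_copy` is a hypothetical name, not part of haven, and it assumes that the session's temporary directory (`tempdir()`) itself contains only plain ASCII characters, which may not hold if, for example, the Windows user name contains special characters.

```r
# Hypothetical helper (not part of haven): copy the file to a temporary
# file with an ASCII-only name, read the copy, then delete the copy.
read_via_ascii_copy <- function(path, reader = haven::read_sav, ...) {
  stopifnot(file.exists(path))

  # Keep the original extension so the copy is still recognisable.
  ext <- tools::file_ext(path)
  tmp <- tempfile(pattern = "haven_copy_",
                  fileext = if (nzchar(ext)) paste0(".", ext) else "")
  on.exit(unlink(tmp), add = TRUE)  # clean up even if the reader fails

  if (!file.copy(path, tmp, overwrite = TRUE)) {
    stop("Could not copy ", path, " to ", tmp)
  }
  reader(tmp, ...)
}

# Usage (requires haven and the file from the question):
# dat <- read_via_ascii_copy("./äglæpath/randóóm.sav")
```

Unlike the rename approach, this leaves the original file and its name untouched, at the cost of one extra copy per read.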

Unfortunately, I was able to reproduce this issue on Windows 10 with both the standard version of haven and the devtools version of haven. This appears to be a known haven bug.

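Suggested workaround: move the file to a directory where neither the file path nor the file name contains German umlauts. So your workaround works as described: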

> file.path(dataFilepath, dtaFilename)
[1] "äglæpath/randóóm.dta"

> dtaFilename <- gsub("[^-\\./a-zA-Z0-9[:space:]]", "", dtaFilename)
> bdatFilename <- gsub("[^-\\./a-zA-Z0-9[:space:]]", "", bdatFilename)
> savFilename <- gsub("[^-\\./a-zA-Z0-9[:space:]]", "", savFilename)
> dataFilepath <- gsub("[^-\\./a-zA-Z0-9[:space:]]", "", dataFilepath)

> file.path(dataFilepath, dtaFilename)
[1] "glpath/randm.dta"

> # Stata
> read_dta(dtaDest)
# A tibble: 150 x 5
   sepallength sepalwidth petallength petalwidth species
         <dbl>      <dbl>       <dbl>      <dbl> <chr>  
 1        5.10       3.5         1.40      0.200 setosa 
 2        4.90       3           1.40      0.200 setosa 
 3        4.70       3.20        1.30      0.200 setosa 
 4        4.60       3.10        1.5       0.200 setosa 
 5        5          3.60        1.40      0.200 setosa 
 6        5.40       3.90        1.70      0.400 setosa 
 7        4.60       3.40        1.40      0.300 setosa 
 8        5          3.40        1.5       0.200 setosa 
 9        4.40       2.90        1.40      0.200 setosa 
10        4.90       3.10        1.5       0.100 setosa 
# ... with 140 more rows
> 
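
A short sketch of the renaming step shown in this console session, using the same regular expression. `sanitize_path` is a hypothetical helper name, and the paths are written with `\u` escapes only so that the snippet does not depend on the encoding of the script file. Note that the files must actually exist under the cleaned names (copied or downloaded there), and the destination path has to be rebuilt from the cleaned parts before it is passed to `read_dta()`.

```r
# Hypothetical helper: drop every character that is not a letter, digit,
# whitespace, hyphen, dot or slash (same pattern as in the session above).
sanitize_path <- function(x) {
  gsub("[^-\\./a-zA-Z0-9[:space:]]", "", x)
}

dataFilepath <- "\u00e4gl\u00e6path"      # "äglæpath"
dtaFilename  <- "rand\u00f3\u00f3m.dta"   # "randóóm.dta"

# Rebuild the destination from the cleaned components.
dtaDest <- file.path(sanitize_path(dataFilepath), sanitize_path(dtaFilename))
print(dtaDest)  # the session above shows "glpath/randm.dta" for these inputs

# Caveat: different original names can collapse to the same cleaned name.
cleaned <- sanitize_path(c("rand\u00f3m.sav", "rand\u00e9m.sav"))
print(anyDuplicated(cleaned) > 0)
```

Because stripping characters can make two distinct files collide, a check like the last two lines is worth running before renaming a whole directory.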

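Comments:

I cannot reproduce your problem. I saved a file as äglæpath.sav and read_sav read it without error.

@CPak Interesting. I tried the same thing but got: Failed to parse c:/temp/glÃalpath.sav: Unable to open file.

What does Sys.getlocale() say?

@CJYetman LC_COLLATE=Icelandic_Icelan
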
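
Since the comments turn on what `Sys.getlocale()` reports, here is a minimal sketch of checks that may help when comparing two machines. It uses only base R; the path is a stand-in for the file from the question, and whether converting the encoding changes haven's behaviour is not something this thread establishes.

```r
# Stand-in for the real file; \u escapes give a string marked as UTF-8.
path <- "\u00e4gl\u00e6path/rand\u00f3\u00f3m.sav"

print(Sys.getlocale("LC_CTYPE"))  # character-type locale of the session
print(l10n_info())                # is the session UTF-8, Latin-1, multibyte?
print(Encoding(path))             # declared encoding of the string itself

# The same path in the session's native encoding and in UTF-8.
path_native <- enc2native(path)
path_utf8   <- enc2utf8(path)

# Does base R find the file under either representation?
print(c(native = file.exists(path_native), utf8 = file.exists(path_utf8)))
```

If the two sessions report different `LC_CTYPE` values, that difference is a plausible (but unconfirmed) reason why one person can read the file and the other cannot.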
require(haven)
require(stringi)

dtaURL  <- "https://github.com/tidyverse/haven/blob/master/inst/examples/iris.dta?raw=true"
bdatURL <- "https://github.com/tidyverse/haven/blob/master/inst/examples/iris.sas7bdat?raw=true"
savURL  <- "https://github.com/tidyverse/haven/blob/master/inst/examples/iris.sav?raw=true"

dtaFilename   <- "randóóm.dta"
bdatFilename <- "randóóm.bdata"
savFilename   <- "randóóm.sav"

dataFilepath      <- "äglæpath"

if (!dir.exists(dataFilepath)) {
  dir.create(file.path(dataFilepath), showWarnings = TRUE)
}

dtaDest = file.path(dataFilepath, dtaFilename)
bdatDest = file.path(dataFilepath, bdatFilename )
savDest = file.path(dataFilepath, savFilename )

download.file(dtaURL, destfile = dtaDest, method = "wget", mode = "wb")
download.file(bdatURL, destfile = bdatDest, method = "wget", mode = "wb")
download.file(savURL, destfile = savDest, method = "wget", mode = "wb")


# Stata
read_dta(dtaDest)

# SAS
read_sas(bdatDest)

# SPSS
read_sav(savDest)

Console output:

> require(haven)
> require(stringi)
> dtaURL  <- "https://github.com/tidyverse/haven/blob/master/inst/examples/iris.dta?raw=true"
> bdatURL <- "https://github.com/tidyverse/haven/blob/master/inst/examples/iris.sas7bdat?raw=true"
> savURL  <- "https://github.com/tidyverse/haven/blob/master/inst/examples/iris.sav?raw=true"
> dtaFilename   <- "randóóm.dta"
> bdatFilename <- "randóóm.bdata"
> savFilename   <- "randóóm.sav"
> dataFilepath      <- "äglæpath"
> if (!dir.exists(dataFilepath)) {
+   dir.create(file.path(dataFilepath), showWarnings = TRUE)
+ }
> dtaDest = file.path(dataFilepath, dtaFilename)
> bdatDest = file.path(dataFilepath, bdatFilename )
> savDest = file.path(dataFilepath, savFilename )
> download.file(dtaURL, destfile = dtaDest, method = "wget", mode = "wb")
--2018-05-29 15:56:59--  https://github.com/tidyverse/haven/blob/master/inst/examples/iris.dta?raw=true
Resolving github.com (github.com)... 192.30.255.113, 192.30.255.112
Connecting to github.com (github.com)|192.30.255.113|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://github.com/tidyverse/haven/raw/master/inst/examples/iris.dta [following]
--2018-05-29 15:56:59--  https://github.com/tidyverse/haven/raw/master/inst/examples/iris.dta
Reusing existing connection to github.com:443.
HTTP request sent, awaiting response... 302 Found
Location: https://raw.githubusercontent.com/tidyverse/haven/master/inst/examples/iris.dta [following]
--2018-05-29 15:56:59--  https://raw.githubusercontent.com/tidyverse/haven/master/inst/examples/iris.dta
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.52.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.52.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 8213 (8.0K) [application/octet-stream]
Saving to: '\344gl\346path/rand\363\363m.dta'

     0K ........                                              100% 1.56M=0.005s

2018-05-29 15:56:59 (1.56 MB/s) - '\344gl\346path/rand\363\363m.dta' saved [8213/8213]

> download.file(bdatURL, destfile = bdatDest, method = "wget", mode = "wb")
--2018-05-29 15:56:59--  https://github.com/tidyverse/haven/blob/master/inst/examples/iris.sas7bdat?raw=true
Resolving github.com (github.com)... 192.30.255.113, 192.30.255.112
Connecting to github.com (github.com)|192.30.255.113|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://github.com/tidyverse/haven/raw/master/inst/examples/iris.sas7bdat [following]
--2018-05-29 15:56:59--  https://github.com/tidyverse/haven/raw/master/inst/examples/iris.sas7bdat
Reusing existing connection to github.com:443.
HTTP request sent, awaiting response... 302 Found
Location: https://raw.githubusercontent.com/tidyverse/haven/master/inst/examples/iris.sas7bdat [following]
--2018-05-29 15:56:59--  https://raw.githubusercontent.com/tidyverse/haven/master/inst/examples/iris.sas7bdat
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.52.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.52.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 131072 (128K) [application/octet-stream]
Saving to: '\344gl\346path/rand\363\363m.bdata'

     0K .......... .......... .......... .......... .......... 39% 4.05M 0s
    50K .......... .......... .......... .......... .......... 78% 19.7M 0s
   100K .......... .......... ........                        100% 19.3M=0.02s

2018-05-29 15:57:00 (7.83 MB/s) - '\344gl\346path/rand\363\363m.bdata' saved [131072/131072]

> download.file(savURL, destfile = savDest, method = "wget", mode = "wb")
--2018-05-29 15:57:01--  https://github.com/tidyverse/haven/blob/master/inst/examples/iris.sav?raw=true
Resolving github.com (github.com)... 192.30.255.113, 192.30.255.112
Connecting to github.com (github.com)|192.30.255.113|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://github.com/tidyverse/haven/raw/master/inst/examples/iris.sav [following]
--2018-05-29 15:57:01--  https://github.com/tidyverse/haven/raw/master/inst/examples/iris.sav
Reusing existing connection to github.com:443.
HTTP request sent, awaiting response... 302 Found
Location: https://raw.githubusercontent.com/tidyverse/haven/master/inst/examples/iris.sav [following]
--2018-05-29 15:57:01--  https://raw.githubusercontent.com/tidyverse/haven/master/inst/examples/iris.sav
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.52.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.52.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 6690 (6.5K) [application/octet-stream]
Saving to: '\344gl\346path/rand\363\363m.sav'

     0K ......                                                100% 3.09M=0.002s

2018-05-29 15:57:01 (3.09 MB/s) - '\344gl\346path/rand\363\363m.sav' saved [6690/6690]

> # Stata
> read_dta(dtaDest)
Error in df_parse_dta_file(spec, encoding) : 
  Failed to parse <...>/äglæpath/randóóm.dta: Unable to open file.
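
To narrow down where a failure like the one above comes from, it may help to check whether base R can see the file at all before blaming the reader. The helper below is a hypothetical sketch (`check_read` is not a haven function); the reader is passed in as an argument so the same check can be used with `read_dta`, `read_sas` and `read_sav`.

```r
# Hypothetical diagnostic helper: report whether base R sees the file and
# whether the supplied reader function can load it.
check_read <- function(path, reader) {
  result <- list(
    exists = file.exists(path),  # can base R find the file?
    size   = file.size(path),    # size in bytes, NA if not found
    ok     = FALSE,
    error  = NA_character_
  )
  # Capture the reader's error instead of stopping the script.
  outcome <- tryCatch(reader(path), error = function(e) e)
  if (inherits(outcome, "error")) {
    result$error <- conditionMessage(outcome)
  } else {
    result$ok <- TRUE
  }
  result
}

# Usage (requires haven and the downloaded files from the script above):
# str(check_read(dtaDest, haven::read_dta))
```

If `exists` is TRUE and `size` matches the download but the reader still reports "Unable to open file", the path is reaching R correctly and the problem more likely lies in how the path is handed on to the underlying reading library.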