Iconv::IllegalSequence in file names saved by wget (Ruby 1.9.2)

Tags: ruby, utf-8, find, wget, iconv

I am using wget with ordinary flags to save files to an ext2 partition. Retrieving their names sometimes fails:

s = `find "#{@dir}" -type f -printf "%T@\t::::\t%s\t::::\t%p\n" |sort`
s.each_line {|l|
   file_name = l.chomp.split("\t::::\t")[2] #=> 
   # ...66:in `split': invalid byte sequence in UTF-8 (ArgumentError)
}
Testing:

l.encoding #=> UTF-8
l.valid_encoding? #=> false
l.inspect #=> "...St. Paul\xE2%80%99s Cathedral..."
Iconv.conv('utf-8', 'utf-8', l) #=> 
# ...77:in `conv': "\xE2%80%99s Cathedr"... (Iconv::IllegalSequence)
How can I get the file name and delete the file?

I forgot to mention that in bash the file looks like this:

index.php?attTag=St. Paul?%80%99s Cathedral

Pasting this string back into ls gives "No such file or directory".

Before running the conversion, you could try CGI.unescape:

a = "...St. Paul\xE2%80%99s Cathedral..."
puts a

require 'cgi'
b = CGI.unescape a
puts b

require 'iconv'
c = Iconv.conv('UTF-8//TRANSLIT', 'UTF-8', b) # may not even be necessary
puts c
On my ruby-1.9.2-p180 this outputs:

...St. Paul?%80%99s Cathedral...
...St. Paul’s Cathedral...
...St. Paul’s Cathedral...
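
On newer Ruby versions Iconv is no longer part of the standard library (it was removed in Ruby 2.0). A similar cleanup step can be sketched with String#scrub (Ruby 2.1 and later). This is only an illustration and not part of the original answer. The sample string is the one used above, and the '?' replacement character is an arbitrary choice.

```ruby
# Same sample string as above: a raw \xE2 byte followed by the literal text "%80%99".
a = "...St. Paul\xE2%80%99s Cathedral..."

# String#scrub returns a copy in which every invalid byte sequence is replaced
# by the given string; the original string is left untouched.
cleaned = a.scrub('?')

puts a.valid_encoding?        # the stray \xE2 byte makes the original invalid
puts cleaned.valid_encoding?  # the scrubbed copy is valid UTF-8
puts cleaned
```

Scrubbing discards the offending byte, so it suits display or logging. It does not help with deleting the file, because the scrubbed string no longer matches the name on disk.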

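I had to call a.force_encoding('us-ascii') first, and then it worked. Thanks.
Sorry, I was in too much of a hurry: that is not how the file name looks in bash. I have added this to my question.
The cause may be an intentional bug in wget.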
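
The force_encoding step mentioned in the comment can be sketched end to end as follows. This is a minimal sketch under assumptions, not the asker's actual code: raw_file_name is a hypothetical helper, the timestamp and size in the sample line are made up, and String#b (Ruby 2.0 and later) stands in for force_encoding because it returns a binary-tagged copy instead of modifying the string.

```ruby
require 'cgi'

# Hypothetical helper (not from the original post): extract the path field from
# one line of the `find -printf` output without tripping over invalid UTF-8.
def raw_file_name(line)
  # String#b returns a copy tagged ASCII-8BIT, in which no byte sequence is
  # considered invalid, so chomp and split operate on plain bytes.
  line.b.chomp.split("\t::::\t")[2]
end

# Sample line modeled on the question: a raw \xE2 byte followed by the literal
# characters "%80%99". The timestamp and size fields are invented.
line = "1300000000.0\t::::\t1234\t::::\t./index.php?attTag=St. Paul\xE2%80%99s Cathedral\n"

name = raw_file_name(line)
puts name.encoding

# Assumption: on Linux, file names are byte strings, so the untouched bytes are
# what should be passed to File.delete. Left commented out in this sketch.
# File.delete(name)

# For display only: decode the %XX escapes and print the result if it is valid.
pretty = CGI.unescape(name)
puts pretty if pretty.valid_encoding?
```

Keeping the raw name separate from the decoded one matters here: the decoded string is readable, but only the original bytes identify the file on disk.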