Ruby performance of File.read
Given the following script:
require 'rubygems'
require 'open-uri'
require 'benchmark'

response = open('http://gdata.youtube.com/feeds/api/videos?q=skateboarding+dog')

outside = Benchmark.measure do
  response_body = response.read
  10000.times do
    response_body.scan(/dog/)
  end
end

inside = Benchmark.measure do
  10000.times do
    response.read.scan(/dog/)
  end
end

puts [outside, inside].map(&:utime).inspect
I get the following results:
[1.25, 0.06000000000000005]
Why is performance 20 times better when the file is read on every iteration?
In case my system information matters:
ruby 2.0.0p247 (2013-06-27 revision 41674) [x86_64-darwin12.4.0]
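This is because after the first test, response has already been read to the end. In each iteration of the second test, read has nothing left to consume, so it returns almost immediately, and what it returns is just an empty string. The scan therefore finishes quickly as well.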
irb> response.read.scan(/dog/)
=> ["dog", "dog", "dog", "dog", "dog", "dog", "dog", "dog", "dog", "dog", "dog", "dog", "dog", "dog", "dog", "dog", "dog", "dog", "dog", "dog", "dog", "dog", "dog", "dog", "dog", "dog", "dog", "dog", "dog", "dog", "dog", "dog", "dog", "dog", "dog", "dog", "dog", "dog", "dog", "dog", "dog", "dog", "dog"]
irb> response.read.scan(/dog/)
=> []
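The same behavior can be reproduced without a network request. The following is a minimal sketch, assuming a StringIO can stand in for the IO-like object that open returns; the body string is made up for illustration and is not the real feed.

```ruby
require 'stringio'

# Hypothetical stand-in for the HTTP response; the content is made up.
response = StringIO.new('skateboarding dog, surfing dog, sleeping dog')

first  = response.read   # consumes the whole stream
second = response.read   # the stream is already at EOF, so nothing is left

first_matches  = first.scan(/dog/)
second_matches = second.scan(/dog/)

p first_matches   # => ["dog", "dog", "dog"]
p second_matches  # => []
```

In the original benchmark, every call to response.read inside the second block behaves like the second read here, so the loop is effectively scanning an empty string 10000 times.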
open('http://gdata.youtube.com/feeds/api/videos?q=skateboarding+dog').read.scan(/dog/) => ["dog", "dog", "dog", ...]
@Kaleidoscope The key point is that read advances the read pointer of response (which acts as a stream). When you call read a second time, it returns an empty string.
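To make the read pointer visible, and to compare like with like, the stream can be rewound before each read. This is only a sketch: it assumes a StringIO stand-in with made-up content and a smaller iteration count, and it makes no claim about the timings you would see against the real feed.

```ruby
require 'stringio'
require 'benchmark'

# Hypothetical stand-in for the downloaded body; the content is made up.
response = StringIO.new('skateboarding dog ' * 100)

response.read
pos_after_read = response.pos   # the pointer now sits at the end of the data
at_eof = response.eof?

response.rewind                 # move the pointer back to the start
pos_after_rewind = response.pos

# Rewinding before each read makes the "inside" case do real work every time.
inside = Benchmark.measure do
  1000.times do
    response.rewind
    response.read.scan(/dog/)
  end
end

response.rewind
match_count = response.read.scan(/dog/).size

puts "pos after read: #{pos_after_read}, eof: #{at_eof}"
puts "pos after rewind: #{pos_after_rewind}, matches: #{match_count}"
puts "inside utime with rewind: #{inside.utime}"
```

With the rewind in place, each iteration reads and scans the full body, so the "inside" measurement no longer benefits from an exhausted stream.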