In Ruby, File.readlines.each is not faster than File.open.each_line. Why?

ruby file io

I'm just analyzing my IIS logs (bonus: I happen to know that IIS logs are ASCII-encoded, errrr..)

Here is my Ruby code:

1. readlines

Dir.glob("*.log").each do |filename|
  File.readlines(filename,:encoding => "ASCII").each do |line|
    #comment line
    if line[0] == '#'
      next
    else
      line_content = line.downcase
      #just care about first one
      matched_keyword = keywords.select { |e| line_content.include? e }[0]
      total_count += 1 if extensions.any? { |e| line_content.include? e }
      hit_count[matched_keyword] += 1 unless matched_keyword.nil?
    end
  end
end

2. open

Dir.glob("*.log").each do |filename|
  File.open(filename,:encoding => "ASCII").each_line do |line|
    #comment line
    if line[0] == '#'
      next
    else
      line_content = line.downcase
      #just care about first one
      matched_keyword = keywords.select { |e| line_content.include? e }[0]
      total_count += 1 if extensions.any? { |e| line_content.include? e }
      hit_count[matched_keyword] += 1 unless matched_keyword.nil?
    end
  end
end

Contrary to what I expected, why is open always a little faster??

I tested this several times on Win7 with Ruby 1.9.3.

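Both readlines and open.each_line read the file only once. Ruby buffers IO objects, so each disk read fetches a whole block of data (for example, 64KB; the exact buffer size depends on the implementation) to minimize the cost of disk access. The disk-read step should take about the same time in both cases.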
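When you call readlines, Ruby constructs an empty array [], then repeatedly reads one line of the file and pushes it onto the array. At the end it returns the array containing every line of the file.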
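When you call each_line, Ruby reads one line of the file and yields it to your block. Once your block has finished with that line, Ruby reads the next one. It repeats this until the file has no more content.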
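The two behaviours described above can be sketched in plain Ruby. This is only an illustration of the idea, not Ruby's actual C implementation: the method names readlines_sketch and each_line_sketch and the sample log text are made up for this example.

```ruby
require "stringio"

# Roughly what readlines does: collect every line into an array, then return it.
# (Hypothetical helper for illustration; the real method is written in C.)
def readlines_sketch(io)
  lines = []
  while (line = io.gets)
    lines << line
  end
  lines
end

# Roughly what each_line does: hand each line to the block and keep nothing.
# (Hypothetical helper for illustration.)
def each_line_sketch(io)
  while (line = io.gets)
    yield line
  end
end

# Made-up sample data standing in for a log file.
sample = "#Fields: date time\nGET /index.html\nGET /logo.png\n"

all_lines = readlines_sketch(StringIO.new(sample))
puts all_lines.length

streamed = 0
each_line_sketch(StringIO.new(sample)) { |_line| streamed += 1 }
puts streamed
```

Both sketches visit the same lines; the only difference is that the first one keeps them all in memory until the end.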
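The difference between the two methods is that readlines has to append each line to an array. When the file is large, Ruby may have to copy the underlying (C-level) array one or more times in order to enlarge it.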
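One rough way to watch an array's backing storage grow is to record its reported memory size while pushing elements. The sketch below assumes CRuby, where ObjectSpace.memsize_of from the objspace extension returns an implementation-specific hint rather than an exact figure; the helper name growth_steps is invented for this example.

```ruby
require "objspace"

# Push `count` strings onto an array and record each point at which the
# array's reported memory size changes (hypothetical helper for illustration).
def growth_steps(count)
  lines = []
  last = ObjectSpace.memsize_of(lines)
  steps = []
  count.times do |i|
    lines << "line #{i}"
    size = ObjectSpace.memsize_of(lines)
    if size != last
      steps << [lines.length, size]
      last = size
    end
  end
  steps
end

growth_steps(10_000).each do |length, bytes|
  puts "after #{length} pushes: #{bytes} bytes"
end
```

Each printed step is a point where the reported size changed, which is where a reallocation (and possibly a copy) may have taken place. The exact numbers vary between Ruby versions and platforms.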
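Digging into the source code, readlines is implemented by io_s_readlines, which calls rb_io_readlines. rb_io_readlines calls rb_io_getline_1 to fetch each line and rb_ary_push to push the result onto the array it returns.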
each_line also calls rb_io_getline_1 to fetch each line, just like readlines, but it then hands the line to your block with rb_yield.
So with each_line there is no need to store the lines in a growing array, and therefore no array resizing or copying.
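To check the difference on your own machine, a small benchmark along these lines may help. It is a sketch under several assumptions: the log lines are invented sample data, the helper names are made up, and the streaming side uses File.foreach instead of the File.open(...).each_line call from the question, because File.foreach also streams line by line and closes the file when it is done.

```ruby
require "benchmark"
require "tempfile"

# Count non-comment lines by loading the whole file into an array first.
def count_with_readlines(path)
  File.readlines(path).count { |line| !line.start_with?("#") }
end

# Count non-comment lines by streaming; File.foreach closes the file itself.
def count_with_foreach(path)
  count = 0
  File.foreach(path) { |line| count += 1 unless line.start_with?("#") }
  count
end

Tempfile.create("sample_log") do |file|
  # Invented sample content, not a real IIS log.
  file.puts "#Fields: date time cs-uri-stem"
  200_000.times { |i| file.puts "2013-01-01 00:00:00 /page#{i}.aspx" }
  file.flush

  Benchmark.bm(10) do |bm|
    bm.report("readlines") { count_with_readlines(file.path) }
    bm.report("foreach")   { count_with_foreach(file.path) }
  end
end
```

Timings depend on the Ruby version, the file size, and the operating system's file cache, so run it several times before drawing conclusions.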