Recursion: Slow bytecode with tail recursion


Inspired by the answer, I used the following code to compare an imperative loop with tail recursion:

let rec nothingfunc i =
  match i with
  | 1000000000 -> 1
  | _ -> nothingfunc (i+1)

let nothingloop1 () =
  let i = ref 0 in
   while !i < 1000000000 do incr i done;
   1

let timeit f v =
  let t1 = Unix.gettimeofday() in
  let _ = f v in
  let t2 =  Unix.gettimeofday() in
    t2 -. t1

let () =
  Printf.printf "recursive function: %g s\n%!" (timeit nothingfunc 0);
  Printf.printf "while loop with ref counter builtin incr: %g s\n%!" (timeit nothingloop1 ());

The question is: what causes the large difference in execution time, 20 seconds versus 12 seconds?
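
For anyone who wants to reproduce the comparison quickly, here is a scaled-down sketch of the same benchmark. The names count_rec, count_loop and time_it are hypothetical, and two things differ from the code above by assumption: the bound is passed as a parameter (so the comparison is against a variable rather than a constant), and Sys.time (processor time) replaces Unix.gettimeofday so that no extra library is needed. The absolute numbers will therefore not match the ones quoted in the question.

```ocaml
(* Tail-recursive counter: same shape as nothingfunc, bound passed in. *)
let rec count_rec limit i =
  if i = limit then 1 else count_rec limit (i + 1)

(* Imperative counter: same shape as nothingloop1, bound passed in. *)
let count_loop limit =
  let i = ref 0 in
  while !i < limit do incr i done;
  1

(* Runs f v and returns its result with the elapsed processor time (seconds). *)
let time_it f v =
  let t1 = Sys.time () in
  let r = f v in
  let t2 = Sys.time () in
  (r, t2 -. t1)

let () =
  let limit = 10_000_000 in  (* assumed bound, much smaller than the original *)
  let (r1, s1) = time_it (count_rec limit) 0 in
  let (r2, s2) = time_it count_loop limit in
  Printf.printf "recursive function: result %d, %g s\n%!" r1 s1;
  Printf.printf "while loop: result %d, %g s\n%!" r2 s2
```

Building the same file once with ocamlc and once with ocamlopt should show whether the gap between the two variants is specific to bytecode.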

Edit, my conclusion:

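A function call (apply in bytecode) involves a stack size check, a possible stack enlargement, and a check for signals. For maximum performance, the native-code compiler will deliver.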
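
The extra per-call work can be pictured with a toy model. This is only an illustration and is not the real OCaml bytecode interpreter: the instr type, the step and run functions, and the two counters are all hypothetical, and the counters merely stand in for the stack size check and the signal check mentioned above.

```ocaml
(* Toy model only: NOT the actual OCaml bytecode interpreter. *)
type instr =
  | Assign  (* loop-style step: update the counter in place *)
  | Apply   (* call-style step: pay bookkeeping, then update the counter *)

(* Hypothetical counters standing in for the extra checks done on a call. *)
let stack_checks = ref 0
let signal_checks = ref 0

(* Executes one step; only Apply pays the extra bookkeeping. *)
let step counter = function
  | Assign -> incr counter
  | Apply ->
    incr stack_checks;   (* stands in for the stack size check *)
    incr signal_checks;  (* stands in for the pending-signal check *)
    incr counter

(* Runs the same instruction n times and returns the final counter value. *)
let run instr n =
  let counter = ref 0 in
  for _i = 1 to n do step counter instr done;
  !counter
```

Both run Assign n and run Apply n reach the same counter value, but only the Apply variant accumulates the bookkeeping, which is the shape of the overhead the conclusion describes.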


(Side note: I am asking this here because it is search-engine friendly.)

Look at the output of
ocamlfind ocamlc -package unix test.ml -dlambda

(nothingloop1/1010 =
     (function param/1022
       (let (i/1011 =v 0)
         (seq (while (< i/1011 100000000) (assign i/1011 (1+ i/1011))) 1)))

(nothingfunc/1008
   (function i/1009
     (if (!= i/1009 100000000) (apply nothingfunc/1008 (+ i/1009 1)) 1)))

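So apparently assign is faster than apply. On a function call, stack overflow and signals appear to be checked, whereas a simple assignment involves no such checks. For details, you have to look at: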

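The opt in ocamlopt stands for optimizing. The bytecode compiler performs fewer optimizations, because that was never its purpose. Some optimization work is still happening, though: for example, on the current version of the compiler (4.03) the difference is about 10% (9.3 vs 8.3 seconds).

OK, found it. This looks like a Forth interpreter.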