Size explosion: file vs. string


I have a 261MB text file (xdebug output), and when I read it in, it takes up an additional 2GB of space.

(defun stream->string (tmp-stream)
  (do ((line (read-line tmp-stream nil nil)
             (read-line tmp-stream nil nil))
       (lines nil))
      ((not line) (progn 
                    (FORMAT T "COLLECTED~%")
                    (FORMAT nil "~{~a~^~%~}" (reverse lines))))
    (push line lines)))


(defparameter *test* nil)

(progn
  (setf *test* nil)
  (sb-ext:gc :full t)
  (room)
  (format t "----~%")
  (with-open-file (stream "/home/.../debugFiles/xdebug_1.xt")
    (room)
    (format t "----~%")
    (setf *test* (stream->string stream))
    (sb-ext:gc :full t)
    (room)
    (format t "----~%"))
  (sb-ext:gc :full t)
  (room))

Output:

Dynamic space usage is:   84,598,224 bytes.
Read-only space usage is:      5,856 bytes.
Static space usage is:         4,160 bytes.
Control stack usage is:        8,408 bytes.
Binding stack usage is:        1,072 bytes.
Control and binding stack usage is for the current thread only.
Garbage collection is currently enabled.

Breakdown for dynamic space:
  20,841,808 bytes for    20,691 code objects.
  15,989,600 bytes for   999,350 cons objects.
  14,532,960 bytes for   118,880 simple-vector objects.
  13,951,792 bytes for   168,301 instance objects.
   5,994,864 bytes for    41,648 simple-character-string objects.
  13,287,200 bytes for   215,901 other objects.
  84,598,224 bytes for 1,564,771 dynamic objects (space total.)
----
Dynamic space usage is:   85,346,752 bytes.
Read-only space usage is:      5,856 bytes.
Static space usage is:         4,160 bytes.
Control stack usage is:        8,536 bytes.
Binding stack usage is:        1,072 bytes.
Control and binding stack usage is for the current thread only.
Garbage collection is currently enabled.

Breakdown for dynamic space:
  20,842,928 bytes for    20,692 code objects.
  16,125,008 bytes for 1,007,813 cons objects.
  14,698,784 bytes for   120,834 simple-vector objects.
  14,239,440 bytes for   171,411 instance objects.
   6,014,144 bytes for    41,776 simple-character-string objects.
  13,426,448 bytes for   219,723 other objects.
  85,346,752 bytes for 1,582,249 dynamic objects (space total.)
----
COLLECTED
Dynamic space usage is:   2,557,851,296 bytes.
Read-only space usage is:      5,856 bytes.
Static space usage is:         4,160 bytes.
Control stack usage is:        8,536 bytes.
Binding stack usage is:        1,072 bytes.
Control and binding stack usage is for the current thread only.
Garbage collection is currently enabled.

Breakdown for dynamic space:
  2,466,544,480 bytes for   817,255 simple-character-string objects.
  91,306,816 bytes for 2,303,370 other objects.
  2,557,851,296 bytes for 3,120,625 dynamic objects (space total.)
----
Dynamic space usage is:   1,131,069,056 bytes.
Read-only space usage is:      5,856 bytes.
Static space usage is:         4,160 bytes.
Control stack usage is:        8,360 bytes.
Binding stack usage is:        1,072 bytes.
Control and binding stack usage is for the current thread only.
Garbage collection is currently enabled.

Breakdown for dynamic space:
  1,053,183,424 bytes for    41,547 simple-character-string objects.
  77,885,632 bytes for 1,510,521 other objects.
  1,131,069,056 bytes for 1,552,068 dynamic objects (space total.)
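I could understand a threefold increase in size (though even that would surprise me):

  • the collection of lines
  • the string object created by FORMAT
  • the copy stored in *test*

However, growth by a factor of 10 is far too much.

What is going on here?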
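
As a rough check on the figures above, the following sketch estimates how many characters the surviving string holds. It assumes 4 bytes per character, which is how SBCL stores a (simple-array character (*)) in a Unicode-enabled build; the function name and the two parameters are hypothetical, and the byte counts are copied from the first and last (room) reports.

```lisp
;; simple-character-string bytes reported by (room), copied from the output above.
(defparameter *strings-before* 5994864)     ; before the file is read
(defparameter *strings-after*  1053183424)  ; after the final full GC

;; Assumption: each character occupies BYTES-PER-CHAR bytes (4 for SBCL
;; character strings); object headers are ignored as negligible here.
(defun estimated-characters (bytes-after bytes-before &optional (bytes-per-char 4))
  (floor (- bytes-after bytes-before) bytes-per-char))

(estimated-characters *strings-after* *strings-before*)
;; => 261797140
```

That is roughly 261.8 million characters, which is about what a 261MB file of mostly single-byte characters would contain. Under that assumption, the 1GB that remains after the last GC is consistent with one copy of the file at four bytes per character.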

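Is your code compiled? Also, it would be better to run the GC and call (room) after leaving with-open-file. It is possible that the stream object is holding on to the data.

@Barmar What difference does it make whether it is compiled? In any case, I tried moving the code into a function and compiling it, and the behavior did not change. I also improved my code following your suggestion; please see the updated question.

@RainerJoswig How do I create a simple-base-string, since that seems to be the space-saving version of a string? My current code only produces arrays that store each character in 32 bits, i.e. 4x the size actually needed. (typep *test* 'simple-base-string) yields, as expected, nil.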
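
One possible way to get a simple-base-string is to allocate it explicitly with :element-type 'base-char and fill it with read-sequence. The sketch below uses a hypothetical helper, file->base-string, and rests on two assumptions: every character in the file is a base-char (in SBCL that means plain ASCII), and file-length is a usable upper bound on the character count (SBCL reports bytes for character streams). A non-ASCII character would be expected to signal an error rather than be stored.

```lisp
(defun file->base-string (path)
  "Read the file at PATH into a freshly allocated SIMPLE-BASE-STRING.
Hypothetical helper: assumes every character in the file is a BASE-CHAR."
  (with-open-file (in path :direction :input)
    ;; Allocate one buffer up front instead of collecting lines.
    ;; FILE-LENGTH is treated as an upper bound on the character count.
    (let* ((buffer (make-string (file-length in) :element-type 'base-char))
           (end (read-sequence buffer in)))
      ;; READ-SEQUENCE returns the index of the first element it did not
      ;; fill; trim (by copying) only if the file was shorter than expected.
      (if (= end (length buffer))
          buffer
          (subseq buffer 0 end)))))
```

With this approach (typep (file->base-string path) 'simple-base-string) should be true, and no intermediate list of lines or FORMAT result is built along the way.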