Importing LISP data into RapidMiner (CSV, …)


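I have data in LISP format that I need to process in RapidMiner. I am new to both LISP and RapidMiner. RapidMiner does not accept LISP (I guess because it is a programming language), so I probably need to convert the LISP format to CSV or something similar. A small sample of the data: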

(def-instance Adelphi
   (state newyork)
   (control private)
   (no-of-students thous:5-10)
   ...)
(def-instance Arizona-State
   (state arizona)
   (control state)
   (no-of-students thous:20+)
   ...)
(def-instance Boston-College
   (state massachusetts)
   (location suburban)
   (control private:roman-catholic)
   (no-of-students thous:5-10)
   ...)

I would be grateful for any suggestions.

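You can take advantage of the fact that Lisp's parser is available to Lisp users. One problem with this data is that some values contain a colon, which Common Lisp uses as the package name separator. I wrote some Common Lisp code to solve your problem, but I had to work around that issue by defining the appropriate packages.

Here is the code. It must of course be extended to cover everything you left out of the example in your question (following the same pattern already used in it):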

(defpackage #:thous
  (:export #:5-10 #:20+))
(defpackage #:private
  (:export #:roman-catholic))

(defstruct (college (:conc-name nil))
  (name "")
  (state "")
  (location "")
  (control "")
  (no-of-students ""))

(defun data->college (name data)
  (let ((college (make-college :name (write-to-string name :case :capitalize))))
    (loop for (key value) in data
       for string = (remove #\| (write-to-string value :case :downcase))
       do (case key
            (state (setf (state college) string))
            (location (setf (location college) string))
            (control (setf (control college) string))
            (no-of-students (setf (no-of-students college) string))))
    college))

(defun read-data (stream)
  (loop for (def-instance name . data) = (read stream nil nil)
     while def-instance
     collect (data->college name data)))

(defun print-college-as-csv (college stream)
  (format stream
          "~a~{,~a~}~%"
          (name college)
          (list (state college)
                (location college)
                (control college)
                (no-of-students college))))

(defun data->csv (in out)
  (let ((header (make-college :name "College"
                              :state "state"
                              :location "location"
                              :control "control"
                              :no-of-students "no-of-students")))
    (print-college-as-csv header out)
    (dolist (college (read-data in))
      (print-college-as-csv college out))))

(defun data-file-to-csv (input-file output-file)
  (with-open-file (in input-file)
    (with-open-file (out output-file
                         :direction :output
                         :if-does-not-exist :create
                         :if-exists :supersede)
      (data->csv in out))))

The main function is data-file-to-csv. After loading this code, you can call it in a Common Lisp REPL with

(data-file-to-csv "path-to-input-file" "path-to-output-file")

EDIT: some additional thoughts

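Rather than adding package definitions for every value that contains a colon, it would actually be easier to run a regex search-and-replace over the data to put double quotes (") around all values. Lisp would then read them as strings right away. In that case, the line

for string = (remove #\| (write-to-string value :case :downcase))

could be removed, and string replaced by value in every clause of the case statement.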
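The quoting step described above could be sketched as follows. This is only an illustration in Python (the answer does not name a tool for the search-and-replace); the function name quote_values and the exact pattern are assumptions, and the pattern assumes every attribute is a single "(key value)" pair with a bare token as its value.

```python
import re

# One "(key value)" pair whose value is a bare, unquoted token,
# e.g. "(state newyork)" or "(no-of-students thous:5-10)".
# Assumption: values never contain whitespace, parentheses, or quotes.
PAIR = re.compile(r'\(([^\s()"]+)\s+([^\s()"]+)\)')


def quote_values(text):
    """Wrap every attribute value in double quotes (hypothetical helper).

    The "(def-instance Name" opener is left alone because the name is not
    immediately followed by a closing parenthesis. A record with no
    attributes at all, "(def-instance Name)", would be quoted by mistake.
    """
    return PAIR.sub(r'(\1 "\2")', text)


if __name__ == "__main__":
    sample = "(def-instance Adelphi\n   (state newyork)\n   (no-of-students thous:5-10))"
    print(quote_values(sample))
```

Already quoted values are skipped, so running the substitution twice does not add a second pair of quotes.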

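Because the data is so regular, there is actually no need to parse the Lisp definitions properly at all. Instead, you can simply extract the data with regular expressions. A language that is particularly well suited to regex-based text file conversion, such as AWK or Perl, should be a good fit for the job.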
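A minimal sketch of that regex-based extraction, written in Python rather than the AWK or Perl suggested above. The function names, the "College" header for the instance name, and the patterns are all assumptions for illustration; the sketch assumes each record starts with "(def-instance NAME" and consists only of flat "(key value)" pairs.

```python
import csv
import io
import re

# Start of a record; captures the instance name.
INSTANCE = re.compile(r"\(def-instance\s+([^\s()]+)")
# One flat "(key value)" attribute pair.
PAIR = re.compile(r"\(([^\s()]+)\s+([^\s()]+)\)")


def lisp_to_rows(text):
    """Return one dict per def-instance form (hypothetical helper)."""
    rows = []
    starts = list(INSTANCE.finditer(text))
    for i, match in enumerate(starts):
        # A record's body runs up to the next "(def-instance" or end of text.
        end = starts[i + 1].start() if i + 1 < len(starts) else len(text)
        row = {"College": match.group(1)}
        for key, value in PAIR.findall(text[match.end():end]):
            row[key] = value
        rows.append(row)
    return rows


def rows_to_csv(rows):
    """Render rows as CSV text; attributes missing from a record stay empty."""
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)  # keep first-seen column order
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

Collecting the column names from the data itself means attributes that only some records have (such as location) do not need to be listed by hand, and the csv module takes care of quoting should a value ever contain a comma.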
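Is the software that normally works with this Lisp data available to you? If you can use its definition of def-instance, you would save yourself the trouble of writing def-instance or parsing these calls yourself.

I received the data in this format and I don't know where it came from. My task is to convert this Lisp scheme into some tabular data that can be processed with RapidMiner. I really don't want to write a parser/converter myself :)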