REXX: parsing from a CSV file
I'm having trouble parsing CSV from a text file and wonder whether you can help. Here is what I have so far. The CSV file (DATA.txt) looks like the sample below. It always has 15 fields, all separated by commas. Not every field is required, so some fields are filled in and some are empty.
Seattle,Lastname,Firstname,DOB,SEX,etc,etc
Seattle,Lastname,Firstname,DOB,,etc,etc
Portland,Lastname,Firstname,DOB,SEX,,,etc
Portland,Lastname,Firstname,DOB,SEX,etc,etc
Here is my REXX code:
SOURCEFILE = "C:\DATA\DATA.TXT"
IF A=2 THEN DO COUNTER=1 TO LINES(SOURCEFILE)
PARSE VALUE LINEIN(SOURCEFILE) WITH CITY "," LAST_NAME "," FIRST_NAME "," MOM_NAME "," MIDDLE_NAME "," DAD_NAME "," DOB "," etc "," etc "," etc "," etc "," SEX "," etc "," etc
CALL SETCURSOR 4,23
CALL CREATEDATA
END
CREATEDATA:
CALL TYPE CITY
CALL PRESS TAB
CALL TYPE LAST_NAME
CALL PRESS TAB
CALL TYPE DATE(U)
CALL PRESS TAB
CALL TYPE FIRST_NAME
CALL PRESS TAB
CALL PRESS ENTER
RETURN
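I'm not sure whether I should use ARG or VAR when parsing, or whether I wrote the first two lines correctly. I know my CREATEDATA routine works, because it types "CITY" rather than the parsed value. Any help would be appreciated. Thanks, everyone!
Some comments:
1) Lines(SourceFile)
On a Windows system this may have to read the entire file to count the CR-LF sequences. Your Parse Value LineIn(SourceFile) loop then reads the file a second time. The typical Rexx way to do this is: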
Address SYSTEM 'TYPE' SourceFile with output stem Lines.
Do Counter = 1 to Lines.0
Parse var Lines.Counter ...
End
Drop Lines.
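At least, that works as long as the file is not so large that holding it in an array takes too much memory.
2) At the end of the loop you fall through into CreateData, which is why you see "CITY". After the End instruction you need a Return or an Exit.
3) Given #2, it is apparent that the Parse is never executed, because City is uninitialized (the value of an uninitialized variable in Rexx is its own name in uppercase). The loop is conditional on A=2, and A evidently never has that value.
One problem is the IF A=2 in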
IF A=2 THEN DO COUNTER=1 TO LINES(SOURCEFILE)
If A is not 2, the loop is bypassed. I think your program should be:
SOURCEFILE = "C:\DATA\DATA.TXT"
DO COUNTER=1 TO LINES(SOURCEFILE)
PARSE VALUE LINEIN(SOURCEFILE) WITH CITY "," LAST_NAME "," FIRST_NAME "," MOM_NAME "," MIDDLE_NAME "," DAD_NAME "," DOB "," etc "," etc "," etc "," etc "," SEX "," etc "," etc
CALL SETCURSOR 4,23
CALL CREATEDATA
END
RETURN /* prevent the fall through to createdata */
CREATEDATA:
---------------------------
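The two effects described above (an unassigned variable showing up as its own name, and control running past the end of the main code into a label) can be reproduced in a few lines. This is only a sketch: the SHOW routine and the sample value are made up for illustration, and it assumes SIGNAL ON NOVALUE is not in effect.

```rexx
/* Sketch: an unassigned variable evaluates to its own name in uppercase. */
say city                 /* CITY has not been assigned yet */
city = "Seattle"
say city                 /* now the assigned value is used */
call show
exit                     /* without this, control would run into SHOW a second time */

show:                    /* hypothetical routine, for illustration only */
  say "in SHOW, city =" city
  return
```

Removing the EXIT line makes the last message appear twice, which is the same fall-through the question ran into.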
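The parse statement has the following basic format:
PARSE [source] [parsing template]
where [source] is one of:
arg - the arguments of the procedure call
pull - data pulled from the stack
var - data taken from a variable
value - data supplied inline as an expression
So your parse could look like: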
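As a rough illustration of those sources, the sketch below applies a comma template to one sample record through PARSE VAR, PARSE VALUE and PARSE ARG (PARSE PULL is left out so the example does not wait for input). The shortened record, the field names and the SHOW routine are assumptions made for the example, not the asker's real 15-field layout.

```rexx
/* Sketch: one comma template, three different PARSE sources. */
record = "Seattle,Lastname,Firstname,DOB,,etc"   /* made-up sample; 5th field is empty */

/* PARSE VAR: the source is a variable */
parse var record city "," last_name "," first_name "," dob "," sex "," rest
say "VAR:   city=" || city "sex=[" || sex || "]"

/* PARSE VALUE ... WITH: the source is any expression */
parse value record with city "," last_name "," .
say "VALUE: last_name=" || last_name

/* PARSE ARG: the source is the argument passed to a routine */
call show record
exit

show:                    /* hypothetical routine, for illustration only */
  parse arg city "," . "," first_name "," .
  say "ARG:   first_name=" || first_name
  return
```

An empty field between two commas simply yields an empty string for that variable, and a period in the template discards whatever would have been assigned there.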
linein = LINEIN(SOURCEFILE)
PARSE var linein CITY "," LAST_NAME "," FIRST_NAME "," MOM_NAME "," MIDDLE_NAME "," DAD_NAME "," DOB "," etc "," etc "," etc "," etc "," SEX "," etc "," etc
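or you can pass the line to CREATEDATA and parse it there with PARSE ARG, as in the CALL CREATEDATA LINEIN(SOURCEFILE) listing further down.
Finally, as Ross says, you should try to save the result of LINES(sourcefile), because getting it involves reading the whole file.
According to Cowlishaw's Rexx book, the built-in LINES function may return the number of lines remaining in the referenced file; if that cannot be determined, it returns 1 where a non-zero count would be appropriate and 0 otherwise. I use ooRexx a lot on Windows and can confirm that ooRexx does not count all the lines, it only returns 0 or 1. I read a file one line at a time like this: DO WHILE LINES(filename) > 0; PARSE VALUE LINEIN(filename) WITH ...; END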
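The result of Lines() is implementation-dependent. Some implementations return a count, others return just 1 or 0, as you observed. That is also why I prefer loading the file into a stem: Lines.0 is then an actual count.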
DO COUNTER=1 TO LINES(SOURCEFILE)
CALL SETCURSOR 4,23
CALL CREATEDATA LINEIN(SOURCEFILE)
END
RETURN /* prevent the fall through to createdata */
CREATEDATA:
parse arg CITY "," LAST_NAME "," FIRST_NAME "," MOM_NAME "," MIDDLE_NAME "," DAD_NAME "," DOB "," etc "," etc "," etc "," etc "," SEX "," etc "," etc
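Putting the suggestions from this thread together, a loop that does not depend on whether LINES() returns a count or just 0/1 might look like the sketch below. It is not the asker's final program: SAY stands in for the SETCURSOR, TYPE and PRESS calls, and the shortened field list follows the sample data rather than the real 15-field layout.

```rexx
/* Sketch only: read the file one line at a time and show the parsed fields. */
sourcefile = "C:\DATA\DATA.TXT"        /* path taken from the question */

do while lines(sourcefile) > 0         /* true while at least one line remains */
  call createdata linein(sourcefile)   /* pass the raw line as the argument */
end
exit                                   /* stop here so control does not fall into CREATEDATA */

createdata:
  /* Field names are assumptions based on the sample rows in the question. */
  parse arg city "," last_name "," first_name "," dob "," sex "," rest
  say "City:" city "Last:" last_name "First:" first_name "DOB:" dob "Sex:" sex
  return
```

If the file does not exist, LINES() returns 0 and the loop body is skipped, so the sketch ends without output.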