SAS: Looping over folders to import and export multiple files
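Thanks in advance for any suggestions.

This is my first time working in SAS, and I am trying to accomplish a (theoretically) simple task. I have a parent folder in a Windows directory that contains multiple sub-folders. The sub-folders are not systematically named. For example, if the parent folder is named "W:/Documents/ParentFolder/", the sub-folders might be "W:/Documents/ParentFolder/ABC1D26/" and "W:/Documents/ParentFolder/HG34A/".

Each sub-folder contains multiple SAS datasets. In any particular sub-folder, some of the SAS datasets have the .sas7bdat extension, while others have the .sd2 extension. Furthermore, no two sub-folders necessarily have the same number of datasets, and the datasets are not systematically named either.

I would like to write a program in SAS that looks in each sub-folder, loads any .sas7bdat or .sd2 datasets it finds, and exports them as .dta files to a different folder.

There are too many SAS datasets in each sub-folder to do this by hand for each dataset, but there are few enough sub-folders that I can feed the sub-folder names to SAS manually. Below is a commented version of my attempt at a program for this task. Unfortunately, I am running into many errors, no doubt because of my lack of experience with SAS.

For example, SAS gives the following errors: "ERROR: Invalid logical name", "ERROR: Error in the FILENAME statement", and "ERROR: Invalid DO loop control information".

Could anyone offer some advice?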
%macro sas_file_converter();
/* List the sub-folders containing SAS files in the parent folder */
%let folder1 = W:\Documents\ParentFolder\ABC1D26;
%let folder2 = W:\Documents\ParentFolder\HG34A;
/* Start loop over the sub-folders. In each sub-folder, identify all the files, extract the file names, import the files, and export the files. */
%do folder_iter = 1 %to 2;
/* Define the sub-folder that is the focus of this iteration of the loop */
filename workingFolder "&&folder&folder_iter..";
/* Extract a list of datasets in this sub-folder */
data datasetlist;
length Line 8 dataset_name $300;
List = dopen('workingFolder');
do Line = 1 to dnum(List);
dataset_name = tranwrd(tranwrd(lowcase(trim(dread(List,Line))),".sas7bdat",""),".sd2","");
output;
end;
drop List Line;
run;
/* Get number of datasets in this sub-folder */
proc sql nprint;
select count(*)
into :datasetCount
from WORK.datasetlist;
quit;
/* Loop over datasets in the sub-folder. In each iteration of the loop, load the dataset and export the dataset. */
%do dataset_iter = 1 %to &datasetCount.;
/* Get the name of the dataset which is the focus of this iteration */
data _NULL_;
set WORK.DATASETLIST (firstobs=&dataset_iter. obs=&dataset_iter.);
call symput("inMember",strip(dataset_name));
end;
/* Set the libname */
LIBNAME library '&folder&folder_iter..';
/* Load the dataset */
data new;
set library.&inMember.;
run;
/* Export the dataset */
proc export data=library.&inMember.
file = "W:\Documents\OutputFolder\&inMember..dta"
dbms = stata replace;
run;
%end;
%end;
%mend;
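For readers comparing the program above with the errors quoted in the question, the likely causes are: the fileref `workingFolder` is longer than the 8 characters SAS allows (hence "Invalid logical name" and the FILENAME error); `proc sql nprint` is not a valid option (it should be `noprint`), so `datasetCount` is never set and the inner `%do` fails; the single quotes in the LIBNAME statement stop the macro variables from resolving; and one DATA step is closed with `end;` instead of `run;`. The following is a minimal corrected sketch, not a tested drop-in replacement. The names `inDir` and `inLib` are made up for illustration, the paths are the ones from the question, and only .sas7bdat members are handled.

```sas
%macro sas_file_converter();
  %local folder1 folder2 folder_iter dataset_iter datasetCount inMember;
  %let folder1 = W:\Documents\ParentFolder\ABC1D26;
  %let folder2 = W:\Documents\ParentFolder\HG34A;

  %do folder_iter = 1 %to 2;
    /* Filerefs and librefs may be at most 8 characters long. */
    /* Double quotes are needed so the macro variables resolve. */
    filename inDir "&&folder&folder_iter";
    libname  inLib "&&folder&folder_iter";

    /* List the .sas7bdat members found in this sub-folder */
    data datasetlist;
      length dataset_name $300;
      did = dopen('inDir');
      do i = 1 to dnum(did);
        dataset_name = lowcase(dread(did, i));
        if scan(dataset_name, -1, '.') = 'sas7bdat' then do;
          dataset_name = scan(dataset_name, 1, '.');
          output;
        end;
      end;
      rc = dclose(did);
      keep dataset_name;
    run;

    /* Count the members; TRIMMED removes the leading blanks */
    proc sql noprint;
      select count(*) into :datasetCount trimmed from work.datasetlist;
    quit;

    %do dataset_iter = 1 %to &datasetCount;
      /* Fetch the name of the member for this iteration */
      data _null_;
        set work.datasetlist (firstobs=&dataset_iter obs=&dataset_iter);
        call symputx("inMember", dataset_name);
      run;

      /* Export straight from the library; no intermediate copy is needed */
      proc export data=inLib.&inMember
        outfile="W:\Documents\OutputFolder\&inMember..dta"
        dbms=stata replace;
      run;
    %end;

    libname inLib clear;
    filename inDir clear;
  %end;
%mend sas_file_converter;

%sas_file_converter()
```

The sketch keeps the nested `%do` structure of the original so the changes are easy to spot; the answers below avoid the macro loop altogether.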
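You can do all of the code generation in a DATA step and submit it via CALL EXECUTE. The only macro-related part of the program is specifying the SAS data root folder, the names of the sub-folders to search, and the export path.

The program could be coded very similarly as a macro, but that can be harder to debug and requires %sysfunc wrappers around the function calls.

For example: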
/* Create some sample data in some example folders */
%let workpath = %sysfunc(pathname(WORK));
%let name = %sysfunc(dcreate(ABC, &workpath));
%let name = %sysfunc(dcreate(DEF, &workpath));
libname user "&workpath./ABC";
data one two three four five;
set sashelp.class;
run;
libname user "&workpath./DEF";
data six seven eight nine ten;
set sashelp.class;
run;
libname user clear;
/* Export all data sets in the folders to like-named export files */
%let dataroot = &workpath;
%let folders = ABC DEF;
%let exportpath = c:\temp;
data _null_;
do findex = 1 to countw("&folders");
folder = scan("&folders", findex);
path = catx("/", "&dataroot.", folder);
call execute ('libname user ' || quote(trim(path)) || ';');
length fileref $8;
call missing(fileref);
rc = filename(fileref, path);
did = dopen(fileref);
do dindex = 1 to dnum(did);
filename = dread(did,dindex);
if scan(filename,-1) ne 'sas7bdat' then continue;
xptfilename = tranwrd(filename, '.sas7bdat', '.dta');
xptfilepath = catx('/', "&exportpath", xptfilename);
datasetname = tranwrd(filename, '.sas7bdat', '');
sascode = 'PROC EXPORT data=' || trim(datasetname)
|| " replace file=" || quote(trim(xptfilepath))
|| " dbms=stata;"
;
call execute (trim(sascode));
end;
did = dclose(did);
call execute ('run; libname user clear;');
rc = filename(fileref);
end;
run;
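Thank you very much for the suggestions. I used the following program to carry out this task. It is mostly based on Richard's example. I am posting it here for the benefit of future readers; Richard's example includes additional code that may help you understand what this program does.

Additional files/folders can be accommodated by adding them to the "%let folders" line. (I have written in a number of file/folder names here.)

Note that I separate the sub-folders with three dashes ("---") because some of the file and sub-folder names contain spaces. Also note that for the .sd2 files, I could simply replace the instances of "sas7bdat" with "sd2" and the program ran fine.

Thanks again.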
%let inputfolder = W:\Documents\ParentFolder;
%let folders = ABC1D26---HG34A---Sub Folder\ZH323;
%let exportfolder = W:\Documents\ExportFolder;
data _null_;
do findex = 1 to countw("&folders.","---");
folder = scan("&folders", findex, "---");
path = catx("/", "&inputfolder.", folder);
call execute ('libname user ' || quote(trim(path)) || ';');
length fileref $8;
call missing(fileref);
rc = filename(fileref, path);
did = dopen(fileref);
do dindex = 1 to dnum(did);
filename = lowcase(dread(did,dindex));
if scan(filename,-1) ne 'sas7bdat' then continue;
xptfilename = tranwrd(filename, '.sas7bdat', '.dta');
xptfilepath = catx("\", "&exportfolder.", folder, xptfilename);
datasetname = tranwrd(filename, '.sas7bdat', '');
sascode = 'PROC EXPORT data=' || trim(datasetname) || " replace file=" || quote(trim(xptfilepath)) || " dbms=stata; run;";
call execute (trim(sascode));
end;
did = dclose(did);
call execute ('libname user clear;');
rc = filename(fileref);
end;
run;
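A note on the "---" separator: the delimiter argument of COUNTW and SCAN is a list of single characters, not a string, so "---" behaves the same as "-". The folder list above therefore splits correctly only as long as no folder name contains a hyphen. A minimal sketch of an alternative, assuming a "|" separator (which cannot appear in a Windows file name) and the same folder names as above:

```sas
data _null_;
  length folder $200;
  /* Same folder list as above, separated by "|" instead of "---" */
  folders = "ABC1D26|HG34A|Sub Folder\ZH323";
  do findex = 1 to countw(folders, "|");
    /* Only "|" is a delimiter here, so spaces and hyphens stay in the name */
    folder = scan(folders, findex, "|");
    put findex= folder=;
  end;
run;
```

Passing the delimiter explicitly to both COUNTW and SCAN matters: with the default delimiter list, the space in "Sub Folder" and the backslash would also split the name.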
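Are you putting all of the DTA files into the same output folder? What happens if the same member exists in more than one source folder and causes a name conflict? If there are no name conflicts, then you only need to define one libref that points to all of the input folders.

First, you have a step that finds all of the folders/files, which is pretty simple. You go through that list to find out whether there are any duplicates and to get the names. Then it is a simple export, which I would probably invoke via CALL EXECUTE rather than a looping macro call; it is easier to debug and test when each component is tested separately. The initial file-listing code is taken from the SAS documentation, Macro appendix.

I just wanted to chime in: I really like your code. It is very clean, easy to read, and well formatted.

Reverse the order of the function calls here: trim(quote(xptfilepath)). Otherwise the trailing spaces end up inside the quotes.

Good catch @Tom. quote(trim( makes the codegen tighter.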
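The single-libref suggestion in the comments could look roughly like the following. This is a sketch under assumptions, not code from the thread: the libref name `indata` is made up, the paths are the ones from the question, and it only makes sense when member names do not clash, because a concatenated library exposes only the first occurrence of a given member name.

```sas
/* One libref that concatenates all of the input folders */
libname indata ("W:\Documents\ParentFolder\ABC1D26"
                "W:\Documents\ParentFolder\HG34A");

/* Generate one PROC EXPORT per data set found in the libref */
data _null_;
  set sashelp.vtable;
  where libname = 'INDATA' and memtype = 'DATA';
  length outfile $400;
  outfile = cats("W:\Documents\OutputFolder\", lowcase(memname), ".dta");
  call execute(catx(' ',
    'proc export data=indata.' || trim(memname),
    'outfile=' || quote(trim(outfile)),
    'dbms=stata replace; run;'));
run;

libname indata clear;
```

Reading the member list from SASHELP.VTABLE replaces the DOPEN/DREAD directory scan, at the cost of hiding duplicates: if the same member exists in two folders, only the one from the folder listed first is exported.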