SAS: pulling the strings before and after a keyword
I'm not sure whether this is possible in SAS, although I'm slowly learning that almost anything is possible in SAS. I have a data set of 600 patients, and in it I have a comment variable. The comment variable contains a few sentences that each patient said about their care. For example, the data set looks like this:
ID Comment
1 Today we have great service. everyone was really nice.
2 The customer service team did not know what they were talking about and was rude.
3 Everyone was very helpful 5 stars.
4 Not very helpful at all.
5 Staff was nice.
6 All the people was really nice.
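Suppose I've identified a few keywords I'm interested in, for example nice, rude, and helpful.
Is there a way to pull out the two strings (words) in front of each of these keywords and produce a frequency table?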
WORD Frequency
Was Really Nice 2
And Was Rude 1
Was Very Helpful 1
Not very helpful 1
I've already written code that helps me identify the keywords. It creates a frequency count of every word in the comment variable:
data PG_2 / view=PG_2;
    length word $20;
    set PG_1;
    do i = 1 by 1 until(missing(word));
        word = upcase(scan(COMMENT, i));
        if not missing(word) then output;
    end;
    keep word;
run;
proc freq data=PG_2 order=freq;
    table word / out=wordfreq(drop=percent);
run;
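Have you looked at the Perl regular expression (PRX) functions in SAS? I think they might solve your problem.
You can use a regular expression capture group with prxparse and prxposn to directly extract the two words preceding the keyword. The following should grab any two words before nice in the comment variable and add them to the firstTwoStrings variable: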
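data firstTwoStrings;
    length firstTwoStrings $200;
    retain re;
    /* compile the pattern once, on the first iteration */
    if _N_ = 1 then
        re = prxparse('/(\w+ \w+) nice/'); /* change 'nice' to your desired keyword */
    set comments; /* input data set (PG_1 in the question) */
    /* capture group 1 holds the two words before the keyword */
    if prxmatch(re, COMMENT) then
        firstTwoStrings = prxposn(re, 1, COMMENT);
    drop re;
run;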
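Or use something like the following to search for all of the interesting words at once: '/(\w+ \w+ (nice|rude|helpful))/'. Also consider using PRXNEXT for multiple hits in one string.
@Ben_Corcoran Thanks, this looks great, going to test it now! Sorry for not replying sooner. As soon as I posted this I came down with a stomach bug and have been out of action for the past 3 days.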
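For anyone landing here later, the two suggestions from the comments (one pattern covering all the keywords, plus PRXNEXT to catch several hits in one comment) could be combined roughly as below. This is only a sketch, not the answerer's code: the sample rows are copied from the question, the data set names phrases and phrasefreq are made up for illustration, and the pattern adds the /i flag and \s+ on the assumption that case and extra spaces should not matter.

```sas
/* Sample rows from the question; PG_1 and COMMENT are the names used there */
data PG_1;
    infile datalines truncover;
    input ID COMMENT $200.;
    datalines;
1 Today we have great service. everyone was really nice.
2 The customer service team did not know what they were talking about and was rude.
3 Everyone was very helpful 5 stars.
4 Not very helpful at all.
5 Staff was nice.
6 All the people was really nice.
;
run;

/* One output row per match: two words plus the keyword (hypothetical data set "phrases") */
data phrases;
    set PG_1;
    length phrase $200;
    retain re;
    /* compile once; capture group 1 = two preceding words and the keyword */
    if _N_ = 1 then
        re = prxparse('/(\w+\s+\w+\s+(?:nice|rude|helpful))\b/i');
    start = 1;
    stop  = length(COMMENT);
    /* PRXNEXT advances start past each match, so repeated calls walk the string */
    call prxnext(re, start, stop, COMMENT, pos, len);
    do while (pos > 0);
        phrase = upcase(prxposn(re, 1, COMMENT));
        output;
        call prxnext(re, start, stop, COMMENT, pos, len);
    end;
    keep ID phrase;
run;

/* Frequency table of the extracted phrases (hypothetical output "phrasefreq") */
proc freq data=phrases order=freq;
    tables phrase / out=phrasefreq(drop=percent);
run;
```

On the sample rows, comments 1 and 6 both yield WAS REALLY NICE, and comment 5 yields STAFF WAS NICE, which the table in the question does not list. A keyword with fewer than two words in front of it will not match this pattern at all, so those cases would need separate handling.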