Regex - Extracting the characters between specific phrases

regex sas

I only need to extract the units of measurement from blood test text, i.e. "k/uL", "m/uL", "%" and so on in the text below:

WBC                            4.27-11.40 k/uL                        3.64 (L)
RBC                            3.90-5.03 m/uL                         4.30
Hemoglobin                     10.6-13.4 g/dL                         13.0
Hematocrit                     32.2-39.8 %                            36.1
MCV                            74.4-87.6 fL                           84.0
MCH                            24.8-29.5 pG                           30.2 (H)
MCHC                           31.8-34.9 g/dL                         36.0 (H)
RDW-CV                         12.2-14.4 %                            13.2
Platelet Count                 150-400 k/uL                           175
MPV                            9.2-11.4 fL                            8.6 (L)
Neut%                          28.6-74.5 %                            43.1
Abs Neut (ANC)                 1.63-7.87 k/uL                         1.57 (L)
Lymph%                         15.5-57.8 %                            43.7
Abs Lymph                      0.97-4.28 k/uL                         1.59
Mono%                          4.2-12.3 %                             9.3
Abs Mono                       0.19-0.85 k/uL                         0.34
Eosin%                         0.0-4.7 %                              3.6
Abs Eosin                      0.00-0.52 k/uL                         0.13
Baso%                          0.0-0.7 %                              0.3
Abs Baso                       0.00-0.06 k/uL                         0.01

This means I need to recognize the pattern '-' + number + unit and extract the unit.

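I tried using a negative look-behind expression

(?), which is meant to say: if there is a '-' character followed by a floating-point number, match only the non-digits. However, it produces zero matches.
Note that I am using SAS (i.e. Perl regular expressions).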
Use
\d+-\d[\d.]*\s*\K\S+

Explanation
--------------------------------------------------------------------------------
  \d+                      digits (0-9) (1 or more times (matching
                           the most amount possible))
--------------------------------------------------------------------------------
  -                        '-'
--------------------------------------------------------------------------------
  \d                       digits (0-9)
--------------------------------------------------------------------------------
  [\d.]*                   any character of: digits (0-9), '.' (0 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  \K                       match reset operator
--------------------------------------------------------------------------------
  \S+                      non-whitespace (all but \n, \r, \t, \f,
                           and " ") (1 or more times (matching the
                           most amount possible))
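
For readers who want to try the pattern outside SAS, here is a minimal sketch in Python. It is an illustration, not part of the original answer. Python's standard `re` module has no `\K`, so the sketch swaps in a capture group around `\S+`, which returns the same text that `\K` would leave as the match. The names `UNIT_AFTER_RANGE` and `report` are made up for this example, and the sample lines are copied from the question.

```python
import re

# Same pattern as in the answer, but with a capture group in place of \K:
# the reference range (e.g. "4.27-11.40") is matched and discarded,
# and only the following run of non-whitespace (the unit) is captured.
UNIT_AFTER_RANGE = re.compile(r"\d+-\d[\d.]*\s*(\S+)")

# A few lines taken from the blood test text in the question.
report = """\
WBC                            4.27-11.40 k/uL                        3.64 (L)
Hematocrit                     32.2-39.8 %                            36.1
MCH                            24.8-29.5 pG                           30.2 (H)
Platelet Count                 150-400 k/uL                           175
"""

# findall returns the contents of the single capture group for each match.
units = UNIT_AFTER_RANGE.findall(report)
print(units)  # ['k/uL', '%', 'pG', 'k/uL']
```

The result values in the last column (such as 3.64 or 175) are never picked up, because none of them contains a digit immediately followed by '-'. The same capture-group form can be used in any engine where `\K` turns out to be unavailable.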

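Try \d+(?:\.\d+)(?:-\d+)?\s*\K[^\s\d]+
().
Might this help? Do it like this: \d\s(.*?:(?:\s |$)
, see \d+(?:\.\d+)+(?:-\d+(?:-\d+)(?:\..\d+)+)+)\s*\K\d*[^\d*.[\s]
\w++?
Is the original blood test result text on a single line in a single variable?
Does SAS not support the "\K" operator?
@ShakedNave As far as I know, SAS uses PCRE, which supports \K.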