Regex: extracting the characters between specific phrases

I only need to extract the units of measurement from blood test text, i.e. "k/uL", "m/uL", "%" and so on in the following text:

WBC                            4.27-11.40 k/uL                        3.64 (L)
RBC                            3.90-5.03 m/uL                         4.30
Hemoglobin                     10.6-13.4 g/dL                         13.0
Hematocrit                     32.2-39.8 %                            36.1
MCV                            74.4-87.6 fL                           84.0
MCH                            24.8-29.5 pG                           30.2 (H)
MCHC                           31.8-34.9 g/dL                         36.0 (H)
RDW-CV                         12.2-14.4 %                            13.2
Platelet Count                 150-400 k/uL                           175
MPV                            9.2-11.4 fL                            8.6 (L)
Neut%                          28.6-74.5 %                            43.1
Abs Neut (ANC)                 1.63-7.87 k/uL                         1.57 (L)
Lymph%                         15.5-57.8 %                            43.7
Abs Lymph                      0.97-4.28 k/uL                         1.59
Mono%                          4.2-12.3 %                             9.3
Abs Mono                       0.19-0.85 k/uL                         0.34
Eosin%                         0.0-4.7 %                              3.6
Abs Eosin                      0.00-0.52 k/uL                         0.13
Baso%                          0.0-0.7 %                              0.3
Abs Baso                       0.00-0.06 k/uL                         0.01
This means I need to recognize the sequence '-' + number + unit in order to extract the unit.

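I tried using a negative look-behind expression
(?), which is meant to match only the non-digit characters when they are preceded by a '-' character followed by a floating-point number, but it produces zero matches.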

Note that I am using SAS (i.e. Perl regular expressions).

Use

\d+-\d[\d.]*\s*\K\S+

Explanation

--------------------------------------------------------------------------------
  \d+                      digits (0-9) (1 or more times (matching
                           the most amount possible))
--------------------------------------------------------------------------------
  -                        '-'
--------------------------------------------------------------------------------
  \d                       digits (0-9)
--------------------------------------------------------------------------------
  [\d.]*                   any character of: digits (0-9), '.' (0 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  \K                       match reset operator
--------------------------------------------------------------------------------
  \S+                      non-whitespace (all but \n, \r, \t, \f,
                           and " ") (1 or more times (matching the
                           most amount possible))
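
As a rough illustration of how the pattern above behaves on the sample lines, here is a minimal Python sketch. Python's re module does not implement \K, so the sketch swaps in a capture group around \S+ instead; the names UNIT_PATTERN and extract_unit are hypothetical and not part of the original answer. A capture group also avoids relying on \K support in whichever engine is actually used.

```python
import re

# Same idea as \d+-\d[\d.]*\s*\K\S+, but with a capture group in place of \K,
# because Python's re module does not support the \K match-reset operator.
UNIT_PATTERN = re.compile(r"\d+-\d[\d.]*\s*(\S+)")


def extract_unit(line):
    """Return the token that follows the 'low-high' reference range, or None."""
    match = UNIT_PATTERN.search(line)
    return match.group(1) if match else None


# A few lines copied from the question's sample text.
sample = """\
WBC                            4.27-11.40 k/uL                        3.64 (L)
Hematocrit                     32.2-39.8 %                            36.1
Platelet Count                 150-400 k/uL                           175
"""

units = [extract_unit(line) for line in sample.splitlines()]
print(units)
```

The hyphen inside a test name such as RDW-CV is not mistaken for a range, because the pattern requires a digit immediately before the '-'.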

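Try
\d+(?:\.\d+)(?:-\d+)?\s*\K[^\s\d]+

Might this help? Do it like this:
\d\s(.*?:(?:\s |$)

See
\d+(?:\.\d+)+(?:-\d+(?:-\d+)(?:\..\d+)+)+)\s*\K\d*[^\d*.[\s]
\w++

Is the original blood test result text on a single line in a single variable?

Does SAS not support the "\K" operator?

@ShakedNave As far as I know, SAS uses PCRE, which supports
\K
.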