Regex: extracting the characters between specific phrases
I only need to extract the units of measurement from blood test text, i.e. "k/uL", "m/uL", "%" and so on in the text below:
WBC 4.27-11.40 k/uL 3.64 (L)
RBC 3.90-5.03 m/uL 4.30
Hemoglobin 10.6-13.4 g/dL 13.0
Hematocrit 32.2-39.8 % 36.1
MCV 74.4-87.6 fL 84.0
MCH 24.8-29.5 pG 30.2 (H)
MCHC 31.8-34.9 g/dL 36.0 (H)
RDW-CV 12.2-14.4 % 13.2
Platelet Count 150-400 k/uL 175
MPV 9.2-11.4 fL 8.6 (L)
Neut% 28.6-74.5 % 43.1
Abs Neut (ANC) 1.63-7.87 k/uL 1.57 (L)
Lymph% 15.5-57.8 % 43.7
Abs Lymph 0.97-4.28 k/uL 1.59
Mono% 4.2-12.3 % 9.3
Abs Mono 0.19-0.85 k/uL 0.34
Eosin% 0.0-4.7 % 3.6
Abs Eosin 0.00-0.52 k/uL 0.13
Baso% 0.0-0.7 % 0.3
Abs Baso 0.00-0.06 k/uL 0.01
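This means I need to identify the pattern '-' + number + unit in order to extract the unit.
I tried a negative look-behind expression (?), intended to match only non-digits when they are preceded by a '-' character followed by a floating-point number, but it produces zero matches.
Note that I am using SAS (i.e. Perl regular expressions).

Use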
\d+-\d[\d.]*\s*\K\S+
Explanation
--------------------------------------------------------------------------------
\d+ digits (0-9) (1 or more times (matching
the most amount possible))
--------------------------------------------------------------------------------
- '-'
--------------------------------------------------------------------------------
\d digits (0-9)
--------------------------------------------------------------------------------
[\d.]* any character of: digits (0-9), '.' (0 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
\s* whitespace (\n, \r, \t, \f, and " ") (0 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
\K match reset operator
--------------------------------------------------------------------------------
\S+ non-whitespace (all but \n, \r, \t, \f,
and " ") (1 or more times (matching the
most amount possible))
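
As a rough illustration of the pattern above, here is a minimal Python sketch. Python's built-in re module does not implement \K, so this version swaps in a capture group around \S+, which is the usual substitute in any engine where \K is unavailable. The names UNIT_AFTER_RANGE, extract_unit and sample_lines are made up for this example, and the sample lines are copied from the question.

```python
import re

# Same pattern as in the answer, but with a capture group instead of \K,
# because Python's re module has no \K (match reset) operator.
UNIT_AFTER_RANGE = re.compile(r"\d+-\d[\d.]*\s*(\S+)")


def extract_unit(line):
    """Return the token that follows the reference range, or None if no range is found."""
    match = UNIT_AFTER_RANGE.search(line)
    return match.group(1) if match else None


# A few lines taken from the blood test text in the question.
sample_lines = [
    "WBC 4.27-11.40 k/uL 3.64 (L)",
    "Hematocrit 32.2-39.8 % 36.1",
    "RDW-CV 12.2-14.4 % 13.2",
    "Abs Neut (ANC) 1.63-7.87 k/uL 1.57 (L)",
]

units = [extract_unit(line) for line in sample_lines]
print(units)  # ['k/uL', '%', '%', 'k/uL']
```

The hyphen in a test name such as "RDW-CV" is not mistaken for a range, because the pattern requires digits on both sides of the '-'. In SAS, the same capture-group variant could presumably be read back with PRXPOSN after a successful PRXMATCH, though that has not been tested here.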
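
Try \d+(?:\.\d+)(?:-\d+)?\s*\K[^\s\d]+
Might this help? Do it like this: \d\s(.*?:(?:\s |$)
See \d+(?:\.\d+)+(?:-\d+(?:-\d+)(?:\..\d+)+)+)\s*\K\d*[^\d*.[\s]\w++ ?
Is the original blood test result text on a single line in a single variable?
Does SAS not support the "\K" operator?
@ShakedNave As far as I know, SAS uses PCRE, which supports \K.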