Python regex for a number with or without decimals, using a dot or comma as separator?

python regex

I'm just learning regex, and I'm now trying to match a number that more or less looks like this:

[zero or more numbers][possibly a dot or comma][zero or more numbers]

Having no dot or comma is fine too. So it should match the following:

1
123
123.
123.4
123.456
.456
123,  # From here it's the same but with commas instead of dot separators
123,4
123,456
,456


However, it should not match the following:

0.,1
0a,1
0..1
1.1.2
100,000.99  # I know this and the one below are valid in many languages, but I simply want to reject these
100.000,99


So far I've come up with `[0-9]*[.,][0-9]*`, but it doesn't seem to work very well:

>>> import re
>>> r = re.compile(r"[0-9]*[.,][0-9]*")
>>> if r.match('0.1.'): print('it matches!')
...
it matches!
>>> if r.match('0.abc'): print('it matches!')
...
it matches!

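I feel I'm doing two things wrong: I'm not using match correctly, and my regex is not correct either. Could anybody tell me what I'm doing wrong? All tips are welcome.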

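You need to add `?` after that character class to make the `[.,]` part optional, and don't forget to add anchors: `^` asserts that we are at the start, and `$` asserts that we are at the end.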

^\d*[.,]?\d*$
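As a quick check of the anchored pattern, here is a minimal sketch that runs it against some of the sample strings from the question. The names `NUMBER`, `should_match` and `should_reject` are only illustrative.

```python
import re

# Anchored pattern from the answer above: optional digits, at most one
# separator, optional digits, and nothing else before the end.
NUMBER = re.compile(r'^\d*[.,]?\d*$')

should_match = ['1', '123', '123.', '123.4', '.456', '123,', '123,456', ',456']
should_reject = ['0.,1', '0a,1', '0..1', '1.1.2', '100,000.99', '100.000,99']

matched = [s for s in should_match if NUMBER.match(s)]
rejected = [s for s in should_reject if not NUMBER.match(s)]

print(matched == should_match)    # True
print(rejected == should_reject)  # True

# Caveat: every part of the pattern is optional, so the empty string
# and a lone separator are accepted as well.
print([s for s in ['', '.', ','] if NUMBER.match(s)])  # ['', '.', ',']
```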

If you don't want to allow a lone comma or dot, use a lookahead:

^(?=.*?\d)\d*[.,]?\d*$
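To see what the lookahead changes, the following sketch compares the pattern with and without it. `LOOSE` and `STRICT` are names chosen here for illustration only.

```python
import re

LOOSE = re.compile(r'^\d*[.,]?\d*$')
# (?=.*?\d) requires at least one digit somewhere in the string.
STRICT = re.compile(r'^(?=.*?\d)\d*[.,]?\d*$')

for s in ['', '.', ',', '.5', '5,', '12,5']:
    print(repr(s), bool(LOOSE.match(s)), bool(STRICT.match(s)))
```

Only the first three inputs (the empty string and the lone separators) are treated differently: the loose pattern accepts them, the strict one does not.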

How about:

(?:^|[^\d,.])\d*(?:[,.]\d+)?(?:$|[^\d,.])

If the empty string should not be accepted:

(?:^|[^\d,.])\d+(?:[,.]\d+)?(?:$|[^\d,.])
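This pattern is not anchored to the whole string, so it behaves more like a finder than a validator. The sketch below, which assumes it is used with `re.search` (the answer does not say), shows what it picks up. `FIND` is an illustrative name.

```python
import re

FIND = re.compile(r'(?:^|[^\d,.])\d+(?:[,.]\d+)?(?:$|[^\d,.])')

for s in ['123', '123,4', '123.', '.456', '1.1.2', '0a,1', 'x12,5y']:
    m = FIND.search(s)
    # group(0) includes the boundary characters around the number, if any.
    print(repr(s), m.group(0) if m else None)
```

Note that `123.` and `.456` are not matched, although the question wants them accepted, and that `0a,1` yields a match on `0a`, because a non-digit boundary character is allowed after the number. For whole-string validation, one of the anchored patterns may be a better fit.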

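The problem is that you are asking for a partial match, as long as it starts at the beginning.

One way around this is to end the regex in `\Z` (optionally `$`). `\Z` matches only at the end of the string.

Another way is to use `re.fullmatch` instead. For comparison, this is what `re.match` does: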

import re
help(re.match)
#>>> Help on function match in module re:
#>>>
#>>> match(pattern, string, flags=0)
#>>>     Try to apply the pattern at the start of the string, returning
#>>>     a match object, or None if no match was found.
#>>>

Note that `fullmatch` is new in Python 3.4.
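The difference between the three options can be sketched as follows, using the question's pattern with the separator already made optional. The variable name `pattern` is illustrative.

```python
import re

pattern = r'[0-9]*[.,]?[0-9]*'

# re.match only anchors at the start, so a matching prefix is enough.
print(bool(re.match(pattern, '0.abc')))           # True
# re.fullmatch requires the whole string to match.
print(bool(re.fullmatch(pattern, '0.abc')))       # False
print(bool(re.fullmatch(pattern, '0.1')))         # True
# Appending \Z to the pattern has the same effect with re.match.
print(bool(re.match(pattern + r'\Z', '0.abc')))   # False

# $ also matches just before a trailing newline, \Z does not.
print(bool(re.match(pattern + r'$', '0.1\n')))    # True
print(bool(re.match(pattern + r'\Z', '0.1\n')))   # False
```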

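You should also make the `[.,]` part optional, so append a `?` to it.

`?` causes the resulting RE to match 0 or 1 repetitions of the preceding RE. `ab?` will match either 'a' or 'ab'.

For example: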
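The answer's own example did not survive here, so the following is only a small sketch of the `?` quantifier in its place. The names `required` and `optional` are made up for the illustration.

```python
import re

# 'ab?' matches 'a' or 'ab', but not 'abb'.
print(bool(re.fullmatch(r'ab?', 'a')))    # True
print(bool(re.fullmatch(r'ab?', 'ab')))   # True
print(bool(re.fullmatch(r'ab?', 'abb')))  # False

# Applied to the question's pattern: without '?', a separator is mandatory.
required = re.compile(r'[0-9]*[.,][0-9]*')
optional = re.compile(r'[0-9]*[.,]?[0-9]*')
print(bool(required.fullmatch('123')))  # False
print(bool(optional.fullmatch('123')))  # True
```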

Try this. It validates all the cases. See the demo.

Some ideas for validating a non-empty match:

1.) Use a lookahead to check for at least one digit:

^(?=.?\d)\d*[.,]?\d*$

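`^` to `$`: from the start to the end of the string.
`(?=.?\d)` matches if the string starts like `,1`, `1`, ...
`\d*[.,]?\d*` is the allowed sequence: `\d*` any number of digits, followed by one optional `[.,]`, followed by `\d*`.
Note that the first `.` inside the lookahead is a metacharacter that stands for any character, whereas the one inside the character class `[.,]` matches a literal `.`.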

Instead of the positive lookahead, a negative lookahead can also be used:

^(?!\D*$)\d*[.,]?\d*$
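A short sketch to check that the positive and the negative lookahead variants agree on a handful of inputs. `POSITIVE`, `NEGATIVE` and `samples` are illustrative names, and the sample list is not exhaustive.

```python
import re

# Positive lookahead: a digit must appear after at most one other character.
POSITIVE = re.compile(r'^(?=.?\d)\d*[.,]?\d*$')
# Negative lookahead: the string must not consist of non-digits only.
NEGATIVE = re.compile(r'^(?!\D*$)\d*[.,]?\d*$')

samples = ['1', '123.', '.456', ',456', '123,4',
           '', '.', ',', '0..1', '1.1.2', '0a,1']

for s in samples:
    assert bool(POSITIVE.match(s)) == bool(NEGATIVE.match(s))

accepted = [s for s in samples if POSITIVE.match(s)]
print(accepted)  # ['1', '123.', '.456', ',456', '123,4']
```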

2.) Use two different patterns:

^(?:\d+[.,]\d*|[.,]?\d+)$

```
（？：
```
为替换启动一个
```
\d+[，]\d*
```
用于匹配
```
1.
```
，
```
1,1
```
，…
```
|
```
或
```
[，]？\d+
```
用于匹配
```
1
```
，
```
，1
```
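The alternation can be tried out with a sketch like the one below. `ALT` is an illustrative name.

```python
import re

# Either digits + separator + optional digits,
# or optional separator + digits.
ALT = re.compile(r'^(?:\d+[.,]\d*|[.,]?\d+)$')

accepted = [s for s in ['1', '1.', '1,1', ',1', '.456', '123,456']
            if ALT.match(s)]
rejected = [s for s in ['', '.', ',', '0..1', '1.1.2', '0a,1']
            if not ALT.match(s)]

print(accepted)  # ['1', '1.', '1,1', ',1', '.456', '123,456']
print(rejected)  # ['', '.', ',', '0..1', '1.1.2', '0a,1']
```

Because both alternatives contain `\d+`, at least one digit is always required and no lookahead is needed.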

Your regex will work fine if you simply add `^` at the front and `$` at the end, so that the engine knows where the string has to begin and end.

Try this:

^[0-9]*[.,]{0,1}[0-9]*$

import re

checklist = ['1', '123', '123.', '123.4', '123.456', '.456', '123,', '123,4', '123,456', ',456', '0.,1', '0a,1', '0..1', '1.1.2', '100,000.99', '100.000,99', '0.1.', '0.abc']

# ^ and $ force the pattern to cover the whole string
pat = re.compile(r'^[0-9]*[.,]{0,1}[0-9]*$')

for c in checklist:
    if pat.match(c):
        print('%s : it matches' % c)
    else:
        print('%s : it does not match' % c)

1 : it matches
123 : it matches
123. : it matches
123.4 : it matches
123.456 : it matches
.456 : it matches
123, : it matches
123,4 : it matches
123,456 : it matches
,456 : it matches
0.,1 : it does not match
0a,1 : it does not match
0..1 : it does not match
1.1.2 : it does not match
100,000.99 : it does not match
100.000,99 : it does not match
0.1. : it does not match
0.abc : it does not match

If two decimal places are mandatory, you can use the following:

^((\d){1,3},*){1,5}\.(\d){2}$

This will match the following patterns:

1.00
10.00
100.00
1000.00
10000.00
100000.00
1000000.00
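The sketch below exercises this pattern, including the repeated-comma case that the `,*` part lets through. `STRICTER` is a hypothetical variant that is not part of the answer; it assumes that commas, when present, must be followed by exactly three digits.

```python
import re

TWO_DECIMALS = re.compile(r'^((\d){1,3},*){1,5}\.(\d){2}$')

for s in ['1.00', '1000.00', '1000000.00', '100,000.00', '1.0', '1']:
    print(repr(s), bool(TWO_DECIMALS.match(s)))

# ',*' allows any number of commas after each digit group.
print(bool(TWO_DECIMALS.match('12,,,,,,,,,,,,.34')))  # True

# Hypothetical stricter variant (an assumption, not from the answer):
# each comma must be followed by exactly three digits.
STRICTER = re.compile(r'^\d+(?:,\d{3})*\.\d{2}$')
print(bool(STRICTER.match('100,000.00')))         # True
print(bool(STRICTER.match('1000.00')))            # True
print(bool(STRICTER.match('12,,,,,,,,,,,,.34')))  # False
```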

A more general approach is the following:

import re
r=re.compile(r"^\d\d*[,]?\d*[,]?\d*[.,]?\d*\d$")
print(bool(r.match('100,000.00')))

This will match the following patterns:

100
1000
100.00
1000.00
100,000
100000.00

This will not match the following patterns:

.100
..100
100.100.00
,100
100,
100.
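A sketch for checking the more general pattern against inputs of this kind. `GENERAL`, `accepted` and `rejected` are illustrative names, and the input lists are an assumption about what the answer intended.

```python
import re

GENERAL = re.compile(r'^\d\d*[,]?\d*[,]?\d*[.,]?\d*\d$')

accepted = [s for s in ['100', '1000', '100.00', '1000.00',
                        '100,000', '100,000.00']
            if GENERAL.match(s)]
rejected = [s for s in ['.100', '..100', '100.100.00', ',100', '100,', '100.']
            if not GENERAL.match(s)]

print(accepted)
print(rejected)

# The pattern needs a leading and a trailing digit, so a single digit fails,
# and consecutive separators are still accepted.
print(bool(GENERAL.match('1')))      # False
print(bool(GENERAL.match('1,,.2')))  # True
```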

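Yours allows a single `.` or `,` :p but I don't know whether that case should validate. :)
Thieeef! Plus one for the one that works. Of course, a little explanation would be nice.
@Jonny5 The answer that got 2 has this flaw too. Yours is actually the best and should be the accepted one. But this world is unfair :(
What does [possibly a dot or comma] mean? If you have any doubt, ask the OP for clarification.
The OP did not mention a lone comma or dot in the list of non-matches.
I already told you: if you have any doubt, ask the OP.
Nice that you mention this case in your answer :)
This matches 12,,,,,,,,,,,,.34
This matches 1,,.2
Yes, cases such as repeated commas or dots need to be prevented. I am matching against text extracted with OCR for validation. Since such cases are very unlikely in OCR text, this works for me. It is not perfect for validating user input. Thanks @Toto for your feedback.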