Java: ignore the space after the first capture if the group after it is not captured
I want to capture dates whose month is written out in letters. They can take any of the following forms:
- Jan 1, 2013
- January 1, 2013
- Jan 1st
- January 1st
- 1st Jan 2013
- January
They also occur inside sentences, for example:
"Can we meet sometime in the afternoon in January?"
I am using the following regular expressions in Java:
((?<month>jan(uary)?|feb(ruary)?|mar(ch)?|apr(il)?|may|jun(e)?|jul(y)?|aug(ust)?|sep(t?|tember)?|oct(ober)?|nov(ember)?|dec(ember)?)((\\s+)?(?<date>\\d+)?(st|nd|rd|th))?(\\s+)?,?(\\s+)?(?<year>(20)\\d\\d)?)
((?<date>\\d+)?(st|nd|rd|th)?\\s+(?<month>jan(uary)?|feb(ruary)?|mar(ch)?|apr(il)?|may|jun(e)?|jul(y)?|aug(ust)?|sep(t?|tember)?|oct(ober)?|nov(ember)?|dec(ember)?)(\\s+)?,?(?<year>(19|20)\\d\\d)?)
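After matching, I need to report the exact position of the token in the string.
When I look at the index returned by Matcher.end(), my expression appears to capture the space after "January" as well. I do want to capture expressions such as "Jan 1st", but the space should be consumed only when the next capture group can actually match.
Is it possible to modify the regular expressions above to do this?
Expanding the pattern for readability: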
(
(?<month>
jan(uary)?
| feb(ruary)?
| mar(ch)?
| apr(il)?
| may
| jun(e)?
| jul(y)?
| aug(ust)?
| sep(t?|tember)?
| oct(ober)?
| nov(ember)?
| dec(ember)?
)
(
(\\s+)?
(?<date>\\d+)?
(st|nd|rd|th)
)?
(\\s+)?
,?
(\\s+)?
(?<year>(20)\\d\\d)?
)
Combining it with the second pattern:
\\b
(
(?<month1>
jan(uary)?
| feb(ruary)?
| mar(ch)?
| apr(il)?
| may
| jun(e)?
| jul(y)?
| aug(ust)?
| sep(t|tember)?
| oct(ober)?
| nov(ember)?
| dec(ember)?
)
(
\\s*
(?<date1>\\d+)
(st|nd|rd|th)?
)?
|
(?<date2>\\d+)
(st|nd|rd|th)?
\\s*
(?<month2>
jan(uary)?
| feb(ruary)?
| mar(ch)?
| apr(il)?
| may
| jun(e)?
| jul(y)?
| aug(ust)?
| sep(t|tember)?
| oct(ober)?
| nov(ember)?
| dec(ember)?
)
)
(
\\s*
,?
\\s*
(?<year>(19|20)\\d\\d)
)?
\\b
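The expanded layout above can be fed to Java almost as written if the pattern is compiled with Pattern.COMMENTS, which tells the engine to skip whitespace inside the pattern. The following is a minimal sketch under that assumption; the class name ExpandedPatternDemo, the helper firstDate, and the sample sentences are illustrative and not part of the original answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExpandedPatternDemo {
    // Month alternation, laid out with spaces as in the expanded pattern.
    private static final String MONTH =
          "jan(uary)? | feb(ruary)? | mar(ch)? | apr(il)? | may | jun(e)? | jul(y)? "
        + "| aug(ust)? | sep(t|tember)? | oct(ober)? | nov(ember)? | dec(ember)?";

    // Pattern.COMMENTS makes the engine ignore the layout whitespace and newlines;
    // CASE_INSENSITIVE lets "Jan" and "January" match the lower-case alternatives.
    static final Pattern DATE = Pattern.compile(
          "\\b\n"
        + "(\n"
        + "  (?<month1> " + MONTH + " )\n"
        + "  ( \\s* (?<date1>\\d+) (st|nd|rd|th)? )?\n"
        + "|\n"
        + "  (?<date2>\\d+) (st|nd|rd|th)? \\s*\n"
        + "  (?<month2> " + MONTH + " )\n"
        + ")\n"
        + "( \\s* ,? \\s* (?<year>(19|20)\\d\\d) )?\n"
        + "\\b",
        Pattern.COMMENTS | Pattern.CASE_INSENSITIVE);

    // Returns the first date-like substring of text, or null if there is none.
    static String firstDate(String text) {
        Matcher m = DATE.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstDate("Meet on 1st Jan 2013."));
        System.out.println(firstDate("Can we meet sometime in January afternoon?"));
    }
}
```

The group names month1/month2 and date1/date2 differ because Java does not allow the same group name twice in one pattern.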
As a single line:
\\b(?<month>jan(uary)?|feb(ruary)?|mar(ch)?|apr(il)?|may|jun(e)?|jul(y)?|aug(ust)?|sep(t|tember)?|oct(ober)?|nov(ember)?|dec(ember)?)(\\s*(?<date>\\d+)(st|nd|rd|th)?)?(\\s*,?\\s*(?<year>(19|20)\\d\\d))?\\b
\\b((?<month1>jan(uary)?|feb(ruary)?|mar(ch)?|apr(il)?|may|jun(e)?|jul(y)?|aug(ust)?|sep(t|tember)?|oct(ober)?|nov(ember)?|dec(ember)?)(\\s*(?<date1>\\d+)(st|nd|rd|th)?)?|(?<date2>\\d+)(st|nd|rd|th)?\\s*(?<month2>jan(uary)?|feb(ruary)?|mar(ch)?|apr(il)?|may|jun(e)?|jul(y)?|aug(ust)?|sep(t|tember)?|oct(ober)?|nov(ember)?|dec(ember)?))(\\s*,?\\s*(?<year>(19|20)\\d\\d))?\\b
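To see the effect on Matcher.end(), the sketch below compares the month-first pattern from the question with the revised month-first one-liner on a sentence that contains only a month name. The class name DateSpanDemo, the span helper, and the sample sentences are assumptions made for illustration; the month alternation is shared between both patterns for brevity.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DateSpanDemo {
    private static final String MONTH =
          "jan(uary)?|feb(ruary)?|mar(ch)?|apr(il)?|may|jun(e)?|jul(y)?"
        + "|aug(ust)?|sep(t|tember)?|oct(ober)?|nov(ember)?|dec(ember)?";

    // Month-first pattern from the question: (\\s+)? sits outside the optional
    // date and year groups, so it can match even when nothing follows the space.
    static final Pattern ORIGINAL = Pattern.compile(
          "((?<month>" + MONTH + ")"
        + "((\\s+)?(?<date>\\d+)?(st|nd|rd|th))?(\\s+)?,?(\\s+)?(?<year>(20)\\d\\d)?)",
        Pattern.CASE_INSENSITIVE);

    // Revised month-first pattern: whitespace lives inside the optional groups,
    // so it is consumed only when a date or year actually follows.
    static final Pattern REVISED = Pattern.compile(
          "\\b(?<month>" + MONTH + ")"
        + "(\\s*(?<date>\\d+)(st|nd|rd|th)?)?"
        + "(\\s*,?\\s*(?<year>(19|20)\\d\\d))?\\b",
        Pattern.CASE_INSENSITIVE);

    // Returns {start, end} of the first match, or null when nothing matches.
    static int[] span(Pattern p, String text) {
        Matcher m = p.matcher(text);
        return m.find() ? new int[] {m.start(), m.end()} : null;
    }

    public static void main(String[] args) {
        String text = "Can we meet sometime in January afternoon?";
        System.out.println(Arrays.toString(span(ORIGINAL, text)));
        System.out.println(Arrays.toString(span(REVISED, text)));
    }
}
```

With this input the original pattern's match ends one character later than the revised one, because the trailing space after "January" is swallowed by the free-standing (\\s+)?.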
Another version:
static private String month = "(?<month>jan(uary)?|feb(ruary)?|mar(ch)?|apr(il)?|may|jun(e)?|jul(y)?|aug(ust)?|sep(t|tember)?|oct(ober)?|nov(ember)?|dec(ember)?)";
static private String suffix = "(?:st|nd|rd|th)";
static private String date = "(?<date>\\d{1,2})";
static private String year = "(?<year>\\d{4})";
// A month name (optionally followed by space followed by a date (optionally
// followed by a suffix or space and a comma) (optionally followed by space
// followed by a year))
static private String order1 = String.format(
"%s(?:\\s+%s(?:%s|\\s+,)?(?:\\s+%s)?)?", month, date, suffix,
year);
// A date followed by a suffix followed by a month (optionally followed by
// space and a comma) optionally followed by space and a year
static private String order2 = String.format(
"%s%s\\s+%s(?:\\s+,)?(?:\\s+%s)?", date, suffix, month, year);
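Or, as plain regular expressions without the Java code (the output of String.format for order1 and order2):
(?<month>jan(uary)?|feb(ruary)?|mar(ch)?|apr(il)?|may|jun(e)?|jul(y)?|aug(ust)?|sep(t|tember)?|oct(ober)?|nov(ember)?|dec(ember)?)(?:\s+(?<date>\d{1,2})(?:(?:st|nd|rd|th)|\s+,)?(?:\s+(?<year>\d{4}))?)?
(?<date>\d{1,2})(?:st|nd|rd|th)\s+(?<month>jan(uary)?|feb(ruary)?|mar(ch)?|apr(il)?|may|jun(e)?|jul(y)?|aug(ust)?|sep(t|tember)?|oct(ober)?|nov(ember)?|dec(ember)?)(?:\s+,)?(?:\s+(?<year>\d{4}))?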
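At first glance, it looks like you did the right thing. You enclosed the whole date part, including (\\s+), in another set of parentheses followed by ?, so that if the date does not match, the (\\s+) does not get sucked in either. However, you did not do the same for the comma between the date and the year. Try putting a set of parentheses around (\\s+)?,?(\\s+)?(?<year>...), with a ? after the new parentheses, and see if that solves your problem. If not, I will try to look more closely.
To be clear, I mean changing (\\s+)?,? to ((\\s+)?,)?, i.e. moving the last ? outside the new group you added.
But I noticed that the space is consumed even when only the month is specified. With "Jan ", for example, the space gets sucked in.
I take that back. I think it will work once I correct all the whitespace expressions so that each one is part of a group. Thanks ajb :)
One more thing occurred to me: you should not use a pattern like (\\s+)?, which optionally matches one or more whitespace characters. \\s*, which matches zero or more whitespace characters, is equivalent and easier to read. The only difference is that when there are zero spaces and you query the group on the matcher, (\\s+)? and (\\s*) return different things from group(), start(), or end(), but in most cases this should not matter.
Thanks for your answer. I cannot do this in code, because I have to keep the patterns in configuration. But thank you for taking the time to point out the caveats.
@user1411335 I added the regular expressions without the Java code in case that helps you. I also fixed the code blocks; they were not displaying correctly.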
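The difference between (\\s+)? and (\\s*) mentioned in the comments shows up only when the group is queried after matching zero whitespace characters. A small sketch, assuming a toy pattern a...b matched against "ab" (the class and helper names are illustrative):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OptionalSpaceDemo {
    // Matches "ab" with the given separator group between 'a' and 'b'.
    private static Matcher match(String separatorGroup) {
        Matcher m = Pattern.compile("a" + separatorGroup + "b").matcher("ab");
        if (!m.matches()) {
            throw new IllegalStateException("pattern should match \"ab\"");
        }
        return m;
    }

    // What group(1) reports for the separator group.
    static String group1(String separatorGroup) {
        return match(separatorGroup).group(1);
    }

    // What start(1) reports for the separator group.
    static int start1(String separatorGroup) {
        return match(separatorGroup).start(1);
    }

    public static void main(String[] args) {
        // (\s+)? does not participate in the match: group is null, start is -1.
        System.out.println(group1("(\\s+)?") + " " + start1("(\\s+)?"));
        // (\s*) participates with an empty match: group is "", start is 1.
        System.out.println("[" + group1("(\\s*)") + "] " + start1("(\\s*)"));
    }
}
```

Code that checks group(n) != null to decide whether a separator was present therefore behaves differently under the two spellings, even though the overall match is the same.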
static private String month = "(?<month>jan(?:uary)?|feb(?:ruary)?|mar(?:ch)?|apr(?:il)?|may|jun(?:e)?|jul(?:y)?|aug(?:ust)?|sep(?:t|tember)?|oct(?:ober)?|nov(?:ember)?|dec(?:ember)?)";
static private String suffix = "(?:st|nd|rd|th)";
static private String date = "(?<date>\\d{1,2})";
static private String year = "(?<year>\\d{4})";
// A month name (optionally followed by space followed by a date (optionally
// followed by a suffix)(optionally followed by a comma, possibly with space
// before it)(optionally followed by space followed
// by a year))
static private String v1 = String.format(
"%s(?:\\s+%s%s?(?:\\s*,)?(?:\\s+%s)?)?", month, date, suffix, year);
// A date (optionally followed by a suffix) followed by space followed by a
// month (optionally followed by
// a comma, possibly with space before it) optionally followed by space and
// a year
static private String v2 = String.format(
"%s%s?\\s+%s(?:\\s*,)?(?:\\s+%s)?", date, suffix, month, year);
The same two patterns, v1 and v2, as plain regular expressions:
(?<month>jan(?:uary)?|feb(?:ruary)?|mar(?:ch)?|apr(?:il)?|may|jun(?:e)?|jul(?:y)?|aug(?:ust)?|sep(?:t|tember)?|oct(?:ober)?|nov(?:ember)?|dec(?:ember)?)(?:\s+(?<date>\d{1,2})(?:st|nd|rd|th)?(?:\s*,)?(?:\s+(?<year>\d{4}))?)?
(?<date>\d{1,2})(?:st|nd|rd|th)?\s+(?<month>jan(?:uary)?|feb(?:ruary)?|mar(?:ch)?|apr(?:il)?|may|jun(?:e)?|jul(?:y)?|aug(?:ust)?|sep(?:t|tember)?|oct(?:ober)?|nov(?:ember)?|dec(?:ember)?)(?:\s*,)?(?:\s+(?<year>\d{4}))?
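One possible way to use these two patterns is to try the date-first form before the month-first form and read the named groups from whichever matches. The sketch below assumes case-insensitive matching and that trial order; the class name NamedGroupDemo and the extract helper are hypothetical. Because neither pattern contains \b, the month-first form tried first could stop partway through a number (for example on "Jan 2013"), which is why the date-first form goes first here.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroupDemo {
    private static final String MONTH =
          "(?<month>jan(?:uary)?|feb(?:ruary)?|mar(?:ch)?|apr(?:il)?|may|jun(?:e)?"
        + "|jul(?:y)?|aug(?:ust)?|sep(?:t|tember)?|oct(?:ober)?|nov(?:ember)?|dec(?:ember)?)";

    // Month first, e.g. "January", "Jan 1st", "Jan 1, 2013".
    static final Pattern MONTH_FIRST = Pattern.compile(
        MONTH + "(?:\\s+(?<date>\\d{1,2})(?:st|nd|rd|th)?(?:\\s*,)?(?:\\s+(?<year>\\d{4}))?)?",
        Pattern.CASE_INSENSITIVE);

    // Date first, e.g. "1st Jan 2013".
    static final Pattern DATE_FIRST = Pattern.compile(
        "(?<date>\\d{1,2})(?:st|nd|rd|th)?\\s+" + MONTH + "(?:\\s*,)?(?:\\s+(?<year>\\d{4}))?",
        Pattern.CASE_INSENSITIVE);

    // Returns {month, date, year}; parts that were not present are null.
    // Returns null when neither pattern matches.
    static String[] extract(String text) {
        for (Pattern p : new Pattern[] {DATE_FIRST, MONTH_FIRST}) {
            Matcher m = p.matcher(text);
            if (m.find()) {
                return new String[] {m.group("month"), m.group("date"), m.group("year")};
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(extract("See you on 1st Jan 2013")));
        System.out.println(Arrays.toString(extract("January 5, 2014")));
        System.out.println(Arrays.toString(extract("Can we meet in January?")));
    }
}
```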