Java: regular expression pattern for characters in different languages


java regex


I want to write a regular expression for the following strings:

/en-us/newyork/stores.storelocation.json
/es/colecciones/víveres.storelocation.json
/es/colección/víveres%C3%ADa.storelocation.json
/fr/collections/magasins.storelocation.json
/ja/%E5%95%86%E5%93%81%E3%82%AB%E3%83%86%E3%82%B4%E3%83%AA%E3%83%BC/%E3%82%B8%E3%83%A5%E3%82%A8%E3%83%AA%E3%83%BC.storelocation.json

I wrote this one for English:

\/en-us\/[a-zA-Z]+\/[a-zA-Z]+.storelocation.json

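But the problem is that it does not work for other languages such as French, Chinese, or Russian. If I replace

[a-zA-Z]

with

[\w]

then it considers every character in the hierarchy.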

The static part of the string is ".storelocation.json", and the hierarchy will always follow the form "/language/location/stores.storelocation.json".

Can anyone help me? I want a regular expression that matches all of the strings above.

Instead of

[a-zA-Z]

use

\p{L}

to match Unicode letters.
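As a minimal sketch of the difference, the snippet below compares the two character classes on an accented word. The class name UnicodeLetterDemo and the sample word are illustrative only and do not come from the original post.

```java
import java.util.regex.Pattern;

public class UnicodeLetterDemo {
    // ASCII letters only, as in the question's first attempt.
    static final Pattern ASCII_LETTERS = Pattern.compile("[a-zA-Z]+");
    // Any Unicode letter (general category L).
    static final Pattern UNICODE_LETTERS = Pattern.compile("\\p{L}+");

    public static void main(String[] args) {
        String word = "v\u00EDveres"; // "víveres", with a precomposed i-acute
        System.out.println(ASCII_LETTERS.matcher(word).matches());   // false
        System.out.println(UNICODE_LETTERS.matcher(word).matches()); // true
    }
}
```

Note that \p{L} covers letters only. Digits, hyphens, and combining marks (a letter followed by a separate accent character) are not matched by it and would need their own alternatives if they can occur in the path.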

You can use this regex:

^/\p{L}{2}(?:-\p{L}{2})?/(?:\p{L}|%[A-F\d]{2})+/(?:\p{L}|%[A-F\d]{2})+\.storelocation\.json$

In Java, use:

// Only the backslashes are doubled for the Java string literal; braces stay unescaped.
final String regex =
"^/\\p{L}{2}(?:-\\p{L}{2})?/(?:\\p{L}|%[A-F\\d]{2})+/(?:\\p{L}|%[A-F\\d]{2})+\\.storelocation\\.json$";

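To check the pattern against the sample paths from the question, something like the following could be used. It is only a sketch: the class name StoreLocationMatcher and the helper isStoreLocation are hypothetical, and the accented characters are written as \u escapes so the source compiles regardless of file encoding.

```java
import java.util.regex.Pattern;

public class StoreLocationMatcher {
    // Same regex as above: a two-letter language code with an optional region,
    // then two path segments made of Unicode letters or %XX escapes.
    static final Pattern PATTERN = Pattern.compile(
        "^/\\p{L}{2}(?:-\\p{L}{2})?/(?:\\p{L}|%[A-F\\d]{2})+/(?:\\p{L}|%[A-F\\d]{2})+\\.storelocation\\.json$");

    // Hypothetical helper: true if the whole path matches the pattern.
    static boolean isStoreLocation(String path) {
        return PATTERN.matcher(path).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "/en-us/newyork/stores.storelocation.json",
            "/es/colecciones/v\u00EDveres.storelocation.json",
            "/es/colecci\u00F3n/v\u00EDveres%C3%ADa.storelocation.json",
            "/fr/collections/magasins.storelocation.json",
            "/ja/%E5%95%86%E5%93%81%E3%82%AB%E3%83%86%E3%82%B4%E3%83%AA%E3%83%BC/"
                + "%E3%82%B8%E3%83%A5%E3%82%A8%E3%83%AA%E3%83%BC.storelocation.json"
        };
        for (String s : samples) {
            System.out.println(isStoreLocation(s) + "  " + s);
        }
    }
}
```

The character class [A-F\d] accepts only uppercase hex digits, which is what the sample URLs contain. If lowercase escapes such as %c3%ad can also occur, [A-Fa-f\d] would be needed instead.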
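What have you tried so far? Show your source code.

I have just edited the question, please check.

Hi Anubhava, thanks for your reply. It works fine except for URLs that contain percent-encoded values, such as /ja/%E5%95%86%E5%93%81%E3%82%AB%E3%83%86%E3%82%B4%E3%83%AA%E3%83%BC/%E3%82%B8%E3%83%A5%E3%82%A8%E3%83%AA%E3%83%BC.storelocation.json

Thanks a lot, Anubhava.

About the pattern: I am not sure why you use "?:", because the pattern also works fine if I remove it.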

(?:

is a non-capturing group. It only saves some memory while the regex is evaluated.
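A small sketch of what that means in practice: both forms match the same input, but only the capturing form stores the matched text for later retrieval. The class name GroupDemo and the shortened patterns are illustrative, not taken from the original answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupDemo {
    // Capturing group: the optional region part is stored as group 1.
    static final Pattern CAPTURING = Pattern.compile("^/\\p{L}{2}(-\\p{L}{2})?/");
    // Non-capturing group: same matching behaviour, nothing is stored.
    static final Pattern NON_CAPTURING = Pattern.compile("^/\\p{L}{2}(?:-\\p{L}{2})?/");

    public static void main(String[] args) {
        Matcher c = CAPTURING.matcher("/en-us/newyork");
        Matcher n = NON_CAPTURING.matcher("/en-us/newyork");

        // lookingAt() matches at the start of the input without requiring a full match.
        System.out.println(c.lookingAt() + " groups=" + c.groupCount()); // true groups=1
        System.out.println(n.lookingAt() + " groups=" + n.groupCount()); // true groups=0
        System.out.println(c.group(1)); // -us
    }
}
```

So removing "?:" does not change which strings match; it only makes the regex engine keep track of an extra group that is never read.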