Splitting a string of token=value pairs in Java with a regular expression
I have this string:
token1=value1Token2=value2Token3[12]=value3
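where tokenX can be a string that contains digits (for example: myToken12 or my2Token)
and valueX consists only of digits or symbols (for example: 123123 or {{1,2},3,4}).
I want to convert it into this array: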
['token1=value1', 'token2=value2', 'token3[12]=value3']
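An example of a string I might have:
String s = "na23me=12341234las4tName={{0,0,0},{0,0,0},{0,0,0}}stree2t[696]=764545457OK";
I tried using split and Matcher.
A similar question has already been posted, but this one is different because it is more general (token=value); in the earlier question the values were only a number or a different set of symbols.
I would like a general answer here.
Thanks
More details: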
In the end I also need to handle a quoted "STRING" as a value, but that shouldn't be a big issue since it is enclosed in double quotes:
['conf=0', 'ticket[0,9]="TEST"', 'config={0,0,0}', 'platform_id=121212']
Using this string:
String s = "na23me=12341234las4tName=654567stree2t[696]=764545457OK";
this solution works:
String[] tokens = s.split("(?<==\\d{1,1000})(?=[a-zA-Z])");
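To see how far that pattern reaches, here is a minimal sketch that runs it against both sample strings from the question. The class name SimpleSplitDemo and the helper split are made up for illustration; only the regex itself comes from the post.

```java
import java.util.Arrays;

class SimpleSplitDemo {
    // The pattern from the question: split where "=<digits>" precedes
    // the current position and an ASCII letter follows it.
    static final String DIGITS_ONLY = "(?<==\\d{1,1000})(?=[a-zA-Z])";

    // Hypothetical helper, named for this sketch only.
    static String[] split(String s) {
        return s.split(DIGITS_ONLY);
    }

    public static void main(String[] args) {
        String digitValues =
            "na23me=12341234las4tName=654567stree2t[696]=764545457OK";
        String braceValues =
            "na23me=12341234las4tName={{0,0,0},{0,0,0},{0,0,0}}stree2t[696]=764545457OK";

        // Every value is a plain number, so every boundary is found.
        System.out.println(Arrays.toString(split(digitValues)));

        // The brace value does not end in "=<digits>", so the boundary
        // between las4tName's value and stree2t is missed.
        System.out.println(Arrays.toString(split(braceValues)));
    }
}
```

With the digit-only string the four expected pieces come back. With the brace-valued string only three pieces come back, because las4tName={{0,0,0},{0,0,0},{0,0,0}} and stree2t[696]=764545457 stay glued together, which is why a more general pattern is needed.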
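For the more general input I tried this:
String[] tokens = buffer.split("(?<==\\d{1,1000}|\\W|\"\\w\")(?=[a-zA-Z])");
It doesn't work. Any ideas?

It looks like your input string keeps getting more complex. Here is a regex that seems to work for all the inputs you have provided: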
void getTokens(String s) {
    String[] toks = s.split( "(?<==(?>\"[^\"=]{1,1000}\"|\\P{L}{1,1000})) *(?=\\p{L})" );
    for (String tok: toks)
        System.out.printf("=> <%s>%n", tok);
}
Testing:
getTokens("conf=0ticket[0,9]=\"TEST\"config={0,0,0}platform_id=121212");
=> <conf=0>
=> <ticket[0,9]="TEST">
=> <config={0,0,0}>
=> <platform_id=121212>
getTokens("na23me=12341234las4tName={{0,0,0},{0,0,0},{0,0,0}}stree2t[696]=764545457OK");
=> <na23me=12341234>
=> <las4tName={{0,0,0},{0,0,0},{0,0,0}}>
=> <stree2t[696]=764545457>
=> <OK>
getTokens("na23me=12341234las4tName=654567stree2t[696]=764545457OK");
=> <na23me=12341234>
=> <las4tName=654567>
=> <stree2t[696]=764545457>
=> <OK>
Explanation:
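The regex uses a lookbehind and a lookahead to find the split points (the patterns below are written as they appear in the Java string literal):
(?<==(?>\"[^\"=]{1,1000}\"|\\P{L}{1,1000})) is a positive lookbehind that makes sure the current position is preceded by = followed by one of:
- a double-quoted string of at most 1000 characters
- 1 to 1000 characters that are not Unicode letters
(?>foo|bar) is called an atomic group: once one alternative has matched, the engine does not go back into the group to try the other one.
The " *" in the middle consumes any spaces between a value and the next token.
(?=\\p{L}) is a positive lookahead that makes sure the current position is followed by a Unicode letter.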
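The pieces described above can be put together in a small self-contained program. This is only a sketch: the class name TokenSplitter, the constant names, and the List-returning variant of getTokens are assumptions made for illustration, while the assembled regex is the same one used in the answer.

```java
import java.util.Arrays;
import java.util.List;

class TokenSplitter {
    // Value alternatives allowed after '=' inside the lookbehind.
    static final String QUOTED = "\"[^\"=]{1,1000}\"";   // "…" up to 1000 chars
    static final String NON_LETTERS = "\\P{L}{1,1000}";  // 1..1000 non-letters

    // Lookbehind: '=' plus one value (atomic group), then optional
    // spaces, then a lookahead for the first letter of the next token.
    static final String SPLIT_REGEX =
        "(?<==(?>" + QUOTED + "|" + NON_LETTERS + ")) *(?=\\p{L})";

    // Variation on the answer's getTokens that returns the tokens
    // instead of printing them (an assumption of this sketch).
    static List<String> getTokens(String s) {
        return Arrays.asList(s.split(SPLIT_REGEX));
    }

    public static void main(String[] args) {
        String[] inputs = {
            "conf=0ticket[0,9]=\"TEST\"config={0,0,0}platform_id=121212",
            "na23me=12341234las4tName={{0,0,0},{0,0,0},{0,0,0}}stree2t[696]=764545457OK",
            "na23me=12341234las4tName=654567stree2t[696]=764545457OK"
        };
        for (String input : inputs) {
            for (String tok : getTokens(input)) {
                System.out.printf("=> <%s>%n", tok);
            }
        }
    }
}
```

Building the pattern from named parts makes it easier to add another value form later, as long as each alternative keeps an explicit upper bound, since Java requires a lookbehind to have a bounded length.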
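(?i)(?
So the example input is: token1=111Token2=222Token3[12]=33my2Token=444?
@Kasper In your example from a few comments back, token1={{0,0,1},2,2}Token2={{0,0,1},2,2} Token3[12]={0,0,1},2,2}my2Token={0,0,1},2,2}, you have a space between {0,0,1},2,2} and Token3, but in the description in your question I don't see a space between value1 and Token2. So is your data description wrong, or your example?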