Python: How do I correctly group the items in a string?
I currently have a set of strings that look like this:
[58729 58708]
[58729]
[58708]
[58729]
I need to convert them to lists, but when I use list(), each string is broken up into its individual characters.
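A minimal reproduction of that behavior, using one of the strings above (the variable names s and chars are only for illustration):

```python
s = '[58729 58708]'

# list() iterates over a string one character at a time,
# so brackets, digits and the space all become separate items.
chars = list(s)
print(chars)
# ['[', '5', '8', '7', '2', '9', ' ', '5', '8', '7', '0', '8', ']']
```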
How can I group the items so that they are not split into individual characters? For example, I would like to get:
['58729', '58708']
['58729']
['58708']
['58729']
Assume your input string is assigned to the variable foo:
foo = '[58729 58708]'
First, use slicing to strip the brackets from the start and end of the string:
foo = foo[1:-1]
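Now you can use the string method split() to turn the string into a list. The argument to split() is the delimiter on which the string is split; in your case that is a single space character: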
foo.split(' ')
This returns
['58729', '58708']
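The two steps above can be wrapped in a small helper and applied to every string from the question. This is only a sketch: the function name bracketed_to_list and the bracket check are assumptions added for illustration, not part of the answer itself.

```python
def bracketed_to_list(text):
    """Turn a string such as '[58729 58708]' into ['58729', '58708']."""
    text = text.strip()
    # Guard against input that is not wrapped in square brackets.
    if not (text.startswith('[') and text.endswith(']')):
        raise ValueError(f'expected a bracketed string, got {text!r}')
    # Drop the brackets, then split on the single-space separator.
    return text[1:-1].split(' ')


strings = ['[58729 58708]', '[58729]', '[58708]', '[58729]']
results = [bracketed_to_list(s) for s in strings]
for r in results:
    print(r)
# ['58729', '58708']
# ['58729']
# ['58708']
# ['58729']
```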
An example using the ast module:
import ast
data_str = '[58729 58708]'
data_str = data_str.replace(' ',',') # make it '[58729, 58708]'
x = ast.literal_eval(data_str)
print(x)
Out[1]:
[58729, 58708]
print(x[0])
Out[2]:
58729
print(type(x))
Out[3]:
<class 'list'>
# and after all if you want exactly list of string:
[str(s) for s in x]
Out[4]:
['58729', '58708']
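Note that replace(' ', ',') relies on the items being separated by exactly one space. A slightly more tolerant variant is sketched below; the helper name parse_bracketed_ints is hypothetical, and it assumes every item is a valid Python literal (for example, an integer without leading zeros).

```python
import ast


def parse_bracketed_ints(text):
    """Parse '[58729 58708]' into [58729, 58708] via ast.literal_eval."""
    # Take the text between the brackets and split on any run of whitespace.
    inner = text.strip()[1:-1]
    # Rebuild a valid Python list literal with comma separators.
    normalized = '[' + ', '.join(inner.split()) + ']'
    return ast.literal_eval(normalized)


values = parse_bracketed_ints('[58729    58708]')
print(values)                    # [58729, 58708]
print([str(v) for v in values])  # ['58729', '58708']
```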
You can use a regex to extract the values between the square brackets, then split those values into a list.
Code:
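A minimal sketch of this idea, which is not necessarily the answerer's own script; the pattern and the variable names text, match and items are assumptions chosen for illustration:

```python
import re

text = '[58729 58708]'

# Capture everything between one pair of square brackets.
match = re.search(r'\[([^\[\]]*)\]', text)

# Split the captured content on whitespace; fall back to an empty list
# if the string contains no bracketed group.
items = match.group(1).split() if match else []
print(items)
```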
Result:
>>> %Run string2list.py
['58729', '58708']
In my opinion, the best approach is to combine a regular expression with a small parser:
from parsimonious.grammar import Grammar
from parsimonious.nodes import NodeVisitor
import re

data = """
[58729 58708]
[58729]
[58708]
[58729]
"""

# outer expression: matches one bracketed group without nested brackets
rx = re.compile(r'\[[^\[\]]+\]')

# NodeVisitor subclass that turns the parse tree into a list of strings
class StringVisitor(NodeVisitor):
    grammar = Grammar(
        r"""
        list    = lpar content+ rpar
        content = item ws?
        item    = ~"[^\[\]\s]+"
        ws      = ~"\s+"
        lpar    = "["
        rpar    = "]"
        """
    )

    def generic_visit(self, node, visited_children):
        return visited_children or node

    def visit_content(self, node, visited_children):
        item, _ = visited_children
        return item.text

    def visit_list(self, node, visited_children):
        _, content, _ = visited_children
        return [item for item in content]

sv = StringVisitor()

for lst in rx.finditer(data):
    real_list = sv.parse(lst.group(0))
    print(real_list)
This yields
['58729', '58708']
['58729']
['58708']
['58729']
'[58729 58708]'[1:-1].split()
['58729', '58708']
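Calling split() with no argument splits on any run of whitespace, which makes this one-liner more forgiving than split(' ') when the spacing is irregular. A short comparison, using a made-up input with a double space and a trailing space:

```python
s = '[58729  58708 ]'  # hypothetical input: two spaces, plus a trailing space
inner = s[1:-1]

# split(' ') treats every single space as a separator and keeps empty strings.
with_space = inner.split(' ')
print(with_space)   # ['58729', '', '58708', '']

# split() collapses runs of whitespace and ignores leading/trailing whitespace.
without_arg = inner.split()
print(without_arg)  # ['58729', '58708']
```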