Python: splitting a string on at least n spaces

python python-3.x string list

I have the following data, which I cannot change:

data = """
-5,-2   -52.565           
-5,-1   -48.751           
-5, 0   -47.498           
-5, 1   -48.751          - 
-5, 2   -52.565          
"""

I want to separate the columns into two lists, i.e.:

list1 = ['-5,-2', '-5,-1', '-5, 0', '-5, 1', '-5, 2']
list2 = ['-52.565', '-48.751', '-47.498', '-48.751', '-52.565']

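For now, I am interested in splitting each line correctly:

lines = [l for l in data.splitlines()]

print(lines[2].split())
print(lines[3].split())

['-5,-1', '-48.751']

['-5,', '0', '-47.498']

As you can see, line [3] is not split correctly, because there is a space between '-5,' and '0'. To work around this, I tried splitting on runs of two or more whitespace characters with re.split.

That works for the first column entry '-5, 0', but it also adds an empty string at the end of the list:

['-5, 0', '-47.498', '']

How can I solve this? Maybe there is a better way to split.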

Edit:

If I use

print(re.split(r'\s{2,}', lines[3], maxsplit=1))

I get (the trailing whitespace stays attached to the second item):

['-5, 0', '-47.498           ']

Try stripping the line before splitting:

print(re.split(r'\s{2,}', lines[3].strip(), maxsplit=1))
# or
print(re.split(r'\s{2,}', lines[3].strip()))
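
A minimal sketch of why the empty string appears and why stripping removes it. The sample line mimics one row of the question's data; the number of trailing spaces and the variable names are assumptions made for illustration.

```python
import re

# One row shaped like the question's data: 3 spaces between the columns
# and a run of trailing spaces (the exact count is an assumption).
line = "-5, 0" + " " * 3 + "-47.498" + " " * 11

# Without strip(): the trailing run of spaces is itself a separator,
# so re.split() returns an empty string after it.
unstripped = re.split(r"\s{2,}", line)

# With strip(): no trailing separator is left, so only the two fields remain.
stripped = re.split(r"\s{2,}", line.strip())

print(unstripped)  # ['-5, 0', '-47.498', '']
print(stripped)    # ['-5, 0', '-47.498']
```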

Use re to split each line into columns, then use the zip function to separate the columns into two groups:

import re
data = """
-5,-2   -52.565           
-5,-1   -48.751           
-5, 0   -47.498           
-5, 1   -48.751          - 
-5, 2   -52.565          
"""
columns = [re.split(r'\s{2,}', line.strip()) for line in data.splitlines() if line.strip()]
print(columns)
first, second = map(list, zip(*columns))
print(first)
print(second)

Output:

[['-5,-2', '-52.565'], ['-5,-1', '-48.751'], ['-5, 0', '-47.498'], ['-5, 1', '-48.751', '-'], ['-5, 2', '-52.565']]
['-5,-2', '-5,-1', '-5, 0', '-5, 1', '-5, 2']
['-52.565', '-48.751', '-47.498', '-48.751', '-52.565']
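
The same idea can be generalized to "at least n" whitespace characters, as the title asks. The sketch below is only illustrative: `split_min_spaces` and its parameter `n` are hypothetical names, not part of the answer above, and the sample data is a shortened copy of the question's data.

```python
import re

def split_min_spaces(line, n=2):
    """Split `line` on runs of at least `n` whitespace characters (hypothetical helper)."""
    # strip() first so leading/trailing whitespace does not produce empty fields.
    return re.split(r"\s{%d,}" % n, line.strip())

data = """
-5,-2   -52.565
-5, 0   -47.498
-5, 2   -52.565
"""

# Skip blank lines, split each remaining line into its columns.
rows = [split_min_spaces(line) for line in data.splitlines() if line.strip()]

# zip(*rows) transposes rows into columns.
first, second = map(list, zip(*rows))
print(first)   # ['-5,-2', '-5, 0', '-5, 2']
print(second)  # ['-52.565', '-47.498', '-52.565']

# With n=3 a two-space gap is no longer a separator.
two_space_gap = "a" + " " * 2 + "b" + " " * 3 + "c"
print(split_min_spaces(two_space_gap, n=3))  # ['a  b', 'c']
```

Note that zip stops at the shortest row, which is why the stray '-' in the fourth row of the output above never reaches the two final lists.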

Try this:

data = """
    -5,-2   -52.565           
    -5,-1   -48.751           
    -5, 0   -47.498           
    -5, 1   -48.751          - 
    -5, 2   -52.565          
"""
lines = [l.strip() for l in data.splitlines()]
list1 = []
list2 = []
for line in lines:
    if not line:
        continue
    columns = line.split(' '*3)
    list1.append(columns[0])
    list2.append(columns[1])
print(list1)
print(list2)
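
One caveat with a fixed three-space separator is that a wider gap produces empty or space-prefixed pieces. The sketch below illustrates this and one possible way to tolerate it; `split_fixed` is a hypothetical helper, and the sample lines are built with explicit space counts chosen only for the demonstration.

```python
# Build sample lines with an explicit number of spaces between the columns.
line_3 = "-5, 0" + " " * 3 + "-47.498"   # exactly three spaces
line_6 = "-5, 0" + " " * 6 + "-47.498"   # six spaces

print(line_3.split(" " * 3))  # ['-5, 0', '-47.498']
print(line_6.split(" " * 3))  # ['-5, 0', '', '-47.498']

def split_fixed(line, n=3):
    """Split on exactly n spaces, then drop empty or whitespace-only pieces."""
    pieces = line.strip().split(" " * n)
    return [p.strip() for p in pieces if p.strip()]

print(split_fixed(line_6))  # ['-5, 0', '-47.498']
```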

If you want to do it without regular expressions, in an overly complicated and more verbose way, you can do this:

data = """
-5,-2   -52.565           
-5,-1   -48.751           
-5, 0   -47.498           
-5, 1   -48.751          - 
-5, 2   -52.565          
"""

jumbled_fields = data.split("\n")

divided = list()
for n in range(len(jumbled_fields)):
    for split_field in jumbled_fields[n].split("   "):
        if split_field != "" and split_field[0] != " ":
            divided.append(split_field)

first = list()
second = list()
for n in range(len(divided)):
    if n % 2 == 0:
        first.append(divided[n])
    else:
        second.append(divided[n])
print(first)  # ['-5,-2', '-5,-1', '-5, 0', '-5, 1', '-5, 2']
print(second)  # ['-52.565', '-48.751', '-47.498', '-48.751', '-52.565']

Explanation of the regular expression (sorry, I am trying to type this formatting on a phone)

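re.split takes an additional parameter, maxsplit. Set it to 1 and you will always split into two columns. I suggest stripping the result afterwards to remove trailing whitespace.

@Arthur.V Ah, that is what I was going to ask you. OK, I will give it a try. :)

There are spaces after the second number, which is why you get the empty string. You can use .strip() to remove the whitespace before and after the numbers.

@mx0 True, but unfortunately I cannot change the data :(

Just keep the first two items of the list returned by re.split.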
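
The two suggestions from the comments can be sketched side by side. The helper names and the exact space counts in the sample lines are assumptions for illustration; the second sample mimics the row in the data that carries a stray trailing "-".

```python
import re

clean_line = "-5, 0" + " " * 3 + "-47.498" + " " * 11
stray_line = "-5, 1" + " " * 3 + "-48.751" + " " * 10 + "- "

def split_maxsplit(line):
    # Split only once, then strip whitespace from both pieces.
    parts = re.split(r"\s{2,}", line.strip(), maxsplit=1)
    return [part.strip() for part in parts]

def split_first_two(line):
    # Split on every wide gap, then keep only the first two fields.
    return re.split(r"\s{2,}", line.strip())[:2]

print(split_maxsplit(clean_line))   # ['-5, 0', '-47.498']
print(split_first_two(clean_line))  # ['-5, 0', '-47.498']

# On the row with the stray "-", only the slice discards the extra field;
# with maxsplit=1 the "-" stays attached to the second piece.
print(split_first_two(stray_line))  # ['-5, 1', '-48.751']
print(split_maxsplit(stray_line)[1].endswith("-"))  # True
```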

import re

data = """
-5,-2   -52.565           
-5,-1   -48.751           
-5, 0   -47.498           
-5, 1   -48.751          - 
-5, 2   -52.565          
"""

# First column, e.g. '-5,-2' or '-5, 0'
patternA = re.compile(r'(-\d+,[\s-]\d+)')
listA = patternA.findall(data)

# Second column: a negative decimal preceded by whitespace
patternB = re.compile(r'\s(-\d+\.\d+)')
listB = patternB.findall(data)

print(listA)
print(listB)

In re.compile:
\d+     -  Number with one or more digits
\s      -  Matches a single whitespace character
[\s-]   -  Matches a whitespace character or -
( )     -  Capturing group; this is the part returned by findall
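
As a variation on the same findall idea, both columns could be captured by a single pattern with two groups, so each row is matched as a pair. This is only a sketch: `row_pattern`, `list_a` and `list_b` are illustrative names, and the pattern assumes every row looks like the sample data (an integer pair, a gap, then a decimal).

```python
import re

data = """
-5,-2   -52.565
-5,-1   -48.751
-5, 0   -47.498
-5, 1   -48.751          -
-5, 2   -52.565
"""

# Group 1: first column, e.g. "-5,-2" or "-5, 0"
# Group 2: second column, a decimal with an optional sign, e.g. "-52.565"
row_pattern = re.compile(r"(-?\d+,[\s-]?\d+)\s+(-?\d+\.\d+)")

# With two groups, findall returns a list of (group1, group2) tuples.
pairs = row_pattern.findall(data)
list_a = [a for a, _ in pairs]
list_b = [b for _, b in pairs]

print(list_a)  # ['-5,-2', '-5,-1', '-5, 0', '-5, 1', '-5, 2']
print(list_b)  # ['-52.565', '-48.751', '-47.498', '-48.751', '-52.565']
```

Because the pattern only matches the two expected fields, the stray "-" at the end of the fourth row is ignored without any extra filtering.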