Python: split a string on at least n spaces
I have the following data, which I cannot change:
data = """
-5,-2   -52.565
-5,-1   -48.751
-5, 0   -47.498
-5, 1   -48.751   -
-5, 2   -52.565
"""
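I want to split these columns into two lists, i.e.:
list1 = ['-5,-2','-5,-1','-5, 0','-5, 1','-5, 2']
list2 = ['-52.565','-48.751','-47.498','-48.751','-52.565']
For now, I am interested in splitting each line correctly:
lines = [l for l in data.splitlines()]
print(lines[2].split())
print(lines[3].split())
['-5,-1', '-48.751']
['-5,', '0', '-47.498']
As you can see, lines[3] is not split correctly, because there is a space between '-5,' and '0'. To fix this, I tried a different split. It gets the first column entry '-5, 0' right, but it also adds an empty string at the end of the list:
['-5, 0', '-47.498', '']
How can I solve this? Maybe there is a better way to split the lines.
Edit:
If I use
print(re.split(r'\s{2,}', lines[3], maxsplit=1))
I get:
['-5, 0', '-47.498']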
Try stripping the line before splitting:
print(re.split(r'\s{2,}', lines[3].strip(), maxsplit=1))
# or
print(re.split(r'\s{2,}', lines[3].strip()))
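To see why stripping matters, here is a minimal sketch. The sample line is an assumption: it uses three spaces between the columns and two trailing spaces, since the exact spacing of the original data is not visible here.

```python
import re

# Hypothetical sample line: three spaces between columns, two trailing spaces.
line = "-5, 0   -47.498  "

# Without strip(), the trailing run of spaces is treated as one more separator,
# which leaves an empty string at the end of the result.
unstripped = re.split(r"\s{2,}", line)

# Stripping first removes the trailing run, so only the real separator matches.
stripped = re.split(r"\s{2,}", line.strip())

print(unstripped)  # ['-5, 0', '-47.498', '']
print(stripped)    # ['-5, 0', '-47.498']
```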
Use re to split the lines into columns, then use the zip function to separate the columns into two groups:
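import re
data = """
-5,-2   -52.565
-5,-1   -48.751
-5, 0   -47.498
-5, 1   -48.751   -
-5, 2   -52.565
"""
# Split every non-blank line on runs of two or more whitespace characters.
columns = [re.split(r'\s{2,}', line.strip()) for line in data.splitlines() if line.strip()]
print(columns)
# zip(*columns) transposes the rows into columns and stops at the shortest row.
first, second = map(list, zip(*columns))
print(first)
print(second)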
Output:
[['-5,-2', '-52.565'], ['-5,-1', '-48.751'], ['-5, 0', '-47.498'], ['-5, 1', '-48.751', '-'], ['-5, 2', '-52.565']]
['-5,-2', '-5,-1', '-5, 0', '-5, 1', '-5, 2']
['-52.565', '-48.751', '-47.498', '-48.751', '-52.565']
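The same idea can be generalized to "at least n spaces", as in the title. The sketch below is only an illustration: the helper name `split_columns`, its `min_spaces` parameter, and the sample spacing are assumptions, not part of the answer above.

```python
import re

def split_columns(text, min_spaces=2):
    """Split each non-blank line of text on runs of at least min_spaces spaces."""
    # The quantifier {n,} means "n or more" of the preceding character (a space).
    separator = re.compile(r" {%d,}" % min_spaces)
    return [separator.split(line.strip()) for line in text.splitlines() if line.strip()]

# Hypothetical sample with three spaces between the columns.
sample = "-5,-2   -52.565\n-5, 0   -47.498\n"
rows = split_columns(sample, min_spaces=3)
first, second = map(list, zip(*rows))

print(first)   # ['-5,-2', '-5, 0']
print(second)  # ['-52.565', '-47.498']
```

Because the single space inside '-5, 0' is shorter than the required run, it is never treated as a separator.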
Try this:
data = """
-5,-2   -52.565
-5,-1   -48.751
-5, 0   -47.498
-5, 1   -48.751   -
-5, 2   -52.565
"""
lines = [line.strip() for line in data.splitlines()]
list1 = []
list2 = []
for line in lines:
    if not line:
        continue  # skip blank lines
    # Split on a run of exactly three spaces.
    columns = line.split(' ' * 3)
    list1.append(columns[0])
    list2.append(columns[1])
print(list1)
print(list2)
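Note that splitting on a fixed string of three spaces relies on every gap being exactly three spaces wide. The sketch below uses a hypothetical line with a five-space gap (an assumption, not taken from the question) to compare this with a quantified regex.

```python
import re

# Hypothetical line whose gap is five spaces wide instead of three.
wide = "-5, 0     -47.498"

# str.split with a fixed separator consumes exactly three spaces per split,
# so the two leftover spaces stay glued to the second field.
fixed = wide.split(" " * 3)

# A quantified pattern consumes the whole run, however wide it is.
flexible = re.split(r" {3,}", wide)

print(fixed)     # ['-5, 0', '  -47.498']
print(flexible)  # ['-5, 0', '-47.498']
```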
If you want to do it the non-regex, over-complicated and much more verbose way, you can do this:
data = """
-5,-2   -52.565
-5,-1   -48.751
-5, 0   -47.498
-5, 1   -48.751   -
-5, 2   -52.565
"""
first = list()
second = list()
for raw_line in data.split("\n"):
    # Splitting on two spaces leaves empty or space-padded pieces wherever
    # the gap is wider; strip each piece and drop the empty ones.
    fields = [piece.strip() for piece in raw_line.split("  ") if piece.strip()]
    if len(fields) < 2:
        continue  # blank line
    # Keep only the first two fields so the stray "-" on the fourth row is ignored.
    first.append(fields[0])
    second.append(fields[1])
print(first) # ['-5,-2', '-5,-1', '-5, 0', '-5, 1', '-5, 2']
print(second) # ['-52.565', '-48.751', '-47.498', '-48.751', '-52.565']
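Explanation of the regular expressions (sorry about the formatting, I am typing this on my phone)
re.split takes an additional parameter, maxsplit. Set it to 1 and you will always split into two columns. I suggest then stripping the trailing whitespace.
@Arthur.V Ah, that is what I was going to ask you. OK, I will try it. :)
There is whitespace after the second number, which is why you get the empty string. You can use .strip() to remove the whitespace before and after the numbers.
@mx0 True, but unfortunately I cannot change the data :(
Just keep the first two items of the list returned by re.split.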
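The two suggestions from the comments (maxsplit=1, or keeping only the first two items) behave differently on a row with a stray third field. A minimal sketch, assuming a hypothetical row with three-space gaps and trailing spaces:

```python
import re

# Hypothetical row with a stray third field and trailing spaces.
row = "-5, 1   -48.751   -  "

# Option 1: maxsplit=1 splits only at the first wide gap; everything after it
# stays in the second item, so it still contains the stray "-".
two_parts = re.split(r"\s{2,}", row.strip(), maxsplit=1)

# Option 2: split fully, then keep only the first two items.
first_two = re.split(r"\s{2,}", row.strip())[:2]

print(two_parts)  # ['-5, 1', '-48.751   -']
print(first_two)  # ['-5, 1', '-48.751']
```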
import re
data = """
-5,-2   -52.565
-5,-1   -48.751
-5, 0   -47.498
-5, 1   -48.751   -
-5, 2   -52.565
"""
# First column: a negative integer, a comma, a space or minus sign, more digits.
patternA = re.compile(r'(-\d+,[\s-]\d+)')
# findall already returns a list of the captured groups.
listA = patternA.findall(data)
# Second column: a negative decimal number preceded by whitespace.
patternB = re.compile(r'\s(-\d+\.\d+)')
listB = patternB.findall(data)
print(listA)
print(listB)
In re.compile:
\d+ - Number with one or more digits
\s - Matches a single whitespace character
[\s-] - Matches a whitespace character or -
( ) - Capturing group; this part is returned by findall
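As a variation, both columns can be captured by one pattern with two groups. This is only a sketch: the combined pattern and the sample spacing are assumptions, not part of the answer above.

```python
import re

# Hypothetical sample; the spacing between columns is assumed.
data = """
-5,-2   -52.565
-5, 0   -47.498
-5, 1   -48.751   -
"""

# Group 1: "-<digits>," then a space or minus sign, then digits.
# Group 2: a negative decimal number after at least one whitespace character.
row_pattern = re.compile(r"(-\d+,[\s-]\d+)\s+(-\d+\.\d+)")

# With two groups, findall returns a list of (group1, group2) tuples.
pairs = row_pattern.findall(data)
list_a = [a for a, _ in pairs]
list_b = [b for _, b in pairs]

print(list_a)  # ['-5,-2', '-5, 0', '-5, 1']
print(list_b)  # ['-52.565', '-47.498', '-48.751']
```

Matching both fields in one pass keeps each pair together, so a row that fails to match cannot shift the two lists out of step.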