Python: an elegant and code-saving way to split a string into a list

I have a string:

mydata
'POINT (558750.3267372231900000 6361788.0628051758000000)'
I am looking for a concise (code-saving) way to convert it into the form

(g, (x,y)) 
where:

g = geometry type (POINT)
x = x coordinate
y = y coordinate
I am currently using

mydata.split(" ")
['POINT', '(558750.3267372231900000', '6361788.0628051758000000)']
but after that I need several more lines of code to get x and y:

v = mydata.split()
g = v[0]
x = float(v[1].strip('('))
y = float(v[2].strip(')'))
(g, (x, y))

Code-saving, yes; elegant, not so much. Using regex:

In [58]: import re

In [59]: g, (x, y) = (re.findall(r"[A-Za-z]+", mydata)[0],
   ....:              [float(v) for v in re.findall(r"[\d.]+", mydata)])

In [60]: g
Out[60]: 'POINT'

In [61]: x
Out[61]: 558750.3267372232

In [62]: y
Out[62]: 6361788.062805176
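
If malformed input should be rejected rather than silently half-parsed, a single anchored pattern is another option. This is only a minimal sketch: the helper name parse_point and the pattern are illustrative assumptions, not part of the answer above, and it assumes plain decimal coordinates without exponent notation.

```python
import re

# Hypothetical helper: one anchored pattern with named groups.
# Assumes plain decimal coordinates (no exponent notation).
POINT_RE = re.compile(
    r"(?P<g>[A-Za-z]+)\s*\(\s*"
    r"(?P<x>-?\d+(?:\.\d+)?)\s+"
    r"(?P<y>-?\d+(?:\.\d+)?)\s*\)"
)

def parse_point(s):
    """Return (geometry, (x, y)) or raise ValueError if s does not match."""
    m = POINT_RE.fullmatch(s.strip())
    if m is None:
        raise ValueError("not a two-coordinate point: %r" % (s,))
    return m.group("g"), (float(m.group("x")), float(m.group("y")))

mydata = 'POINT (558750.3267372231900000 6361788.0628051758000000)'
print(parse_point(mydata))  # ('POINT', (558750.3267372232, 6361788.062805176))
```

Because the pattern is matched against the whole string, something like 'POINT (1 2 3)' raises ValueError instead of returning the first two numbers.
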
Use str.strip() and str.split().

Step by step:

>>> s = 'POINT (558750.3267372231900000 6361788.0628051758000000)'
>>> word, points = s.split(None, 1)
>>> word
'POINT'
>>> points
'(558750.3267372231900000 6361788.0628051758000000)'
>>> points = points.strip('()').split()
>>> points
['558750.3267372231900000', '6361788.0628051758000000']
>>> x, y = (float(i) for i in points)
>>> x
558750.3267372232
>>> y
6361788.062805176

Regex can save you some typing here:

In [1]: import re

In [2]: def nice_tuple(s):
   ...:     # Split on runs of spaces and parentheses; the final ')' yields an empty last field
   ...:     g, x, y, _ = re.split(r'[ ()]+', s)
   ...:     return g, tuple(map(float, (x, y)))
   ...:

In [3]: nice_tuple('POINT (558750.3267372231900000 6361788.0628051758000000)')
Out[3]: ('POINT', (558750.3267372232, 6361788.062805176))

If your data is always in exactly this format, it's easy:

>>> def parse_data(d):
...     geom, xs, ys = d.split()
...     # Drop the leading '(' from xs and the trailing ')' from ys
...     return (geom, (float(xs[1:]), float(ys[:-1])))

>>> mydata
'POINT (558750.3267372231900000 6361788.0628051758000000)'
>>> parse_data(mydata)
('POINT', (558750.32673722319, 6361788.0628051758))

This is probably the shortest; I don't know about elegant.

I would use .translate and .split:

In [35]: mydata='POINT (558750.3267372231900000 6361788.0628051758000000)'

In [39]: data=mydata.split(None,1)

In [40]: data
Out[40]: ['POINT', '(558750.3267372231900000 6361788.0628051758000000)']

In [41]: g,[x,y]=data[0], map(lambda x: float(x.strip("()")), data[1].split())

In [42]: g,x,y
Out[42]: ('POINT', 558750.3267372232, 6361788.062805176)

In [126]: mydata = 'POINT (558750.3267372231900000 6361788.0628051758000000)'

In [127]: # Python 3 form; on Python 2 this was mydata.translate(None, '()')
     ...: mysplitdata = mydata.translate(str.maketrans('', '', '()')).split()

In [128]: mysplitdata
Out[128]: ['POINT', '558750.3267372231900000', '6361788.0628051758000000']

In [129]: g,x,y = mysplitdata[0],float(mysplitdata[1]),float(mysplitdata[2])

In [130]: outdata = (g, (x,y))

In [131]: outdata
Out[131]: ('POINT', (558750.3267372232, 6361788.062805176))

Recently I built an application in Python where I did almost the same thing. Below is the class I created for parsing WKT files.


Hope you find it useful. For usage, see line 136. You can also use the class to read linestrings and multilinestrings.
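
For a rough idea of what parsing more than one WKT geometry type can look like, here is a minimal sketch. It is not the class referred to above: the function name parse_wkt is hypothetical, and it assumes only two-dimensional POINT and LINESTRING strings (nested types such as MULTILINESTRING are deliberately not handled).

```python
def parse_wkt(s):
    """Parse 'POINT (x y)' or 'LINESTRING (x y, x y, ...)' into (name, coords)."""
    # Everything before the first '(' is the geometry name.
    name, _, body = s.strip().partition('(')
    name = name.strip().upper()
    if not body.endswith(')'):
        raise ValueError("missing parentheses: %r" % (s,))
    # Each comma-separated chunk is one whitespace-separated coordinate pair.
    coords = [tuple(float(v) for v in pair.split())
              for pair in body[:-1].split(',')]
    if name == 'POINT':
        return name, coords[0]
    if name == 'LINESTRING':
        return name, coords
    raise ValueError("unsupported geometry type: %r" % (name,))

print(parse_wkt('POINT (558750.3267372231900000 6361788.0628051758000000)'))
print(parse_wkt('LINESTRING (0 0, 1 1.5, 2 3)'))
```

A POINT comes back as a single (x, y) tuple and a LINESTRING as a list of such tuples; anything else raises ValueError.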

As for storing the data in a list as point objects: shapely provides a way to parse point strings, though you would still need to convert x and y to float.