
Python: split a string after every second occurrence of an unknown element


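I have a string holding a list of coordinates that represent polygons. In this list, every polygon has the same start and end coordinate. I need each polygon in a separate string (or list).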

17.17165756225586 -28.102264404296875,17.184370040893555 -28.200496673583984,17.1986083984375 -28.223613739013672,17.17165756225586 -28.102264404296875, 28.865726470947266 -28.761619567871094,28.80694007873535 -28.75750160217285,28.792499542236328 -28.706947326660156, 28.865726470947266 -28.761619567871094

From this simple example I need two elements:

The first is
17.17165756225586 -28.102264404296875,17.184370040893555 -28.200496673583984,17.1986083984375 -28.223613739013672,17.17165756225586 -28.102264404296875

The second is
28.865726470947266 -28.761619567871094,28.80694007873535 -28.75750160217285,28.792499542236328 -28.706947326660156,28.865726470947266 -28.761619567871094

The string may contain more polygons, and each one needs to be separated.
I can only use the standard Python library for this.

How about splitting the long string on every "," and putting the pieces into an array? Then run a for loop and do the following:

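// array holds the pieces of the long string after splitting on ","
intStart = 0;
arrPolygons = [];
for (i = 1; i < array.length; i++) {
    if (i > intStart && array[intStart] == array[i]) {
        string = "";
        for (j = intStart; j <= i; j++) {
            string += (j > intStart ? "," : "") + array[j];
        }
        arrPolygons.push(string);
        intStart = i + 1;
    }
}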
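
Since the question asks for Python, the index-based idea above might look roughly like the following sketch. The function name `split_by_start_index` and the short sample string are made up for illustration, and stripping whitespace around each pair is an added assumption about how the inconsistent separators should be handled.

```python
def split_by_start_index(s):
    """Split a comma-separated coordinate string into closed polygons."""
    # Assumption: whitespace around each pair is not significant.
    points = [p.strip() for p in s.split(",")]
    polygons = []
    start = 0  # index of the first point of the polygon being scanned
    for i in range(len(points)):
        # A polygon closes when a later point equals its first point.
        if i > start and points[i] == points[start]:
            polygons.append(",".join(points[start:i + 1]))
            start = i + 1
    return polygons


# Made-up sample with the same ", " inconsistency as the question's data.
sample = "1 1,2 2,3 3,1 1, 5 5,6 6,7 7, 5 5"
print(split_by_start_index(sample))
```

Slicing `points[start:i + 1]` replaces the inner loop, and any trailing points that never return to their start are simply dropped.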
Here is a fairly ugly but working solution that simply puts the obvious approach into code:

# Note that your string has inconsistent separators -- sometimes ',', sometimes ', '.
# I'm going to separate on `,` and not worry about it -- you need to work out
# what the correct separator is.
s = '17.17165756225586 -28.102264404296875,17.184370040893555 -28.200496673583984,17.1986083984375 -28.223613739013672,17.17165756225586 -28.102264404296875, 28.865726470947266 -28.761619567871094,28.80694007873535 -28.75750160217285,28.792499542236328 -28.706947326660156, 28.865726470947266 -28.761619567871094'

coordinates = s.split(',')

polygon = []
polygons = []

new = True

for coordinate in coordinates:
    polygon.append(coordinate)

    if new:
        start = coordinate
        new = False

    elif coordinate == start:
        polygons.append(polygon)
        polygon = []
        new = True

result = [",".join(polygon) for polygon in polygons]
print(result)

Out:
['17.17165756225586 -28.102264404296875,17.184370040893555 -28.200496673583984,17.1986083984375 -28.223613739013672,17.17165756225586 -28.102264404296875', ' 28.865726470947266 -28.761619567871094,28.80694007873535 -28.75750160217285,28.792499542236328 -28.706947326660156, 28.865726470947266 -28.761619567871094']

Output (the polygons list before it is joined into strings):

[['17.17165756225586 -28.102264404296875', '17.184370040893555 -28.200496673583984', '17.1986083984375 -28.223613739013672', '17.17165756225586 -28.102264404296875'], [' 28.865726470947266 -28.761619567871094', '28.80694007873535 -28.75750160217285', '28.792499542236328 -28.706947326660156', ' 28.865726470947266 -28.761619567871094']]

Input data

lst = [
    '17.17165756225586 -28.102264404296875',
    '17.184370040893555 -28.200496673583984',
    ...
    '17.1986083984375 -28.223613739013672',
    '17.17165756225586 -28.102264404296875',
    '28.865726470947266 -28.761619567871094',
    ...
    '28.80694007873535 -28.75750160217285',
    '28.792499542236328 -28.706947326660156',
    '28.865726470947266 -28.761619567871094',
]

lst1 = []
for cord in lst:
    if cord not in lst1:
        lst1.append(cord)
print(lst1)
Output:

[
    '17.17165756225586 -28.102264404296875',
    '17.184370040893555 -28.200496673583984',
    '17.1986083984375 -28.223613739013672',
    '28.865726470947266 -28.761619567871094',
    '28.80694007873535 -28.75750160217285',
    '28.792499542236328 -28.706947326660156',
]

Since your input is already a string (and your expected result too?), you can try this super lazy solution using a regular expression with a backreference: `(([^,]+).*\2)`. Here, `[^,]+` is the first coordinate pair, `.*` matches the other coordinate pairs, and `\2` is the first coordinate pair again.

>>> import re
>>> s = '17.17165756225586 -28.102264404296875,17.184370040893555 -28.200496673583984,17.1986083984375 -28.223613739013672,17.17165756225586 -28.102264404296875, 28.865726470947266 -28.761619567871094,28.80694007873535 -28.75750160217285,28.792499542236328 -28.706947326660156, 28.865726470947266 -28.761619567871094'
>>> re.findall(r"(([^,]+).*\2)", s)
[('17.17165756225586 -28.102264404296875,17.184370040893555 -28.200496673583984,17.1986083984375 -28.223613739013672,17.17165756225586 -28.102264404296875',
  '17.17165756225586 -28.102264404296875'),
 (' 28.865726470947266 -28.761619567871094,28.80694007873535 -28.75750160217285,28.792499542236328 -28.706947326660156, 28.865726470947266 -28.761619567871094',
  ' 28.865726470947266 -28.761619567871094')]
Or use `finditer` and call `group()` on each match to get a list of strings directly:

>>> [m.group() for m in re.finditer(r"(([^,]+).*\2)", s)]
['17.17165756225586 -28.102264404296875,17.184370040893555 -28.200496673583984,17.1986083984375 -28.223613739013672,17.17165756225586 -28.102264404296875',
 ' 28.865726470947266 -28.761619567871094,28.80694007873535 -28.75750160217285,28.792499542236328 -28.706947326660156, 28.865726470947266 -28.761619567871094']
With some post-processing you can get actual lists of number pairs (taking element `[0]` of each `findall` result; with `finditer`, drop the `[0]`).
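
The post-processing snippet itself is not present in this copy of the answer. One way it might look is sketched below; the helper name `to_pairs` and the short sample string are assumptions made for illustration, not the answerer's original code.

```python
import re

# Made-up sample in the same "x y,x y,..." format as the question.
s = "1.5 -2.5,3.0 -4.0,1.5 -2.5, 7.0 8.0,9.0 10.0, 7.0 8.0"


def to_pairs(polygon_string):
    """Turn 'x y,x y,...' into a list of (x, y) float tuples."""
    # str.split() with no argument also swallows the stray leading spaces.
    return [tuple(map(float, pair.split())) for pair in polygon_string.split(",")]


# findall returns (whole match, first pair) tuples, so element [0] is the polygon.
polygons = [to_pairs(match[0]) for match in re.findall(r"(([^,]+).*\2)", s)]
print(polygons)
```

With `finditer`, the same helper would be applied to `m.group()` instead of `match[0]`.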


This may not be the fastest solution for longer strings, but I haven't timed it.
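
For anyone who wants to time it, a rough comparison could be set up along these lines. Everything here is an illustrative assumption: the function names, the synthetic data (100 triangles with zero-padded coordinates), and the repeat count. No timings are quoted because they depend on the machine and the input.

```python
import re
import timeit


def split_regex(s):
    # The backreference approach from this answer.
    return [m.group() for m in re.finditer(r"(([^,]+).*\2)", s)]


def split_loop(s):
    # A plain scan: close the polygon when its first point shows up again.
    polygons, current = [], []
    for point in s.split(","):
        current.append(point)
        if len(current) > 1 and point == current[0]:
            polygons.append(",".join(current))
            current = []
    return polygons


# Synthetic input: 100 triangles, each with its own start/end point.
s = ",".join(
    f"{i:03d}.0 {i:03d}.0,{i:03d}.5 0.0,0.0 {i:03d}.5,{i:03d}.0 {i:03d}.0"
    for i in range(1, 101)
)

assert split_regex(s) == split_loop(s)
for fn in (split_regex, split_loop):
    print(fn.__name__, timeit.timeit(lambda: fn(s), number=5))
```

The zero padding is deliberate: the backreference matches raw text, so a start pair such as `1 1` would also match inside `11 11`, and the greedy `.*` would then run past the real end of the polygon.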


I really like @newbie's concise solution. Here is a more verbose/readable one:

s = '17.17165756225586 -28.102264404296875,17.184370040893555 -28.200496673583984,17.1986083984375 -28.223613739013672,17.17165756225586 -28.102264404296875, 28.865726470947266 -28.761619567871094,28.80694007873535 -28.75750160217285,28.792499542236328 -28.706947326660156, 28.865726470947266 -28.761619567871094'
vertices = [c.strip() for c in s.split(",")] # split and clean vertex data

polygons = []           
current_polygon = None

for vertex in vertices:
    if current_polygon is None:             # start a new polygon
        current_polygon = [vertex]
    elif current_polygon[0] == vertex:      # conclude the current polygon
        current_polygon.append(vertex)
        polygons.append(current_polygon)
        current_polygon = None
    else:                                   # continue the current polygon
        current_polygon.append(vertex)

for polygon in polygons:    # print polygons
    print(",".join(polygon))

A recursive approach:

def split_polygons(s):
    if s == '':  # base case
        return []
    start, rest = s.split(',', 1)
    head, tail = map(lambda x: x.strip(', '), rest.split(start, 1))
    poly = start + ',' + head + ',' + start  # reconstruct the first polygon (strip() removed head's trailing comma)
    return [poly] + split_polygons(tail)
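
The recursion uses one stack frame per polygon, so a string with more polygons than Python's recursion limit (1000 by default) would raise RecursionError. A loop-based take on the same peel-off-the-front idea might look like the sketch below; the name `split_polygons_iter` and the sample string are made up for illustration.

```python
def split_polygons_iter(s):
    """Peel one polygon off the front of the string at a time, without recursion."""
    polygons = []
    rest = s.strip(", ")
    while rest:
        start, _, remainder = rest.partition(",")
        # head: points between the start and its repeat; tail: everything after.
        head, sep, tail = remainder.partition(start)
        if not sep:
            raise ValueError("polygon starting at %r is never closed" % start)
        middle = head.strip(", ")
        parts = [start, middle, start] if middle else [start, start]
        polygons.append(",".join(parts))
        rest = tail.strip(", ")
    return polygons


# Made-up sample with the same ", " inconsistency as the question's data.
sample = "1 1,2 2,3 3,1 1, 5 5,6 6,7 7, 5 5"
print(split_polygons_iter(sample))
```

Like the recursive version, this searches for the start pair as raw text, so it assumes the start pair does not occur as a substring of another pair.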


Here is another approach. It works for a string of any length, as long as it follows the input format you provided:

strng = "17.17165756225586,-28.102264404296875,17.184370040893555,-28.200496673583984,17.1986083984375,-28.223613739013672,17.17165756225586,-28.102264404296875,28.865726470947266,-28.761619567871094,28.80694007873535,-28.75750160217285,28.792499542236328,-28.706947326660156,28.865726470947266,-28.761619567871094"
# convert to a list of (x, y) tuples; list() is needed on Python 3, where zip() is lazy
l_tuple = list(zip(*[iter(strng.split(','))]*2))
# collect the index lists of points that occur more than once
l_index = []
for Tuple in l_tuple:
    x = [i for i, x in enumerate(l_tuple) if x == Tuple]
    if len(x) > 1:
        l_index.append(x)
# slice out one list per polygon; sorted() keeps the polygons in input order
New_list = []
for IND in sorted(set(map(tuple, l_index))):
    print(l_tuple[IND[0]:IND[1]+1])
    New_list.append(l_tuple[IND[0]:IND[1]+1])
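
The `zip(*[iter(...)]*2)` line is the least obvious part of this answer, so here is a small standalone sketch of what it does. The list `flat` and its values are made up for illustration.

```python
flat = "1,2,3,4,5,6".split(",")

# [iter(flat)] * 2 is a list holding the SAME iterator twice, so zip()
# pulls two consecutive items from it for every tuple it builds.
pairs = list(zip(*[iter(flat)] * 2))
print(pairs)  # [('1', '2'), ('3', '4'), ('5', '6')]

# An equivalent, more explicit spelling using slices.
pairs_sliced = list(zip(flat[0::2], flat[1::2]))
assert pairs == pairs_sliced
```

Note that this grouping only works when x and y are separated by commas as in `strng` above, not by spaces as in the question's string.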


Comments:

The points in bold are the identical (start/end) points of each polygon.

Oh, somebody please format it. I'm in a hurry.

I think the formatting is fine once the meaning of the bold is explained. Where is the difficulty: you read the first pair and scan until it appears again?

Yes, each pair is separated by a comma.

Why are you posting an answer in another language? If anything, it only confuses the question.

I don't think we can escape "ugly" on this one :)

This completely misses the target. You aren't creating coordinate pairs or splitting the list; you are just filtering it in a way that makes no sense.

Nice, I came up with r'([\-\d\.]+).*2)', 15 minutes too late. Is it possible to get a list of strings directly instead of a list of tuples of strings?

It seems to work; I'll have to test it on strings with more polygons.

@EricDuminil I tried non-capturing groups, but with findall this seems to be the only way. You could also use [m.group() for m in re.finditer(…)] just to get a list of strings.