Python: split a list of strings based on a specific word
I have a list containing file paths, for example:
path_list =['Animals/dog_00238/2D_rendering/131/view/full/0',
'Animals/dog_00238/2D_rendering/131/view/full/1',
'Animals/dog_00238/2D_rendering/131/view/full/2',
'Animals/dog_00238/2D_rendering/906271/view/full/1',
'Animals/dog_00239/2D_rendering/906271/view/full/2',
'Animals/dog_00239/2D_rendering/965947/view/full/0',
'Animals/dog_00239/2D_rendering/965947/view/full/1',
'Animals/dog_00239/2D_rendering/965947/view/full/2',  # <-- last time dog_00239 appears: first breakpoint
'Animals/dog_00240/2D_rendering/965947/view/full/3',
'Animals/dog_00240/2D_rendering/982160/view/full/0',
'Animals/dog_00240/2D_rendering/982160/view/full/1',
'Animals/dog_00241/2D_rendering/982160/view/full/2',
'Animals/dog_00241/2D_rendering/141/view/full/0',
'Animals/dog_00241/2D_rendering/141/view/full/1',  # <-- last time dog_00241 appears: second breakpoint
'Animals/dog_00242/2D_rendering/1881/view/full/3',
'Animals/dog_00242/2D_rendering/1881/view/full/4',
'Animals/dog_00242/2D_rendering/2487/view/full/0',
'Animals/dog_00242/2D_rendering/2487/view/full/1',]
So far I have tried writing a for loop that breaks when an if condition is met, to see what I would get, for example:
new_paths = []
first_breakpoint = 'dog_00239'
for i in range(len(path_list)):
    new_paths.append(path_list[i])
    if first_breakpoint in path_list:
        break
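This splits the list and keeps a small part of it, but new_paths ends up holding the elements after the breakpoint I set rather than the ones before it. I also have not set any condition that uses the last occurrence of the word in the list as the breakpoint, because I do not know how to do that; I assume that even if the code above worked, it would split the list the first time it sees 'dog_00239'.
I also tried: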
new_paths = [x for x in path_list if first_breakpoint in x]
But this returns all the items that contain 'dog_00239', whereas what I want is everything up to that point.
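Thanks in advance for your time. I really appreciate the help, and I apologize for any mistakes or if my question is unclear; I am new both to this site and to Python.

You can use itertools.groupby:
import itertools
import re
# This is the function in which you define your ranges
def get_breakpoint_id(x):
    if (x ...

For an alternative solution that does not involve importing any packages, you can use nested for loops: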
breakpoints = ["dog_00240", "dog_00242"]
path_list = ['Animals/dog_00238/2D_rendering/131/view/full/0',
'Animals/dog_00238/2D_rendering/131/view/full/1',
'Animals/dog_00238/2D_rendering/131/view/full/2',
'Animals/dog_00238/2D_rendering/906271/view/full/1',
'Animals/dog_00239/2D_rendering/906271/view/full/2',
'Animals/dog_00239/2D_rendering/965947/view/full/0',
'Animals/dog_00239/2D_rendering/965947/view/full/1',
'Animals/dog_00239/2D_rendering/965947/view/full/2',
'Animals/dog_00240/2D_rendering/965947/view/full/3',
'Animals/dog_00240/2D_rendering/982160/view/full/0',
'Animals/dog_00240/2D_rendering/982160/view/full/1',
'Animals/dog_00241/2D_rendering/982160/view/full/2',
'Animals/dog_00241/2D_rendering/141/view/full/0',
'Animals/dog_00241/2D_rendering/141/view/full/1',
'Animals/dog_00242/2D_rendering/1881/view/full/3',
'Animals/dog_00242/2D_rendering/1881/view/full/4',
'Animals/dog_00242/2D_rendering/2487/view/full/0',
'Animals/dog_00242/2D_rendering/2487/view/full/1',]
lists = []
new_list = []
for file_path in path_list:
    for point in breakpoints:
        # If the path contains a breakpoint, close the current chunk
        if point in file_path:
            lists.append(new_list)
            new_list = []
            # Remove the breakpoint so it only triggers once
            breakpoints.remove(point)
            # Stop scanning: the list was just modified
            break
    # Add the path to the current chunk (the new one if a breakpoint was just hit)
    new_list.append(file_path)
# Add the final chunk to lists
lists.append(new_list)
first = lists[0]
second = lists[1]
third = lists[2]
print(first, second, third)
# Prints the three lists below (wrapped here for readability):
# ['Animals/dog_00238/2D_rendering/131/view/full/0',
#  'Animals/dog_00238/2D_rendering/131/view/full/1',
#  'Animals/dog_00238/2D_rendering/131/view/full/2',
#  'Animals/dog_00238/2D_rendering/906271/view/full/1',
#  'Animals/dog_00239/2D_rendering/906271/view/full/2',
#  'Animals/dog_00239/2D_rendering/965947/view/full/0',
#  'Animals/dog_00239/2D_rendering/965947/view/full/1',
#  'Animals/dog_00239/2D_rendering/965947/view/full/2']
# ['Animals/dog_00240/2D_rendering/965947/view/full/3',
#  'Animals/dog_00240/2D_rendering/982160/view/full/0',
#  'Animals/dog_00240/2D_rendering/982160/view/full/1',
#  'Animals/dog_00241/2D_rendering/982160/view/full/2',
#  'Animals/dog_00241/2D_rendering/141/view/full/0',
#  'Animals/dog_00241/2D_rendering/141/view/full/1']
# ['Animals/dog_00242/2D_rendering/1881/view/full/3',
#  'Animals/dog_00242/2D_rendering/1881/view/full/4',
#  'Animals/dog_00242/2D_rendering/2487/view/full/0',
#  'Animals/dog_00242/2D_rendering/2487/view/full/1']
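Note that I set each breakpoint to the first path at which to break, rather than to the path whose last occurrence should end a chunk.

A simple and compact solution: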
breakpoints_numbers = [0, 239, 241, 999]
breakpoints_paths = [f'Animals/dog_00{b}' for b in breakpoints_numbers]
# Pair consecutive breakpoints into (lower, upper] ranges, then drop empty sublists
ranges = zip(breakpoints_paths[:-1], breakpoints_paths[1:])
pathslist = [sublist for sublist in [[p for p in path_list if p > b[0] and p <= b[1]] for b in ranges] if sublist != []]
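One thing to watch with the comprehension above: it compares whole path strings, and a full path such as 'Animals/dog_00239/2D_rendering/...' sorts after the bare prefix 'Animals/dog_00239', so the dog_00239 paths fall into the second range rather than the first. A possible way around this is to compare the numeric folder id instead. The following is only a sketch; the helper name split_by_ranges and the shortened sample list are assumptions made for illustration, not part of the answer.

```python
import re

def split_by_ranges(paths, upper_bounds):
    """Put each path in the first chunk whose (inclusive) upper bound
    is >= the path's dog id. Paths above the last bound are dropped."""
    chunks = [[] for _ in upper_bounds]
    for p in paths:
        # Assumes every path contains a 'dog_<digits>' folder name
        dog_id = int(re.search(r'dog_(\d+)', p).group(1))
        for i, bound in enumerate(upper_bounds):
            if dog_id <= bound:
                chunks[i].append(p)
                break
    # Drop chunks that received no paths
    return [c for c in chunks if c]

# Shortened sample taken from the question's list
sample_paths = [
    'Animals/dog_00238/2D_rendering/131/view/full/0',
    'Animals/dog_00239/2D_rendering/965947/view/full/2',
    'Animals/dog_00240/2D_rendering/982160/view/full/1',
    'Animals/dog_00241/2D_rendering/141/view/full/1',
    'Animals/dog_00242/2D_rendering/1881/view/full/3',
]
chunks = split_by_ranges(sample_paths, [239, 241, 999])
```

With these bounds, the dog_00238 and dog_00239 paths end up together in the first chunk, dog_00240 and dog_00241 in the second, and dog_00242 in the third.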
I just tried your solution, but the exact example you posted gives this result: [['Animals/dog_00238/2D_rendering/131/view/full/0', 'Animals/dog_00238/2D_rendering/131/view/full/1', 'Animals/dog_00238/2D_rendering/131/view/full/2', 'Animals/dog_00239/2D_rendering/906271/view/full/2', 'Animals/dog_00242/2D_rendering/1881/view/full/3', 'Animals/dog_00242/2D_rendering/1881/view/full/4', 'Animals/dog_00242/2D_rendering/2487/view/full/0', 'Animals/dog_00242/2D_rendering/2487/view/full/1'], [], [], [], []]. So it creates one new list containing all the elements, with the last two being empty.
You are right about the empty sublists. To avoid them, I have updated the answer with a cleanup step.
The new version adds "sublist for sublist in [...] if sublist != []". It may look rather unnecessary, but it removes not only the trailing empty sublists but also any empty ones in the middle.
Thanks for the update, but pathslist = [sublist for sublist in [[p for p in path_list if p>b[0] and p ...
I just tried your solution and it does work exactly as expected, but I have never worked with itertools.groupby, so I would like to ask you a few questions. First, where are the new lists stored? How can I access them and do other things with them afterwards? Also, in int(re.findall(r'dog_(\d+)/', x)[0]), I understand the expression up to dog_(\d+)/, but how does the ", x" part after it work? Thanks again.
You can store the output in a new list: output = [list(g) for g in itertools.groupby(path_list, key= ...
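To make the itertools.groupby discussion above concrete, here is a minimal sketch. It is not the original answer's code, which is cut off in this page; the names breakpoint_ids and chunk_index, the use of bisect, and the shortened sample list are all assumptions. In re.findall(pattern, x), the second argument x is simply the string being searched; the call returns a list of the captured groups, and [0] picks the first one.

```python
import bisect
import itertools
import re

# Assumed breakpoints: the last dog id that belongs to each chunk
breakpoint_ids = [239, 241]

def chunk_index(path):
    """Map a path to the index of the chunk it belongs to."""
    # findall returns the captured digits, e.g. ['00239']; take the first
    dog_id = int(re.findall(r'dog_(\d+)', path)[0])
    # ids <= 239 -> 0, ids 240..241 -> 1, anything larger -> 2
    return bisect.bisect_left(breakpoint_ids, dog_id)

# Shortened sample taken from the question's list
sample_paths = [
    'Animals/dog_00238/2D_rendering/131/view/full/0',
    'Animals/dog_00239/2D_rendering/965947/view/full/2',
    'Animals/dog_00240/2D_rendering/982160/view/full/0',
    'Animals/dog_00241/2D_rendering/141/view/full/1',
    'Animals/dog_00242/2D_rendering/2487/view/full/1',
]

# groupby yields (key, group) pairs for runs of consecutive equal keys;
# each group is an iterator, so it is turned into a list to keep it
output = [list(group) for _, group in itertools.groupby(sample_paths, key=chunk_index)]
```

The chunks are then ordinary lists stored in output, so output[0], output[1], and so on can be used like any other list. Because groupby only merges consecutive items with the same key, this relies on the paths already being ordered by dog id, as they are in the question.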
[['Animals/dog_00238/2D_rendering/131/view/full/0',
'Animals/dog_00238/2D_rendering/131/view/full/1',
'Animals/dog_00238/2D_rendering/131/view/full/2',
'Animals/dog_00238/2D_rendering/906271/view/full/1',
'Animals/dog_00239/2D_rendering/906271/view/full/2',
'Animals/dog_00239/2D_rendering/965947/view/full/0',
'Animals/dog_00239/2D_rendering/965947/view/full/1',
'Animals/dog_00239/2D_rendering/965947/view/full/2'],
['Animals/dog_00240/2D_rendering/965947/view/full/3',
'Animals/dog_00240/2D_rendering/982160/view/full/0',
'Animals/dog_00240/2D_rendering/982160/view/full/1',
'Animals/dog_00241/2D_rendering/982160/view/full/2',
'Animals/dog_00241/2D_rendering/141/view/full/0',
'Animals/dog_00241/2D_rendering/141/view/full/1'],
['Animals/dog_00242/2D_rendering/1881/view/full/3',
'Animals/dog_00242/2D_rendering/1881/view/full/4',
'Animals/dog_00242/2D_rendering/2487/view/full/0',
'Animals/dog_00242/2D_rendering/2487/view/full/1']]
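The nested list above matches the split described in the question: one cut after the last dog_00239 path and one after the last dog_00241 path. A minimal sketch that cuts after the last occurrence of each marker, without any imports, could look like the following. The function name split_after_last and the shortened sample list are assumptions made for illustration.

```python
def split_after_last(paths, markers):
    """Cut `paths` right after the last element containing each marker.
    Markers are expected in the order they appear in the list."""
    chunks = []
    start = 0
    for marker in markers:
        # Indices of every path that contains this marker
        hits = [i for i, p in enumerate(paths) if marker in p]
        # Skip markers that are absent or lie before the current position
        if not hits or hits[-1] < start:
            continue
        end = hits[-1] + 1
        chunks.append(paths[start:end])
        start = end
    # Whatever follows the final cut forms the last chunk
    if start < len(paths):
        chunks.append(paths[start:])
    return chunks

# Shortened sample taken from the question's list
sample_paths = [
    'Animals/dog_00238/2D_rendering/131/view/full/0',
    'Animals/dog_00239/2D_rendering/965947/view/full/0',
    'Animals/dog_00239/2D_rendering/965947/view/full/2',
    'Animals/dog_00240/2D_rendering/982160/view/full/0',
    'Animals/dog_00241/2D_rendering/141/view/full/1',
    'Animals/dog_00242/2D_rendering/1881/view/full/3',
]
result = split_after_last(sample_paths, ['dog_00239', 'dog_00241'])
```

Unlike a loop that breaks on the first match, this looks up the last index at which each marker appears, so every path up to and including that index stays in the same chunk.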