Regex for svndumptool


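I have a large (30+ GB) legacy SVN repo that defines a large number of externals and needs to be cloned to a new server. Because the repo was originally created with a version of SVN earlier than 1.5, many of its externals definitions use absolute paths that reference the old server name. I want to strip all the absolute paths and make them relative so that the migration works.

I found svndumptool, and it works well on some of the externals, but I have not yet worked out a regular expression that handles the rest.

Below are examples of the six different kinds of externals definition I found in the repo by running the command:
svn propget --recursive svn:externals %REPODIR_FILE%/%REPO%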

CaseA https://svn.acme.com/svn/test/branches/project.x
CaseB -r 19 https://svn.acme.com/svn/test/branches/project.y
https://svn.acme.com/svn/test/branches/project.z CaseC
-r 20 https://svn.acme.com/svn/test/branches/project.z@20 CaseD
CaseE  https://svn.acme.com/svn/test/branches/project.x
CaseF -r21  https://svn.acme.com/svn/test/branches/project.y
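Note that CaseE is identical to CaseA except that there are two spaces before https.

Note that CaseF is almost identical to CaseB, except for the spacing between -r and the revision number and the double space before https.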

I am testing my regular expressions in a regex tester, and this is the expression I am currently using:

^(\S+) (|-r ?\d* ?)https:\/\/svn.acme.com(\S+)
This gives me:

Match 1
1.  CaseA
2.   
3.  /svn/test/branches/project.x
Match 2
1.  CaseB
2.  -r 19
3.  /svn/test/branches/project.y
I have not been able to find a regular expression that parses cases C and D into something like this:

Match 3
1.  /svn/test/branches/project.z
2.  
3.  CaseC
Match 4
1.  -r 20
2.  /svn/test/branches/project.z@20
3.  CaseD
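svndumptool seems to require that I separate the components of each externals definition so that it can reassemble them in the correct (SVN v1.5) syntax.


Any help from the REGEX gods would be greatly appreciated :-)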

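Since you are using Ruby, here are two options. Which other regex flavors do you have available on your machine, though?

First option (absolute paths and 3 matches): Demo


Second option (relative paths and multiple matches): Demo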
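For the URL-first forms (CaseC and CaseD), a single pattern with an optional revision prefix can split out the three pieces the question asks for. What follows is only a sketch in Python and is not one of the demo patterns; the name URL_FIRST and the group names rev, path, and dir are made up for illustration, and the host is the one from the question.

```python
import re

# Hypothetical pattern: optional "-r N" prefix, then the old host, the path
# (including any @peg suffix), and finally the local directory name.
URL_FIRST = re.compile(
    r"^(?P<rev>-r ?\d+)? ?https://svn\.acme\.com(?P<path>\S+)\s+(?P<dir>\S+)$"
)

lines = [
    "https://svn.acme.com/svn/test/branches/project.z CaseC",
    "-r 20 https://svn.acme.com/svn/test/branches/project.z@20 CaseD",
]

results = []
for line in lines:
    m = URL_FIRST.match(line)
    results.append((m.group("rev"), m.group("path"), m.group("dir")))

for rev, path, directory in results:
    print(rev, path, directory)
```

For CaseC the rev group is None, and for CaseD it holds "-r 20", which lines up with the Match 3 and Match 4 layout in the question.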

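Here is the set of commands that worked for me; hopefully this helps anyone who has to fix a broken SVN repo in the future. Remember, friends don't let friends use absolute externals.

In its first six iterations, this process reduced the list of externals from more than 30K definitions to only 30.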

:: List of types of externals we need to deal with
::   CaseA https://svn.acme.com/svn/test/branches/project.x
::   CaseB -r 19 https://svn.acme.com/svn/test/branches/project.y
::   https://svn.acme.com/svn/test/branches/project.z CaseC
::   -r 20 https://svn.acme.com/svn/test/branches/project.z@20 CaseD
::   CaseE  https://svn.acme.com/svn/test/branches/project.x
::   CaseF -r21  https://svn.acme.com/svn/test/branches/project.y

:: SVN Dump Tool
SET SVNDUMPTOOL=C:\support\svndumptool\v0.6.1\svndumptool.exe
SET REPODIR=D:\Repositories
SET REPODIR_FILE=file:///D:/Repositories
SET DUMPDIR=D:\Dumps
SET REPO=test
SET SVN="C:\Program Files (x86)\VisualSVN Server\bin\svn.exe"
SET SVNADMIN="C:\Program Files (x86)\VisualSVN Server\bin\svnadmin.exe"
SET CREATE=%SVNADMIN% create
SET LOAD=%SVNADMIN% load --ignore-uuid
SET DUMP=%SVNADMIN% dump

:: Get a list of the externals in the original repo
%SVN% propget --recursive svn:externals %REPODIR_FILE%/%REPO%>%DUMPDIR%\%REPO%.externals

:: Dump the repo
%DUMP% %REPODIR%\%REPO% > %DUMPDIR%\%REPO%.dump

:: Transform the repo
:: CaseA
%SVNDUMPTOOL% transform-prop svn:externals "^(\S+) https://svn.acme.com(\S+)" "\2 \1" %DUMPDIR%\%REPO%.dump %DUMPDIR%\%REPO%_A.dump
:: Delete the dump to save disk space, each dump file iteration is ~300GB
DEL %DUMPDIR%\%REPO%.dump
:: CaseB
%SVNDUMPTOOL% transform-prop svn:externals "^(\S+) (-r ?\d* ?)https://svn.acme.com(\S+)" "\2\3 \1" %DUMPDIR%\%REPO%_A.dump %DUMPDIR%\%REPO%_AB.dump
DEL %DUMPDIR%\%REPO%_A.dump
:: CaseC
%SVNDUMPTOOL% transform-prop svn:externals "^(\S*)https://svn.acme.com(\S*)" "\2\1" %DUMPDIR%\%REPO%_AB.dump %DUMPDIR%\%REPO%_ABC.dump
DEL %DUMPDIR%\%REPO%_AB.dump
:: CaseD
%SVNDUMPTOOL% transform-prop svn:externals "^(-r ?\d* ?)(\S+) https://svn.acme.com(\S+)" "\1\2 \3" %DUMPDIR%\%REPO%_ABC.dump %DUMPDIR%\%REPO%_ABCD.dump
DEL %DUMPDIR%\%REPO%_ABC.dump
:: CaseE
%SVNDUMPTOOL% transform-prop svn:externals "^(\S+)  https://svn.acme.com(\S+)" "\2 \1" %DUMPDIR%\%REPO%_ABCD.dump %DUMPDIR%\%REPO%_ABCDE.dump
DEL %DUMPDIR%\%REPO%_ABCD.dump
:: CaseF
%SVNDUMPTOOL% transform-prop svn:externals "^(\S+) (-r ?\d* ?)  https://svn.acme.com(\S+)" "\2\3 \1" %DUMPDIR%\%REPO%_ABCDE.dump %DUMPDIR%\%REPO%_ABCDEF.dump
DEL %DUMPDIR%\%REPO%_ABCDE.dump

:: Delete the old repo
RMDIR /Q /S %REPODIR%\%REPO%
:: Create a new clean repo
%CREATE% %REPODIR%\%REPO%
:: Load the fixed dump
%LOAD% %REPODIR%\%REPO% < %DUMPDIR%\%REPO%_ABCDEF.dump
:: Get the new list of externals
%SVN% propget --recursive svn:externals %REPODIR_FILE%/%REPO%>%DUMPDIR%\%REPO%_ABCDEF.externals
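Since each pass over the dump is slow, it can help to try the six pattern/replacement pairs on a few sample lines first. The sketch below does that with Python's re.sub. It assumes that svndumptool's transform-prop applies the pattern the way re.sub does, which is an assumption here and not something taken from its documentation, and that each property value holds a single definition. The names PASSES, SAMPLES, and run_passes are hypothetical.

```python
import re

# (pattern, replacement) pairs copied from the batch script, in pass order.
PASSES = [
    (r"^(\S+) https://svn.acme.com(\S+)", r"\2 \1"),                # CaseA
    (r"^(\S+) (-r ?\d* ?)https://svn.acme.com(\S+)", r"\2\3 \1"),    # CaseB
    (r"^(\S*)https://svn.acme.com(\S*)", r"\2\1"),                   # CaseC
    (r"^(-r ?\d* ?)(\S+) https://svn.acme.com(\S+)", r"\1\2 \3"),    # CaseD
    (r"^(\S+)  https://svn.acme.com(\S+)", r"\2 \1"),                # CaseE
    (r"^(\S+) (-r ?\d* ?)  https://svn.acme.com(\S+)", r"\2\3 \1"),  # CaseF
]

SAMPLES = [
    "CaseA https://svn.acme.com/svn/test/branches/project.x",
    "CaseB -r 19 https://svn.acme.com/svn/test/branches/project.y",
    "https://svn.acme.com/svn/test/branches/project.z CaseC",
    "-r 20 https://svn.acme.com/svn/test/branches/project.z@20 CaseD",
    "CaseE  https://svn.acme.com/svn/test/branches/project.x",
    "CaseF -r21  https://svn.acme.com/svn/test/branches/project.y",
]

def run_passes(line):
    """Apply every pass in order, as the chained dump files do."""
    for pattern, replacement in PASSES:
        line = re.sub(pattern, replacement, line)
    return line

for sample in SAMPLES:
    print(repr(run_passes(sample)))
```

Under these assumptions the first five samples come out in URL-first relative form, but the CaseF sample becomes "-r21/svn/test/branches/project.y CaseF" with no space after the revision, because the optional trailing space in group 2 is given up to satisfy the two literal spaces. Writing that replacement as "\2 \3 \1" would keep the revision and path apart, if the tool really behaves like re.sub.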

In case anyone here is using Python:

import re

test_externals ="""
CaseA https://svn.acme.com/svn/test/branches/project.x
CaseB -r 19 https://svn.acme.com/svn/test/branches/project.y
https://svn.acme.com/svn/test/branches/proje_9ct.z/123 CaseC1
https://svn.acme.com/svn/test/branches/proje_9ct.z/123   CaseC2
https://svn.acme.com/svn/test/branches/proje_9ct.z/123    CaseC3
https://svn.acme.com/svn/test/branches/project.zCaseC4
-r 20 https://svn.acme.com/svn/test/branches/project.z@20 CaseD1
-r27 https://svn.acme.com/svn/test/branches/project.z@27 CaseD2
-r37 https://svn.acme.com/svn/test/branches/project.z CaseD3
https://svn.acme.com/svn/test/branches/project.z@88 CaseD4
 -r 20 https://svn.acme.com/svn/test/branches/project.z@20 CaseD1
CaseE -r21  https://svn.acme.com/svn/test/branches/project.y
"""

# Keep '-' last inside each character class so it is a literal, not a range
pat_url    = r'(?P<url>https?://(?:[a-zA-Z0-9._-]+)(?:[a-zA-Z0-9._/-]+))'
pat_folder = r'(?P<folder>[a-zA-Z0-9/._-]+)'
pat_pegrev = r'(?:@(?P<peg_revision>\d+))'
pat_oprev  = r'(?:-r\s?(?P<op_rev>\d+))'

regex_externals = {
    'CaseA': re.compile(r'^\s*{folder}\s{url}$'.format(folder=pat_folder, url=pat_url)),
    'CaseB': re.compile(r'^\s*{folder}\s{oprev}\s{url}$'.format(folder=pat_folder, oprev=pat_oprev, url=pat_url)),
    'CaseC': re.compile(r'^\s*{url}\s{folder}$'.format(folder=pat_folder, url=pat_url)),
    'CaseD': re.compile(r'^\s*{oprev}?\s{url}{pegrev}?\s*{folder}$'.format(folder=pat_folder, oprev=pat_oprev, pegrev=pat_pegrev, url=pat_url)),
}

for r in regex_externals: print('%s: %s' %(r, regex_externals[r].pattern))


# Try every pattern against every sample line and show the captured groups
for case in test_externals.split('\n'):
    for pat in regex_externals:
        match = re.search(regex_externals[pat], case)
        if match:
            print('\n\n%s: %s' % (pat, case))
            for g in match.groups():
                print('\t%s' % g)
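Once the pieces are identified, putting them back together in the URL-first order is mostly bookkeeping. As a rough sketch, and not part of the script above, the function below splits a line on whitespace instead of using one regex per case, so "-r 21" and "-r21" and single or double spaces all take the same path. OLD_HOST and to_relative are hypothetical names, and the sketch assumes exactly one URL and one directory per line.

```python
import re

OLD_HOST = "https://svn.acme.com"  # prefix to strip, taken from the question

def to_relative(line, old_host=OLD_HOST):
    """Rewrite one externals definition as '[-r N ]/path[@peg] dir'.

    Returns None when the line does not contain exactly one URL on the
    old host and exactly one directory token.
    """
    tokens = line.split()
    rev = None
    rest = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "-r" and i + 1 < len(tokens):
            rev = tokens[i + 1]          # '-r 19' form
            i += 2
        elif re.fullmatch(r"-r\d+", tok):
            rev = tok[2:]                # '-r21' form
            i += 1
        else:
            rest.append(tok)
            i += 1
    urls = [t for t in rest if t.startswith(old_host)]
    dirs = [t for t in rest if not t.startswith(old_host)]
    if len(urls) != 1 or len(dirs) != 1:
        return None                      # a shape this sketch does not handle
    path = urls[0][len(old_host):]
    prefix = "-r {} ".format(rev) if rev else ""
    return "{}{} {}".format(prefix, path, dirs[0])

print(to_relative("CaseA https://svn.acme.com/svn/test/branches/project.x"))
print(to_relative("CaseF -r21  https://svn.acme.com/svn/test/branches/project.y"))
```

Lines such as the CaseC4 sample, where the URL and directory are fused into one token, return None so they can be reviewed by hand.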