How to add a ; after a given occurrence using Python 3
I have a txt file that contains the following:
CREATE EXTERNAL TABLE `table1`(
`tab_id bigint COMMENT 'The unique identifier of thetable')
ROW FORMAT SERDE
*
STORED AS INPUTFORMAT
*
OUTPUTFORMAT
*
LOCATION
*
TBLPROPERTIES (
'transient_lastDdlTime'='1556u3ehw27')
CREATE EXTERNAL TABLE `aud2`(
`application_id` bigint COMMENT 'Unique Id that represents each application created')
COMMENT 'contains application level details. every application will have one entry'
ROW FORMAT SERDE
*
STORED AS INPUTFORMAT
*
OUTPUTFORMAT
*
LOCATION
*
TBLPROPERTIES (
'transient_lastDdlTime'='1trh7')
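I am trying to write a program that inserts a ; specifically after the closing parenthesis of each TBLPROPERTIES clause. So the output should look like this: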
CREATE EXTERNAL TABLE `table1`(
`tab_id bigint COMMENT 'The unique identifier of thetable')
ROW FORMAT SERDE
*
STORED AS INPUTFORMAT
*
OUTPUTFORMAT
*
LOCATION
*
TBLPROPERTIES (
'transient_lastDdlTime'='1556u3ehw27');
CREATE EXTERNAL TABLE `aud2`(
`application_id` bigint COMMENT 'Unique Id that represents each application created')
COMMENT 'contains application level details. every application will have one entry'
ROW FORMAT SERDE
*
STORED AS INPUTFORMAT
*
OUTPUTFORMAT
*
LOCATION
*
TBLPROPERTIES (
'transient_lastDdlTime'='1trh7');
This is the code I have, but it does not work: it deletes everything after the first ), which is not what I want:
f = open("/home/files", 'rt', encoding='latin-1')
source=f.read()
with open("/home/files/sampl8.sql","w") as output:
    output.write(source[:source.find(')')+1].replace('"', ''))
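Any ideas or suggestions?

You can use a regex to find the exact strings and replace them as shown below. Wrapping the matches in set() just makes sure we do not replace the same match twice.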
import re
t = '''
CREATE EXTERNAL TABLE `table1`(
`tab_id bigint COMMENT 'The unique identifier of thetable')
ROW FORMAT SERDE
*
STORED AS INPUTFORMAT
*
OUTPUTFORMAT
*
LOCATION
*
TBLPROPERTIES (
'transient_lastDdlTime'='1556u3ehw27')
CREATE EXTERNAL TABLE `aud2`(
`application_id` bigint COMMENT 'Unique Id that represents each application created')
COMMENT 'contains application level details. every application will have one entry'
ROW FORMAT SERDE
*
STORED AS INPUTFORMAT
*
OUTPUTFORMAT
*
LOCATION
*
TBLPROPERTIES (
'transient_lastDdlTime'='1trh7')
'''
for i in set(re.findall(r'TBLPROPERTIES \(.*?\)', t, flags=re.DOTALL)):
    t = t.replace(i, i + ';')
print(t)
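The loop above depends on set() because str.replace rewrites every occurrence of a match, so a repeated match would otherwise get two semicolons. As a rough alternative sketch, the same result can be had in one pass with re.sub. This swaps the lazy .*? for a negated character class and adds a negative lookahead so that clauses already followed by ; are left alone. The helper name add_semicolons is made up for illustration, and the pattern assumes that no property value contains a ) character.

```python
import re

# Hypothetical helper, not part of the original answer.
# [^)]* stops at the first ')' after TBLPROPERTIES, so this assumes that
# no property key or value contains a ')' itself.
# (?!;) skips clauses that already end with ';', so re-running is harmless.
_TBLPROPERTIES = re.compile(r'(TBLPROPERTIES \([^)]*\))(?!;)')


def add_semicolons(sql):
    """Return sql with ';' appended after each TBLPROPERTIES (...) clause."""
    return _TBLPROPERTIES.sub(r'\1;', sql)


sample = (
    "CREATE EXTERNAL TABLE `t1`(\n"
    "`id` bigint)\n"
    "TBLPROPERTIES (\n"
    "'k'='v')\n"
    "CREATE EXTERNAL TABLE `t2`(\n"
    "`id` bigint)\n"
    "TBLPROPERTIES (\n"
    "'k'='w')\n"
)
result = add_semicolons(sample)
print(result)
```

Running add_semicolons on its own output should change nothing, which is handy if the script may be run more than once on the same file.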
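You can use re.sub to find every TBLPROPERTIES (...) clause and replace it with TBLPROPERTIES (...);:

import re

# Read the whole file into memory (paths are the ones from the question).
with open("/home/files", 'rt', encoding='latin-1') as f:
    source = f.read()

# Append ';' right after the ')' that closes each TBLPROPERTIES clause.
with open("/home/files/sampl8.sql", "w") as output:
    output.write(re.sub(r'(TBLPROPERTIES \(.*?\))', r'\1;', source, flags=re.DOTALL))

I get an error saying that encoding is an invalid keyword argument for this function. Do you know why that is?

Is the error thrown on the line that opens the file? What is the full error? You could also try removing the encoding argument from that open() call, since the content you are reading may not be Latin-1 encoded.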
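One possible cause of that error, assuming the script is actually being run by a Python 2 interpreter rather than Python 3: the built-in open() in Python 2 does not accept an encoding keyword. A minimal sketch of a workaround is to call io.open, which takes encoding on both versions (on Python 3 it is the same function as open()). The helper name read_text is hypothetical.

```python
import io


def read_text(path, encoding="latin-1"):
    """Read a text file with an explicit encoding on Python 2.6+ and Python 3."""
    # io.open accepts the encoding keyword even where the Python 2
    # built-in open() rejects it.
    with io.open(path, "rt", encoding=encoding) as handle:
        return handle.read()
```

Checking the interpreter with python --version (or printing sys.version inside the script) would confirm whether this is what is happening.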