Regex: delete lines that contain only numbers (Python 2.7)


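I am new to both Python and regular expressions. I am trying to process a text file and want to delete the lines that contain only digits and whitespace. This is the regular expression I am using: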

^\s*[0-9]*\s*$

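It matches the lines I want to delete in the Notepad++ Find dialog.

But when I try the same thing in Python, the lines do not match. Is there a problem with the regular expression itself, or with my Python code?

The Python code I am using: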

contacts = re.sub(r'^\s*[0-9]*\s*$','\n',contents)

Sample text:

Age:30
Gender:Male



20 


Name:संगीता शर्मा
HusbandsName:नरेश कुमार शर्मा
HouseNo:10/183
30 30
Gender:Female


21 

Name:मोनू शर्मा
FathersName:कैलाश शर्मा
HouseNo:10/183
30
Gender:Male

Use re.sub in multiline mode:

contacts = re.sub(r'^\s*([0-9]+\s*)+$','\n',x, flags=re.M)

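For the start (^) and end ($) anchors to match at every line, rather than only at the start and end of the whole string, the pattern has to run in multiline mode (re.M).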
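To make the effect of the flag concrete, here is a small sketch that runs the same pattern with and without re.M. The sample string and the names `text`, `pattern`, `without_flag` and `with_flag` are made up for illustration and are not from the original post.

```python
import re

# Shortened, ASCII-only stand-in for the sample file (illustrative data).
text = "Age:30\n20 \nHouseNo:10/183\n30 30\nGender:Male\n"
pattern = r'^\s*([0-9]+\s*)+$'

# Without re.M, ^ can only match at the very start of the string,
# so nothing is replaced.
without_flag = re.sub(pattern, '\n', text)

# With re.M, ^ and $ match at the start and end of every line,
# so the digit-only lines are replaced.
with_flag = re.sub(pattern, '\n', text, flags=re.M)

print(repr(without_flag))
print(repr(with_flag))
```

Note that the replacement string is '\n', so each matched line is turned into an extra blank line rather than removed outright.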

Also, use the following expression to match lines that contain only clusters of digits, possibly separated by whitespace:

^\s*([0-9]+\s*)+$
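As a quick check of which lines this expression accepts, the sketch below applies it to individual lines. The list `lines` and the name `NUMERIC_LINE` are assumptions introduced here for illustration.

```python
import re

# The expression from the answer, compiled once for reuse.
NUMERIC_LINE = re.compile(r'^\s*([0-9]+\s*)+$')

# Individual lines without their trailing newline (illustrative data).
lines = ["Age:30", "20 ", "HouseNo:10/183", "30 30", "", "Gender:Male"]

# Keep only the lines made up entirely of digit clusters and whitespace.
numeric_only = [line for line in lines if NUMERIC_LINE.match(line)]
print(numeric_only)
```

Because the group requires at least one digit, a completely empty line does not match; only "20 " and "30 30" are selected here.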


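You do not even need a regular expression. A simple approach is enough: delete the characters you are not interested in and check whether anything is left: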

import string

clear_chars = string.digits + string.whitespace  # a map of characters we'd like to check for

# open input.txt for reading, output.txt for writing
with open("input.txt", "rb") as f_in, open("output.txt", "wb") as f_out:
    for line in f_in:  # iterate over the input file line by line
        if line.translate(None, clear_chars):  # remove the chars, check if anything is left
            f_out.write(line)  # write the line to the output file
        # uncomment the following if you want added newlines when pattern matched
        # else:
        #     f_out.write("\n")  # write a new line on match

For your sample input, this produces:

Age:30
Gender:Male
Name:संगीता शर्मा
HusbandsName:नरेश कुमार शर्मा
HouseNo:10/183
Gender:Female
Name:मोनू शर्मा
FathersName:कैलाश शर्मा
HouseNo:10/183
Gender:Male

If you want to replace the matching lines with a newline instead, just uncomment the else clause.
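The two-argument `translate(None, chars)` call above relies on Python 2.7's str type. A possible variant that should also run on Python 3 works on bytes instead. This is only a sketch: it filters an in-memory list rather than files, and `CLEAR_BYTES`, `drop_numeric_lines` and `sample` are hypothetical names, not part of the original answer.

```python
import string

# Bytes to delete before checking whether anything else is left on a line.
CLEAR_BYTES = (string.digits + string.whitespace).encode("ascii")


def drop_numeric_lines(lines):
    """Return the lines that contain something besides digits and whitespace."""
    return [line for line in lines if line.translate(None, CLEAR_BYTES)]


# Lines as they would come from a file opened in binary mode (illustrative data).
sample = [b"Age:30\n", b"\n", b"20 \n", b"HouseNo:10/183\n", b"30 30\n", b"Gender:Male\n"]
kept = drop_numeric_lines(sample)
print(kept)
```

Since the file is read in binary mode, non-ASCII text such as the Devanagari names passes through untouched; only ASCII digits and whitespace are considered when deciding whether a line is empty.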
