Python: Add one to the numbers in a string that match a certain regular expression

python regex

I have a Python string that looks like this:

"5 pounds cauliflower,
cut into 1-inch florets (about 18 cups)
2 large leeks,
1 teaspoons salt
3 cups of milk"

I need to add 1 to every number that appears directly before the keyword `cup`.

The result must be:

"5 pounds cauliflower,
cut into 1-inch florets (about 19 cups)
2 large leeks,
1 teaspoons salt
4 cups of milk"

I have a rough idea along these lines:

import re

p = re.compile(r'([0-9]+) cup')
for m in p.finditer(s):  # s is the recipe string shown above
    # do something with int(m.group(1)) + 1
    new_number = int(m.group(1)) + 1

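I don't know how to replace only the number I find in each iteration.

There is also an edge case: I may need to replace 9 with 10, so I can't simply take the index of the number and overwrite it with the new one, because the new number may be longer.

Solutions that don't involve regular expressions are welcome too.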
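The length concern above can be sidestepped by rebuilding the string from slices rather than overwriting characters in place. The following is only a sketch of how the `finditer` idea might be completed; the helper name `bump_cups` is hypothetical, and the pattern assumes the number and `cup` are separated by exactly one space.

```python
import re

def bump_cups(text):
    """Add 1 to every integer directly followed by ' cup' (hypothetical helper)."""
    pattern = re.compile(r'([0-9]+) cup')
    pieces = []
    last_end = 0
    for m in pattern.finditer(text):
        # Copy the untouched text before the number, then the new number.
        pieces.append(text[last_end:m.start(1)])
        pieces.append(str(int(m.group(1)) + 1))
        last_end = m.end(1)
    # Copy whatever follows the last match.
    pieces.append(text[last_end:])
    return "".join(pieces)

print(bump_cups("9 cups of milk"))  # prints: 10 cups of milk
```

Because the result is assembled from pieces, it does not matter that `10` is one character longer than `9`.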

You can try the following one-statement solution:

import re

s = """
5 pounds cauliflower,
cut into 1-inch florets (about 18 cups)
2 large leeks,
1 teaspoons salt
3 cups of milk
"""

# Swap every number that precedes a word for a "{}" placeholder, then fill the
# placeholders in order, adding 1 only when the following word is cup/cups.
new_s = re.sub(r'\d+(?=\s[a-zA-Z])', '{}', s).format(
    *[int(n) + 1 if unit in ('cup', 'cups') else int(n)
      for n, unit in re.findall(r'(\d+)\s([a-zA-Z]+)', s)]
)
print(new_s)

Output:

5 pounds cauliflower,
cut into 1-inch florets (about 19 cups)
2 large leeks,
1 teaspoons salt
4 cups of milk

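You can pass a function as the replacement argument to `re.sub`. The function receives the match object as its argument; process it to build the replacement string for each match.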
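A minimal sketch of that idea follows. The function name `add_one`, the sample string `recipe`, and the two-group pattern are assumptions made for illustration, not code taken from the answer.

```python
import re

def add_one(match):
    """Return the matched number plus one, followed by the unchanged suffix."""
    number, suffix = match.group(1), match.group(2)
    return str(int(number) + 1) + suffix

recipe = "cut into 1-inch florets (about 18 cups)\n3 cups of milk\n1 teaspoons salt"

# Group 1 is the number, group 2 is the whitespace plus "cup" or "cups".
result = re.sub(r'(\d+)(\s+cups?)', add_one, recipe)
print(result)
```

`re.sub` calls `add_one` once per match and splices the returned string into the result, so a replacement that is longer than the original number needs no special handling.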

Thanks to @ctwheels's answer, I improved my initial regex handling.
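The revised pattern itself is not reproduced in this text. As a sketch of what such a refinement might look like, assuming a lookahead is used so that only the digits are matched and replaced (the name `increment_cups` is hypothetical):

```python
import re

# Lookahead: match digits only when whitespace and cup/cups follow them.
# The lookahead text is checked but not consumed, so it is left untouched.
pattern = re.compile(r'\d+(?=\s+cups?\b)')

def increment_cups(text):
    """Add 1 to each number that precedes 'cup' or 'cups' (hypothetical helper)."""
    return pattern.sub(lambda m: str(int(m.group(0)) + 1), text)

print(increment_cups("9 cups of milk"))  # prints: 10 cups of milk
```

With a lookahead, the replacement function only has to return the new number, because the word after it is never part of the match.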

To handle pluralization of the word (as @casimirithippolyte requested), we can use a broader pattern, at the cost of a slightly more complex replacement function:

import re

def repl(x):
    # x is the match object; its text looks like "19 cups"
    d = int(x.group(0).split()[0]) + 1
    # Use the singular form only when the new amount is exactly 1
    return str(d) + ' cup' if d == 1 else str(d) + ' cups'

p = r'\d+ cups?'

mystring = """
5 pounds cauliflower,
cut into 1-inch florets (about 19 cups)
2 large leeks,
1 teaspoons salt
4 cups of milk
1 cup of butter
0 cups of sugar"""

newstring = re.sub(p, repl, mystring)
print(newstring)

Output:

5 pounds cauliflower,
cut into 1-inch florets (about 20 cups)
2 large leeks,
1 teaspoons salt
5 cups of milk
2 cups of butter
1 cup of sugar

A non-regex approach as well:

def try_parse_int(text):
    """Return (True, number) if text is an integer, else (False, text)."""
    try:
        return (True, int(text))
    except ValueError:
        return (False, text)

txt = '''5 pounds cauliflower,
cut into 1-inch florets (about 18 cups)
2 large leeks,
1 teaspoons salt
3 cups of milk'''

# Pad each newline with spaces so that splitting on spaces keeps the
# newlines as separate tokens.
tokens = txt.replace("\n", " \n ").split(" ")

# Increment the token that directly precedes "cup", "cups" or "cups)".
for idx in range(1, len(tokens)):
    if tokens[idx] in ("cup", "cups", "cups)"):
        is_int, value = try_parse_int(tokens[idx - 1])
        if is_int:
            tokens[idx - 1] = str(value + 1)

# Join the tokens again and undo the padding around the newlines.
print(" ".join(tokens).replace(" \n ", "\n"))

Also non-regex (this was my first attempt):

Output:

5 pounds cauliflower,
cut into 1-inch florets (about 19 cups)
2 large leeks,
1 teaspoons salt
4 cups of milk

You can try the following approach:

"5 pounds cauliflower,
cut into 1-inch florets (about 18 cups)
2 large leeks,
1 teaspoons salt
3 cups of milk"

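import re

pattern = r'cups?'
string_1 = """5 pounds cauliflower,
cut into 1-inch florets (about 18 cups)
2 large leeks,
1 teaspoons salt
3 cups of milk"""

for line in string_1.splitlines():
    words = line.split()

    for idx, word in enumerate(words):
        # When a word contains "cup"/"cups", increment the word before it,
        # but only if there is one and it is a whole number.
        if idx > 0 and re.search(pattern, word) and words[idx - 1].isdigit():
            words[idx - 1] = str(int(words[idx - 1]) + 1)

    print(" ".join(words))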

Output:

5 pounds cauliflower,
cut into 1-inch florets (about 19 cups)
2 large leeks,
1 teaspoons salt
4 cups of milk

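Use `re.sub` with a function (possibly a lambda function) instead of `re.finditer`. See this question:

The next challenge is to turn `1 cup` into `2 cups`. That is going to be one thick cauliflower porridge!

I would keep the `?` after `cups` in `p = r'\d+(?= +cups?\b)'`, because nothing here asks us to distinguish `cup` from `cups`. And the `\b` makes no sense to me.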
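For readers weighing that last remark, a small sketch of what the word boundary changes in practice may help. The sample sentence and variable names are made up for illustration.

```python
import re

with_boundary = re.compile(r'\d+(?= +cups?\b)')
without_boundary = re.compile(r'\d+(?= +cups?)')

sample = "3 cups of milk and 12 cupcakes"

# \b requires the word to end right after "cup" or "cups",
# so "cupcakes" does not qualify.
print(with_boundary.findall(sample))     # ['3']

# Without \b, any word that merely starts with "cup" is accepted.
print(without_boundary.findall(sample))  # ['3', '12']
```

Whether that difference matters depends on the input; for the recipe text in the question both patterns match the same numbers.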