Compare two strings and extract the variable part of the value in Python

python regex

In my Python script I have a list of strings, for example:

birth_year = ["my birth year is *","i born in *","i was born in *"]

I want to compare an input sentence against the list above and get the birth year as output.

The input sentences look like this:

Example1: My birth year is 1994.
Example2: I born in 1995

The output should be:

Example1: 1994
Example2: 1995

I have tried a number of approaches using regex, but I have not found a solution that works reliably.

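str1 = "My birth year is 1994."
str2 = str1.replace('My birth year is ', '')  # str2 is now '1994.'

You can try it this way, replacing the part of the string you don't need with an empty string.

For the code you shared, you can do the following: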

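examples = ["My birth year is 1994.", "I born in 1995"]
for x in examples:
    for y in birth_year:
        prefix = y.rstrip('*')  # drop the trailing wildcard, keep the space before it
        if x.lower().startswith(prefix):  # case-insensitive check that the template matches
            print(x[len(prefix):].rstrip('.'))  # what remains after the prefix is the year
            break

I think the code above should work.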

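If you can guarantee that these strings always contain a 4-digit number, i.e. the birth year, somewhere in them... I'd say just use a regular expression to grab any 4 digits surrounded by non-digits. Fairly crude, but hey, it's your data.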

import re

examples = ["My birth year is 1993.", "I born in 1995", "я родился в 1976м году"]
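for s in examples:  # renamed from `str` to avoid shadowing the built-in type
    # findall returns a list of captured groups; [0] raises IndexError if nothing matched
    y = int(re.findall(r"^[^\d]*([\d]{4})[^\d]*$", s)[0])
    print(y)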
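The pattern above only matches when the whole sentence contains exactly one run of digits, and indexing `[0]` fails when there is no match. A slightly more forgiving variant might look like the sketch below. The names `YEAR_RE` and `find_year` are illustrative and not part of the original answer, and it assumes that "a year" means exactly four digits that are not part of a longer number.

```python
import re

# Four digits that are neither preceded nor followed by another digit.
YEAR_RE = re.compile(r"(?<!\d)\d{4}(?!\d)")

def find_year(sentence):
    """Return the first standalone 4-digit number as an int, or None if there is none."""
    m = YEAR_RE.search(sentence)
    return int(m.group()) if m else None

for s in ["My birth year is 1993.", "I born in 1995", "я родился в 1976м году", "no year here"]:
    print(find_year(s))
```

Returning `None` instead of raising leaves it to the caller to decide how to handle sentences without a year.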

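If you change birth_year into a list of regular expressions, matching against the input string becomes much easier. Use a capture group for the year.

Here is a function that does what you need: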

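import re

def match_year(birth_year, text):  # `text` rather than `input`, which is a built-in
    for s in birth_year:
        m = re.search(s, text, re.IGNORECASE)
        if m:
            # keep whatever precedes the match (e.g. "Example1: ") and append the captured year
            output = f'{text[:m.start(0)]}{m[1]}'
            print(output)
            break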

For example:

birth_year = [r"my birth year is (\d{4})", r"i born in (\d{4})", r"i was born in (\d{4})"]

match_year(birth_year, "Example1: My birth year is 1994.")
match_year(birth_year, "Example2: I born in 1995")

Output:

Example1: 1994
Example2: 1995

f-strings require at least Python 3.6.
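If you would rather keep the original `*` templates from the question than rewrite them by hand, the regular expressions could be generated from them. The following is only a sketch: `template_to_regex` and `extract_year` are hypothetical helper names, and it assumes every `*` stands for a 4-digit year.

```python
import re

birth_year = ["my birth year is *", "i born in *", "i was born in *"]

def template_to_regex(template):
    # Escape the literal text, then turn the escaped "*" into a 4-digit capture group.
    pattern = re.escape(template).replace(r"\*", r"(\d{4})")
    return re.compile(pattern, re.IGNORECASE)

PATTERNS = [template_to_regex(t) for t in birth_year]

def extract_year(sentence):
    """Return the year captured by the first matching template, or None."""
    for pattern in PATTERNS:
        m = pattern.search(sentence)
        if m:
            return m.group(1)
    return None

print(extract_year("My birth year is 1994."))
print(extract_year("I born in 1995"))
```

Using `re.escape` means any other punctuation in a template is matched literally, so only the `*` placeholder acts as a wildcard.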

If you only need to extract the digits, you can use re.findall(r'\d+', val)[0].

@Hiral Can you give me an example? A year is not a float value. Maybe you should ask a new question?