Python: Replacing NaN values in one column with a regex match from another column

python regex for-loop

Here is a small example of the data I'm working with:

import numpy as np
import pandas as pd

df = pd.DataFrame({'EntryStreetName': ['Palm Avenue', np.nan, 'Peachtree Street'],
    'ExitStreetName': [np.nan, 'Palm Avenue', 'Mitchell Street'],
    'Path': ['Palm Avenue_NW_Mitchell Street', 'Mitchell Street_SE_Palm Avenue', 'Peachtree Street_NE_Mitchell Street']})

I'm trying to extract the first part of Path to replace the NaN values in EntryStreetName.

I set up the following function (beginner here):

However, it returns the following in the cells:

 <re.Match object; span=(0, 38), match='0      ...

Wouldn't it be simpler to split the string on the underscore?
df['Path'].str.split('_', n=1).str[0]

0         Palm Avenue
1     Mitchell Street
2    Peachtree Street
Name: Path, dtype: object

After that, use fillna as the final step to fill in the NaN values:
df['EntryStreetName'] = df['EntryStreetName'].fillna(
    df['Path'].str.split('_', n=1).str[0])
df

    EntryStreetName   ExitStreetName                                 Path
0       Palm Avenue              NaN       Palm Avenue_NW_Mitchell Street
1   Mitchell Street      Palm Avenue       Mitchell Street_SE_Palm Avenue
2  Peachtree Street  Mitchell Street  Peachtree Street_NE_Mitchell Street
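
For anyone who wants to run the split-then-fillna approach end to end, here is a minimal self-contained sketch. The helper name fill_entry_from_path is hypothetical (it is not from the original post), and n=1 is passed by keyword because recent pandas versions no longer accept it positionally.

```python
import numpy as np
import pandas as pd


def fill_entry_from_path(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a copy where NaN in EntryStreetName is filled from Path.

    Hypothetical helper: takes the text before the first underscore in
    Path and uses it only where EntryStreetName is missing.
    """
    out = frame.copy()
    # Split at most once on '_' and keep the leading piece.
    first_part = out['Path'].str.split('_', n=1).str[0]
    # fillna aligns on the index, so existing names are left untouched.
    out['EntryStreetName'] = out['EntryStreetName'].fillna(first_part)
    return out


df = pd.DataFrame({
    'EntryStreetName': ['Palm Avenue', np.nan, 'Peachtree Street'],
    'ExitStreetName': [np.nan, 'Palm Avenue', 'Mitchell Street'],
    'Path': ['Palm Avenue_NW_Mitchell Street',
             'Mitchell Street_SE_Palm Avenue',
             'Peachtree Street_NE_Mitchell Street'],
})

result = fill_entry_from_path(df)
print(result['EntryStreetName'].tolist())
# ['Palm Avenue', 'Mitchell Street', 'Peachtree Street']
```

Working on a copy keeps the original frame unchanged, which makes it easy to compare before and after; assigning back to df['EntryStreetName'] directly, as in the answer, works just as well.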


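re.match gives you a match object, not a string. The match object has methods you can call to get the part you need. Check out .group, which returns a capture group. In a regular expression, the whole match is always group 0, and the individual capture groups defined with () follow as group 1, 2, and so on.
So you can use .group(0):
row['EntryStreetName'] = re.match('[^_]*', row['Path']).group(0)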
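
To make the group numbering concrete, here is a small sketch using only the standard re module. The pattern with an explicit capture group and the first_part helper are illustrative assumptions, not code from the question.

```python
import re

path = 'Mitchell Street_SE_Palm Avenue'

# re.match returns a Match object (or None if nothing matches), not a string.
m = re.match(r'([^_]*)_', path)

whole = m.group(0)  # the entire match, including the trailing underscore
first = m.group(1)  # only what the first (...) capture group matched

print(repr(whole))  # 'Mitchell Street_'
print(repr(first))  # 'Mitchell Street'


def first_part(text: str) -> str:
    """Hypothetical helper: return the text before the first underscore."""
    # '[^_]*' can match an empty string, so re.match never returns None here.
    return re.match(r'[^_]*', text).group(0)


print(first_part('Palm Avenue_NW_Mitchell Street'))  # Palm Avenue
```

Storing the match object itself in a cell is what produces output like <re.Match object; ...>; calling .group() on it yields the plain string instead.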