Python str.match does not match exactly because it ignores the characters that follow

I have a CSV file:

State,  Region                  
AK,     Pacific Non Continuous
HI,     Pacific Non Continuous 
AL,     East South Central  
AZ,     Mountain                
CA,     Pacific                
OR,     Pacific                
When I run:

df = pd.read_csv('C:...\input.csv')

df['SuperRegion'] = pd.np.where(df.Region.str.match("New England|Middle Atlantic|South Atlantic"), "East",
                pd.np.where(df.Region.str.match("East North Central|East South Central|West North Central|West South Central"), "Mid West",
                pd.np.where(df.Region.str.match("Mountain|Pacific"), "West", "Other")))

df.to_csv('C:...\Output.csv', index=False)
I expected the SuperRegion value of the first two rows to be Other:

State,  Region,                  SuperRegion
AK,     Pacific Non Continuous,  **Other**
HI,     Pacific Non Continuous,  **Other**
AL,     East South Central,      Mid West
AZ,     Mountain,                West
CA,     Pacific,                 West
OR,     Pacific,                 West
But what I get instead is:

State,  Region,                  SuperRegion
AK,     Pacific Non Continuous,  **West**
HI,     Pacific Non Continuous,  **West**
AL,     East South Central,      Mid West
AZ,     Mountain,                West
CA,     Pacific,                 West
OR,     Pacific,                 West
I assume that when it runs, it does not distinguish between "Pacific" and "Pacific Non Continuous" the way I want. Any suggestions?

Why not change:

pd.np.where(df.Region.str.match("Mountain|Pacific"), "West", "Other")))
to:
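A minimal sketch of one such change, assuming the goal is for "Pacific" to match only when it is the entire value. Series.str.match behaves like re.match: it anchors the pattern at the start of the string but not at the end, so the pattern "Pacific" also matches "Pacific Non Continuous". Anchoring the whole alternation with `$` closes that gap. The sample Series is built inline from the question's data.

```python
import pandas as pd

# Sample values taken from the question's Region column.
region = pd.Series(["Pacific Non Continuous", "Mountain", "Pacific"])

# str.match anchors only at the start, so the longer value matches too.
prefix_hits = region.str.match("Mountain|Pacific")

# Group the alternatives so that `$` applies to all of them, not just the last.
exact_hits = region.str.match(r"(?:Mountain|Pacific)$")

print(prefix_hits.tolist())  # [True, True, True]
print(exact_hits.tolist())   # [False, True, True]
```

Note that writing "Mountain|Pacific$" without the group would anchor only the "Pacific" alternative.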

Or add the case separately:

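# Note: pd.np was removed in pandas 2.0; there, use `import numpy as np` and np.where instead.
# The "Pacific Non Continuous" test must come before the broader "Mountain|Pacific" test.
df['SuperRegion'] = pd.np.where(df.Region.str.match("New England|Middle Atlantic|South Atlantic"), "East",
                pd.np.where(df.Region.str.match("East North Central|East South Central|West North Central|West South Central"), "Mid West",
                pd.np.where(df.Region.str.match("Pacific Non Continuous"), "Other",
                pd.np.where(df.Region.str.match("Mountain|Pacific"), "West", "Other"))))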
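Because the extra case only helps if it is tested before the broader "Pacific" pattern, it can be useful to make the ordering explicit. The following sketch uses numpy.select, which takes the first condition that is true for each row. It is an alternative to the nested where calls rather than the answer's own code, and the sample DataFrame is built inline from the question's data.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "State": ["AK", "HI", "AL", "AZ", "CA", "OR"],
    "Region": ["Pacific Non Continuous", "Pacific Non Continuous",
               "East South Central", "Mountain", "Pacific", "Pacific"],
})

# Conditions are checked in order; the first True one wins for each row.
conditions = [
    df.Region.str.match("New England|Middle Atlantic|South Atlantic"),
    df.Region.str.match("East North Central|East South Central|West North Central|West South Central"),
    df.Region.str.match("Pacific Non Continuous"),  # must come before "Pacific"
    df.Region.str.match("Mountain|Pacific"),
]
choices = ["East", "Mid West", "Other", "West"]

# Rows that satisfy none of the conditions fall back to the default.
df["SuperRegion"] = np.select(conditions, choices, default="Other")
print(df)
```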
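The ideal solution is to create a dictionary with the regions as keys and the super-regions as values, and use

df['Region'].map(mapping)  # mapping is that dictionary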
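A sketch of the dictionary approach, assuming the region-to-super-region pairs implied by the patterns in the question (the name region_to_super is illustrative). Series.map looks up each whole value as a dictionary key, so "Pacific Non Continuous" cannot be mistaken for "Pacific". Values missing from the dictionary come back as NaN and are filled with "Other" here.

```python
import pandas as pd

df = pd.DataFrame({
    "State": ["AK", "HI", "AL", "AZ", "CA", "OR"],
    "Region": ["Pacific Non Continuous", "Pacific Non Continuous",
               "East South Central", "Mountain", "Pacific", "Pacific"],
})

# Hypothetical lookup table built from the region names in the question.
region_to_super = {
    "New England": "East",
    "Middle Atlantic": "East",
    "South Atlantic": "East",
    "East North Central": "Mid West",
    "East South Central": "Mid West",
    "West North Central": "Mid West",
    "West South Central": "Mid West",
    "Mountain": "West",
    "Pacific": "West",
}

# Unmapped regions become NaN after map(); treat them as "Other".
df["SuperRegion"] = df["Region"].map(region_to_super).fillna("Other")
print(df)
```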
You can use isin():

import numpy as np

# isin() compares whole values, so "Pacific Non Continuous" is not treated as "Pacific".
df['SuperRegion'] = np.where(df.Region.isin(['New England', 'Middle Atlantic', 'South Atlantic']), "East",
                    np.where(df.Region.isin(["East North Central", "East South Central", "West North Central", "West South Central"]), "Mid West",
                    np.where(df.Region.isin(["Mountain", "Pacific"]), "West", "Other")))

You get:

    State   Region                  SuperRegion
0   AK      Pacific Non Continuous  Other
1   HI      Pacific Non Continuous  Other
2   AL      East South Central      Mid West
3   AZ      Mountain                West
4   CA      Pacific                 West
5   OR      Pacific                 West
As you mentioned above, I added the Pacific Non Continuous match before the Pacific match, and it works great! Thanks. I'm surprised there isn't an exact-match command.
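On the exact-match point: pandas 1.1 and later provide Series.str.fullmatch, which requires the whole string to match the pattern, in the same way as re.fullmatch. A brief sketch, assuming such a version is installed and using values from the question:

```python
import numpy as np
import pandas as pd

# Sample values taken from the question's Region column.
region = pd.Series(["Pacific Non Continuous", "East South Central", "Mountain", "Pacific"])

# fullmatch is True only when the pattern covers the entire value.
is_west = region.str.fullmatch("Mountain|Pacific")

super_region = np.where(is_west, "West", "Other")
print(super_region.tolist())  # ['Other', 'Other', 'West', 'West']
```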