Regex to select columns for a math expression

regex pandas

I am working with this data:

import pandas as pd

url1 = 'https://raw.githubusercontent.com/dothemathonthatone/maps/master/bev_parse.csv'
df_bev = pd.read_csv(url1, index_col=0)

url2 = 'https://raw.githubusercontent.com/dothemathonthatone/maps/master/frucht_parse.csv'
df_frucht = pd.read_csv(url2, index_col=0)

fertility = pd.merge(df_bev, df_frucht, on=['regional_schlüssel', 'year'])

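Now I want to filter the columns with a regular expression and split them into two groups, using the approach from another Stack Overflow question. This part works fine: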

denominator = fertility.filter(regex=r'^[a-z]+\d+$')                  # Lower case ones
denominator = denominator.apply(pd.to_numeric, errors='coerce')

This part does not work. With this code I am trying to select the columns named like Aus_15-20, Deu15_20, …, Aus45_50, Deu45_50:

numerator = fertility.filter(regex=r'^[A-Z][a-z]+\d+$')
numerator = numerator.apply(pd.to_numeric, errors='coerce')  # Fix numbers

For some reason, no columns end up selected in numerator.

The last piece of code is:
numerator.div(denominator).add_prefix('fertility_')

fertility = fertility[['year', 'regional_schlüssel']].join(numerator.div(denominator).add_prefix('fertility_'))

fertility['reg_schl'] = fertility['regional_schlüssel']

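I am not familiar with regex and .loc, and I would like to get better at using them. Can anyone tell me what is going wrong? The code ran fine before I changed the column names in the dataset, and now I cannot get it working again.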
You can use
^[A-Z][a-z]+_?\d+[-_]\d+$
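
As a minimal sketch of how this pattern could be passed to DataFrame.filter, the frame below uses made-up column names in the shapes mentioned in the question. These names are assumptions for illustration, not the real columns of the linked CSV files.

```python
import pandas as pd

# Hypothetical columns imitating the shapes named in the question;
# the real CSV files may contain different names.
df = pd.DataFrame(
    [[2000, 1, 2, 3, 4, 5]],
    columns=["year", "abc15", "Aus_15-20", "Deu15_20", "Aus45_50", "Deu45_50"],
)

old_pattern = r"^[A-Z][a-z]+\d+$"            # pattern from the question
new_pattern = r"^[A-Z][a-z]+_?\d+[-_]\d+$"   # pattern suggested here

# filter(regex=...) keeps the columns whose label matches the pattern.
old_cols = list(df.filter(regex=old_pattern).columns)
new_cols = list(df.filter(regex=new_pattern).columns)

print(old_cols)
print(new_cols)
```

With these sample names the old pattern selects nothing, because it requires the name to end right after the first run of digits, while the new pattern also accepts the optional underscore and the second number.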

See the regex demo and the regex graph.

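Details

^ - start of string
[A-Z] - an uppercase ASCII letter
[a-z]+ - one or more lowercase ASCII letters
_? - an optional (1 or 0) _ character
\d+ - one or more digits
[-_] - a - or a _
\d+ - one or more digits
$ - end of string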
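
The breakdown above can be checked with Python's re module alone. This is only a sketch: the sample names are hypothetical, chosen to exercise each part of the pattern, and the second check isolates the optional _? piece.

```python
import re

pattern = re.compile(r"^[A-Z][a-z]+_?\d+[-_]\d+$")

# Hypothetical column names, not taken from the real data.
names = ["Aus_15-20", "Deu15_20", "Aus45_50", "abc15", "Deu15", "DEU15_20"]

# Keep only the names the full pattern accepts.
matched = [name for name in names if pattern.search(name)]

# "_?" means zero or one underscore: "15" and "_15" pass, "__15" does not.
optional = [bool(re.search(r"^_?15$", s)) for s in ["15", "_15", "__15"]]

print(matched)
print(optional)
```

Here "abc15" fails on the leading uppercase letter, "Deu15" has no separator and second number, and "DEU15_20" fails because [a-z]+ needs lowercase letters after the first character.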
Try ^[A-Z][a-z]+\d+[-_]\d+$
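@WiktorStribiżew: Great! It works! Thank you. What is the "?" by the way?
What tool or website did you use to generate that graphic? +1, that is really cool.
@ScottBoston The link to the website is in the answer.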