Python: if an error occurs, do something else (string split)
Tags: python, pandas, dataframe
I want to split the strings in the following data frame:
import pandas as pd
df = pd.DataFrame(['BERGEPAINT20FEB550PE', 'BANKNIFTY2020631300CE', 'BANKNIFTY2020631300PE'], columns=list('A'))
# x[3] assumes every string splits into at least four pieces
df['StrikePrice'] = df.A.str.split(r'(\d+)').apply(lambda x: x[3])
df['CallPut'] = df.A.str[-2:]
print(df.head())
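For reference, the pieces this pattern produces for each string can be inspected with the standard re module. This is only a minimal sketch: `symbols` and `pieces` are illustrative names, and it assumes pandas applies the pattern to each element the same way `re.split` does.

```python
import re

symbols = ['BERGEPAINT20FEB550PE', 'BANKNIFTY2020631300CE', 'BANKNIFTY2020631300PE']

# Split on runs of digits; the capturing group keeps the digit runs in the result.
pieces = [re.split(r'(\d+)', s) for s in symbols]

for symbol, parts in zip(symbols, pieces):
    # Show how many pieces each string yields and what they are.
    print(symbol, len(parts), parts)
```

Only a string that splits into at least four pieces has an element at index 3, which is what the x[3] lookup relies on.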
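But the StrikePrice line raises an error. The intended split is:
BERGEPAINT20FEB550PE -> BERGEPAINT, 550, PE
BANKNIFTY2020631300CE -> BANKNIFTY, 31300, CE
BANKNIFTY2020631300PE -> BANKNIFTY, 31300, PE
Use a regex "or" (alternation) expression to do this for the given data, splitting on either 5 digits or 2 digits:
df = pd.DataFrame(['BERGEPAINT20FEB550PE', 'BANKNIFTY2020631300CE', 'BANKNIFTY2020631300PE'], columns=list('A'))
df['StrikePrice'] = df.A.str.split(r'(\d{5}|\d{2})').str[-2]
df['CallPut'] = df.A.str[-2:]
df['Name'] = df.A.str.split(r'(\d+)').str[0]
print(df.head())
Output:
                       A  StrikePrice  CallPut        Name
0   BERGEPAINT20FEB550PE           55       PE  BERGEPAINT
1  BANKNIFTY2020631300CE        31300       CE   BANKNIFTY
2  BANKNIFTY2020631300PE        31300       PE   BANKNIFTY
Note that the first row comes out as 55 rather than the intended 550, because \d{2} consumes only the first two digits of 550. Adding a three-digit alternative, as in (\d{5}|\d{3}|\d{2}), would give 550 for this data.
Assuming the parts you do not want (20FEB, 20206, 20206) always start with 20 and are 5 characters long, you can use:
df = pd.DataFrame(['BERGEPAINT20FEB550PE', 'BANKNIFTY2020631300CE', 'BANKNIFTY2020631300PE'], columns=list('A'))
df['Toto'] = df.A.apply(lambda x: x[:x.index("20")])
df['StrikePrice'] = df.A.apply(lambda x: x[x.index("20")+5:-2])
df['CallPut'] = df.A.str[-2:]
print(df)
Output:
                       A        Toto  StrikePrice  CallPut
0   BERGEPAINT20FEB550PE  BERGEPAINT          550       PE
1  BANKNIFTY2020631300CE   BANKNIFTY        31300       CE
2  BANKNIFTY2020631300PE   BANKNIFTY        31300       PE
Maybe this is what you want. You are splitting at every run of digits and then trying to access the fourth element with x[3]. But BANKNIFTY2020631300CE does not split into four elements; it splits as [BANKNIFTY, 2020631300, CE]. Taking the first piece, the last five characters of the final digit run, and the last piece works for both formats: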
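# Keep the name, the last five characters of the final digit run, and the CE/PE suffix
s = df['A'].str.split(r'(\d+)').apply(lambda x: [x[0], x[-2][-5:], x[-1]])
# Expand each three-item list into columns and name them
s.apply(lambda x: pd.Series(x)).rename(columns={0: 'A', 1: 'StrikePrice', 2: 'CallPut'})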
            A  StrikePrice  CallPut
0  BERGEPAINT          550       PE
1   BANKNIFTY        31300       CE
2   BANKNIFTY        31300       PE
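Since the title asks how to do something else when an error occurs, the same idea can also be sketched with an explicit try/except in plain Python. This is a sketch under assumptions, not a general parser: `strike_price` is a hypothetical helper, and the fallback (the last five characters of the final digit run) is borrowed from the answer above and fits only symbols shaped like the sample data.

```python
import re

def strike_price(symbol):
    """Return the strike price part of an option symbol like the samples above."""
    # The capturing group keeps the digit runs in the split result.
    parts = re.split(r'(\d+)', symbol)
    try:
        # Works when the expiry contains letters (e.g. 20FEB), giving five pieces.
        return parts[3]
    except IndexError:
        # Expiry and strike are fused into one digit run (e.g. 2020631300).
        # Assumption: the strike is the last five digits of that run.
        return parts[-2][-5:]

symbols = ['BERGEPAINT20FEB550PE', 'BANKNIFTY2020631300CE', 'BANKNIFTY2020631300PE']
strikes = [strike_price(s) for s in symbols]
print(strikes)  # ['550', '31300', '31300']
```

With pandas, a helper like this could be applied per row, for example with df['A'].apply(strike_price).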