Python: split a DataFrame cell at the first digit encountered

I have a column in my dataframe that I want to split at the point where the first numeric value occurs. Here is a sample of my data:

                                                    col
1                           Beb il Gisire, contrata 102
12                    Bungemma, territorium 90, 115, 130
13                               Territorium Binhise 188
14                                Contrata Bir Bahar 205
15                                Contrata Bir HaJar 168
16                                 Bir Kibir, contrata 7
17      Lu Burgu; Suburbium Castri Maris 5, 15, 23, 6...
I can't split on a space or on a specific number, because both vary from row to row. The desired output is:

    1                           Beb il Gisire, contrata           102
    12                          Bungemma, territorium             90, 115, 130
    13                          Territorium Binhise               188
    14                          Contrata Bir Bahar                205
    15                          Contrata Bir HaJar                168
    16                          Bir Kibir, contrata               7
    17                          Lu Burgu; Suburbium Castri Maris  5, 15, 23, 6...
One option is:

df['col1'] = df['col'].str.split(r'(\d)').str[0]
# the remainder is whatever follows the col1 prefix in the original string
df['col2'] = [s[len(p):] for s, p in zip(df['col'], df['col1'])]
Output:

                               col1              col2  
0           Beb il Gisire, contrata               102  
1             Bungemma, territorium      90, 115, 130  
2               Territorium Binhise               188  
3                Contrata Bir Bahar               205  
4                Contrata Bir HaJar               168  
5               Bir Kibir, contrata                 7  
6  Lu Burgu; Suburbium Castri Maris   5, 15, 23, 6...
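
The same split-at-first-digit idea can be sketched in plain Python, which may help when checking the logic outside pandas. The helper name `split_at_first_digit` is hypothetical, and the choice to strip trailing whitespace from the text part and to return an empty tail when no digit is present is an assumption, not something the question specifies.

```python
import re


def split_at_first_digit(text):
    """Return (head, tail), where tail starts at the first digit in text."""
    match = re.search(r"\d", text)
    if match is None:
        # No digit found: keep the whole string as the head (assumed behaviour).
        return text, ""
    cut = match.start()
    return text[:cut].rstrip(), text[cut:]


rows = [
    "Beb il Gisire, contrata 102",
    "Bungemma, territorium 90, 115, 130",
    "Bir Kibir, contrata 7",
]
pairs = [split_at_first_digit(row) for row in rows]
```

On a dataframe, the resulting pairs could be assigned back with something like `df[['col1', 'col2']] = pairs`, provided `rows` is taken from `df['col']` so the lengths match.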

Use the ".*?\d.*" regex pattern with capture groups to split the string:

In [237]: df.col.str.extract(r'(.*?)(\d.*)')
Out[237]:
                                   0                 1
1            Beb il Gisire, contrata               102
12             Bungemma, territorium      90, 115, 130
13               Territorium Binhise               188
14                Contrata Bir Bahar               205
15                Contrata Bir HaJar               168
16               Bir Kibir, contrata                 7
17  Lu Burgu; Suburbium Castri Maris   5, 15, 23, 6...
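
A variation on the extract approach, as a minimal sketch: named groups give the result columns readable labels, and a `\s*` between the groups keeps the trailing space out of the text part. The group names `name` and `numbers` are made up for illustration, and the frame below rebuilds only the first few sample rows from the question.

```python
import pandas as pd

# A few of the sample rows from the question, with their original index labels.
df = pd.DataFrame(
    {
        "col": [
            "Beb il Gisire, contrata 102",
            "Bungemma, territorium 90, 115, 130",
            "Territorium Binhise 188",
            "Contrata Bir Bahar 205",
        ]
    },
    index=[1, 12, 13, 14],
)

# Lazy text group, optional whitespace, then everything from the first digit on.
# The group names become the column names of the extracted frame.
pattern = r"(?P<name>.*?)\s*(?P<numbers>\d.*)"
parts = df["col"].str.extract(pattern)
```

Rows that contain no digit at all do not match the pattern, so `str.extract` returns missing values for both columns in that case.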

What is your expected output?