Python regular expression to extract a group of words


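I want to extract the string in the Description column for each row of the table below. Because the string I am looking for contains spaces, and the columns are also separated by spaces, I am not sure how to parse the correct field out of each row.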

Name     PCI Device    Driver  Admin Status  Link Status  Speed  Duplex  MAC Address         MTU  Description
-------  ------------  ------  ------------  -----------  -----  ------  -----------------  ----  ----------------------------------------------------------------
vmnic0   0000:3d:00.0  i40en   Up            Down             0  Half    00:00:00:00:03:14  1500  Intel(R) Ethernet Connection X722 for 10GbE SFP+
vmnic1   0000:3d:00.1  i40en   Up            Down             0  Half    00:00:00:00:03:15  1500  Intel(R) Ethernet Connection X722 for 10GbE SFP+
vmnic10  0000:d9:00.1  ixgben  Up            Down             0  Half    a0:36:9f:d9:b9:11  1500  Intel(R) Ethernet Controller 10G X550
vmnic11  0000:01:00.0  i40en   Up            Down             0  Half    3c:fd:fe:a9:4e:b8  1500  Intel(R) Ethernet Controller XXV710 for 25GbE SFP28
vmnic12  0000:01:00.1  i40en   Up            Up           10000  Full    3c:fd:fe:a9:4e:b9  1500  Intel(R) Ethernet Controller XXV710 for 25GbE SFP28
vmnic2   0000:00:1f.6  ne1000  Up            Down             0  Half    88:88:88:88:87:88  1500  Intel Corporation Ethernet Connection (3) I219-LM
vmnic3   0000:3d:00.2  i40en   Up            Down             0  Half    00:00:00:00:03:16  1500  Intel(R) Ethernet Connection X722 for 10GbE SFP+
vmnic4   0000:3d:00.3  i40en   Up            Down             0  Half    00:00:00:00:03:17  1500  Intel(R) Ethernet Connection X722 for 10GbE SFP+
vmnic5   0000:18:00.0  ixgben  Up            Down             0  Half    90:e2:ba:37:50:a8  1500  Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection
vmnic6   0000:18:00.1  ixgben  Up            Down             0  Half    90:e2:ba:37:50:a9  1500  Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection
vmnic7   0000:81:00.0  ixgben  Up            Up           10000  Full    90:e2:ba:1e:b6:24  1500  Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection
vmnic8   0000:81:00.1  ixgben  Up            Down             0  Half    90:e2:ba:1e:b6:25  1500  Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection
vmnic9   0000:d9:00.0  ixgben  Up            Up            1000  Full    a0:36:9f:d9:b9:10  1500  Intel(R) Ethernet Controller 10G X550

Your delimiter appears to be "multiple spaces". The regular expression for that is
\s{2,}

So, for each line here,
description = re.split(r'\s{2,}', line.strip())[-1]
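
The one-liner above can be applied to the whole table with a short loop. The following is only a sketch: `TABLE` is a shortened copy of the question's data, `extract_descriptions` is a hypothetical helper name, and it assumes the first two non-blank lines are always the header and the row of dashes.

```python
import re

# Shortened copy of the table from the question (two data rows only).
TABLE = """\
Name     PCI Device    Driver  Admin Status  Link Status  Speed  Duplex  MAC Address         MTU  Description
-------  ------------  ------  ------------  -----------  -----  ------  -----------------  ----  ----------------------------------------------------------------
vmnic0   0000:3d:00.0  i40en   Up            Down             0  Half    00:00:00:00:03:14  1500  Intel(R) Ethernet Connection X722 for 10GbE SFP+
vmnic2   0000:00:1f.6  ne1000  Up            Down             0  Half    88:88:88:88:87:88  1500  Intel Corporation Ethernet Connection (3) I219-LM
"""


def extract_descriptions(text):
    """Return the last column of every data row (hypothetical helper)."""
    # Drop blank lines and surrounding whitespace, including newlines.
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    # Assumption: line 0 is the header and line 1 is the row of dashes.
    data_lines = lines[2:]
    # Split on runs of two or more whitespace characters, keep the last field.
    return [re.split(r"\s{2,}", ln)[-1] for ln in data_lines]


descriptions = extract_descriptions(TABLE)
for d in descriptions:
    print(d)
```

This only works as long as no description contains two consecutive spaces of its own, which holds for the data shown in the question.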
Using pandas:

from io import StringIO
import pandas as pd

TESTDATA = StringIO("""
Name     PCI Device    Driver  Admin Status  Link Status  Speed  Duplex  MAC Address         MTU  Description
-------  ------------  ------  ------------  -----------  -----  ------  -----------------  ----  ----------------------------------------------------------------
vmnic0   0000:3d:00.0  i40en   Up            Down             0  Half    00:00:00:00:03:14  1500  Intel(R) Ethernet Connection X722 for 10GbE SFP+
vmnic1   0000:3d:00.1  i40en   Up            Down             0  Half    00:00:00:00:03:15  1500  Intel(R) Ethernet Connection X722 for 10GbE SFP+
vmnic10  0000:d9:00.1  ixgben  Up            Down             0  Half    a0:36:9f:d9:b9:11  1500  Intel(R) Ethernet Controller 10G X550
vmnic11  0000:01:00.0  i40en   Up            Down             0  Half    3c:fd:fe:a9:4e:b8  1500  Intel(R) Ethernet Controller XXV710 for 25GbE SFP28
vmnic12  0000:01:00.1  i40en   Up            Up           10000  Full    3c:fd:fe:a9:4e:b9  1500  Intel(R) Ethernet Controller XXV710 for 25GbE SFP28
vmnic2   0000:00:1f.6  ne1000  Up            Down             0  Half    88:88:88:88:87:88  1500  Intel Corporation Ethernet Connection (3) I219-LM
vmnic3   0000:3d:00.2  i40en   Up            Down             0  Half    00:00:00:00:03:16  1500  Intel(R) Ethernet Connection X722 for 10GbE SFP+
vmnic4   0000:3d:00.3  i40en   Up            Down             0  Half    00:00:00:00:03:17  1500  Intel(R) Ethernet Connection X722 for 10GbE SFP+
vmnic5   0000:18:00.0  ixgben  Up            Down             0  Half    90:e2:ba:37:50:a8  1500  Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection
vmnic6   0000:18:00.1  ixgben  Up            Down             0  Half    90:e2:ba:37:50:a9  1500  Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection
vmnic7   0000:81:00.0  ixgben  Up            Up           10000  Full    90:e2:ba:1e:b6:24  1500  Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection
vmnic8   0000:81:00.1  ixgben  Up            Down             0  Half    90:e2:ba:1e:b6:25  1500  Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection
vmnic9   0000:d9:00.0  ixgben  Up            Up            1000  Full    a0:36:9f:d9:b9:10  1500  Intel(R) Ethernet Controller 10G X550
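""")

# A regex separator requires the Python parsing engine.
# .iloc[1:] drops the row of dashes that follows the header.
df = pd.read_csv(TESTDATA, sep=r"\s{2,}", engine="python").iloc[1:]
descriptions = df['Description'].tolist()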
And the output:

['Intel(R) Ethernet Connection X722 for 10GbE SFP+',
 'Intel(R) Ethernet Connection X722 for 10GbE SFP+',
 'Intel(R) Ethernet Controller 10G X550',
 'Intel(R) Ethernet Controller XXV710 for 25GbE SFP28',
 'Intel(R) Ethernet Controller XXV710 for 25GbE SFP28',
 'Intel Corporation Ethernet Connection (3) I219-LM',
 'Intel(R) Ethernet Connection X722 for 10GbE SFP+',
 'Intel(R) Ethernet Connection X722 for 10GbE SFP+',
 'Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection',
 'Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection',
 'Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection',
 'Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection',
 'Intel(R) Ethernet Controller 10G X550']
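
Both approaches so far depend on the separator being at least two spaces wide. A different technique, not taken from the answers here, is to read the column offset from the row of dashes and slice each row at that position. The sketch below assumes the header, dashes and data rows are aligned exactly as in the question; `SAMPLE` and `descriptions_by_offset` are hypothetical names used only for illustration.

```python
import re

# Shortened copy of the table from the question (two data rows only).
SAMPLE = """\
Name     PCI Device    Driver  Admin Status  Link Status  Speed  Duplex  MAC Address         MTU  Description
-------  ------------  ------  ------------  -----------  -----  ------  -----------------  ----  ----------------------------------------------------------------
vmnic0   0000:3d:00.0  i40en   Up            Down             0  Half    00:00:00:00:03:14  1500  Intel(R) Ethernet Connection X722 for 10GbE SFP+
vmnic2   0000:00:1f.6  ne1000  Up            Down             0  Half    88:88:88:88:87:88  1500  Intel Corporation Ethernet Connection (3) I219-LM
"""


def descriptions_by_offset(text):
    """Slice the last column using the dashed row as a ruler (hypothetical helper)."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ruler, rows = lines[1], lines[2:]
    # Each run of dashes marks one column; the last run is Description.
    start = [m.start() for m in re.finditer(r"-+", ruler)][-1]
    return [row[start:].strip() for row in rows]


by_offset = descriptions_by_offset(SAMPLE)
print(by_offset)
```

Because it slices by position rather than splitting on whitespace, this variant would keep a description intact even if it contained a double space, but it breaks if the rows are not aligned with the dashes.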

I think you can treat each line as a string:

>>> import re
>>> s = "vmnic0   0000:3d:00.0  i40en   Up            Down             0  Half    00:00:00:00:03:14  1500  Intel(R) Ethernet Connection X722 for 10GbE SFP+"
>>> row = re.split(r"\s{2,}", s)
>>> description = row[-1]

Could you explain the pattern you used in split?

It matches 2 or more whitespace characters.
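
To make the difference concrete, here is a small comparison of `\s{2,}` against a plain `str.split()`, using one row from the question. It is only an illustration: `\s` matches any whitespace character and `{2,}` means "two or more in a row", so the single spaces inside the description are not treated as separators.

```python
import re

line = "vmnic9   0000:d9:00.0  ixgben  Up            Up            1000  Full    a0:36:9f:d9:b9:10  1500  Intel(R) Ethernet Controller 10G X550"

# Split only on runs of two or more whitespace characters.
fields = re.split(r"\s{2,}", line)

# Split on any run of whitespace, which also breaks up the description.
naive = line.split()

print(len(fields), fields[-1])
print(len(naive), naive[-1])
```

With the regex, the row yields 10 fields and the last one is the full description; the plain split yields 14 fields because the description is cut into five words.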