Python BeautifulSoup: finding results with a condition
How do I parse this with a condition? I have a SOAP response and only need to print the COMPONENT_ID of the entries whose TYPE is 1.
<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope
xmlns:soap="http://www.w3.org/2003/05/soap-envelope"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<soap:Body>
<GetTerritoriesHierarhyResponse
xmlns="http://parsec.ru/Parsec3IntergationService">
<GetTerritoriesHierarhyResult>
<Territory xsi:type="TerritoryWithComponent">
<ID>13c80b2d-d9d3-47cd-9c11-f80597b61e74</ID>
<TYPE>0</TYPE>
<NAME>OFFLINE</NAME>
<PARENT_ID>88ef0e32-3b6f-467c-a0ec-0733317f6757</PARENT_ID>
<COMPONENT_ID>13c80b2d-d9d3-47cd-9c11-f80597b61e74</COMPONENT_ID>
<FEATURE_MASK>0</FEATURE_MASK>
</Territory>
<Territory xsi:type="TerritoryWithComponent">
<ID>7d432ebb-6199-44c5-b67b-4671718e6e3c</ID>
<TYPE>1</TYPE>
<NAME>PREO</NAME>
<PARENT_ID>88ef0e32-3b6f-467c-a0ec-0733317f6757</PARENT_ID>
<COMPONENT_ID>7d432ebb-6199-44c5-b67b-4671718e6e3c</COMPONENT_ID>
<FEATURE_MASK>0</FEATURE_MASK>
</Territory>
</GetTerritoriesHierarhyResult>
</GetTerritoriesHierarhyResponse>
</soap:Body>
</soap:Envelope>
Response:
13c80b2d-d9d3-47cd-9c11-f80597b61e740OFFLINE88ef0e32-3b6f-467c-a0ec-0733317f675713c80b2d-d9d3-47cd-9c11-f80597b61e740
7d432ebb-6199-44c5-b67b-4671718e6e3c0PREO88ef0e32-3b6f-467c-a0ec-0733317f67577d432ebb-6199-44c5-b67b-4671718e6e3c0
I think I need something like this:
for i in soup.find_all('Territory'):
    if type = 1 print component_id
You can iterate and check with an if statement, as you suggested.
Given:
xml = '''<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope
xmlns:soap="http://www.w3.org/2003/05/soap-envelope"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<soap:Body>
<GetTerritoriesHierarhyResponse
xmlns="http://parsec.ru/Parsec3IntergationService">
<GetTerritoriesHierarhyResult>
<Territory xsi:type="TerritoryWithComponent">
<ID>13c80b2d-d9d3-47cd-9c11-f80597b61e74</ID>
<TYPE>0</TYPE>
<NAME>OFFLINE</NAME>
<PARENT_ID>88ef0e32-3b6f-467c-a0ec-0733317f6757</PARENT_ID>
<COMPONENT_ID>13c80b2d-d9d3-47cd-9c11-f80597b61e74</COMPONENT_ID>
<FEATURE_MASK>0</FEATURE_MASK>
</Territory>
<Territory xsi:type="TerritoryWithComponent">
<ID>7d432ebb-6199-44c5-b67b-4671718e6e3c</ID>
<TYPE>1</TYPE>
<NAME>PREO</NAME>
<PARENT_ID>88ef0e32-3b6f-467c-a0ec-0733317f6757</PARENT_ID>
<COMPONENT_ID>7d432ebb-6199-44c5-b67b-4671718e6e3c</COMPONENT_ID>
<FEATURE_MASK>0</FEATURE_MASK>
</Territory>
</GetTerritoriesHierarhyResult>
</GetTerritoriesHierarhyResponse>
</soap:Body>
</soap:Envelope>'''
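or, if the XML comes from an HTTP response (response is assumed to be an existing response object in your own code):
xml = response.content
the code is:
from bs4 import BeautifulSoup

soup = BeautifulSoup(xml, 'xml')
for i in soup.find_all('Territory'):
    # TYPE holds text, so compare against the string '1'
    if i.select('TYPE')[0].text == '1':
        print(i.select('COMPONENT_ID')[0].text)
Output:
7d432ebb-6199-44c5-b67b-4671718e6e3c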
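The loop above indexes select(...)[0], which raises IndexError when a Territory has no TYPE or COMPONENT_ID child. Below is a minimal sketch of a more defensive variant, assuming the 'xml' parser (which needs lxml) is available. The function name component_ids_of_type and the trimmed sample document are made up for illustration and are not part of the original answer.

```python
from bs4 import BeautifulSoup

# Trimmed, hypothetical stand-in for the SOAP response shown above.
SAMPLE = """<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope">
  <soap:Body>
    <GetTerritoriesHierarhyResult xmlns="http://parsec.ru/Parsec3IntergationService">
      <Territory><TYPE>0</TYPE><COMPONENT_ID>13c80b2d-d9d3-47cd-9c11-f80597b61e74</COMPONENT_ID></Territory>
      <Territory><TYPE>1</TYPE><COMPONENT_ID>7d432ebb-6199-44c5-b67b-4671718e6e3c</COMPONENT_ID></Territory>
      <Territory><NAME>entry without TYPE or COMPONENT_ID</NAME></Territory>
    </GetTerritoriesHierarhyResult>
  </soap:Body>
</soap:Envelope>"""


def component_ids_of_type(xml, wanted_type="1"):
    """Return the COMPONENT_ID text of each Territory whose TYPE equals wanted_type."""
    soup = BeautifulSoup(xml, "xml")
    ids = []
    for territory in soup.find_all("Territory"):
        type_tag = territory.find("TYPE")
        id_tag = territory.find("COMPONENT_ID")
        # find() returns None for a missing child, so incomplete entries are skipped.
        if type_tag is None or id_tag is None:
            continue
        if type_tag.get_text(strip=True) == wanted_type:
            ids.append(id_tag.get_text(strip=True))
    return ids


if __name__ == "__main__":
    print(component_ids_of_type(SAMPLE))
```

Returning a list instead of printing inside the loop keeps the filtering reusable; the caller decides whether to print, log, or pass the IDs on.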
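With bs4 4.7.1 you can use the :has and :contains pseudo-classes to test for a match. If nothing matches, you get an empty list. Specify xml as the parser. I think this reads more concisely.
The key lines: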
soup = bs(xml, 'xml')
results = [item.select_one('COMPONENT_ID').text for item in soup.select('Territory:has(TYPE:contains("1"))')]
print(results)
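In full:
from bs4 import BeautifulSoup as bs

xml = '''<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope
xmlns:soap="http://www.w3.org/2003/05/soap-envelope"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<soap:Body>
<GetTerritoriesHierarhyResponse
xmlns="http://parsec.ru/Parsec3IntergationService">
<GetTerritoriesHierarhyResult>
<Territory xsi:type="TerritoryWithComponent">
<ID>13c80b2d-d9d3-47cd-9c11-f80597b61e74</ID>
<TYPE>0</TYPE>
<NAME>OFFLINE</NAME>
<PARENT_ID>88ef0e32-3b6f-467c-a0ec-0733317f6757</PARENT_ID>
<COMPONENT_ID>13c80b2d-d9d3-47cd-9c11-f80597b61e74</COMPONENT_ID>
<FEATURE_MASK>0</FEATURE_MASK>
</Territory>
<Territory xsi:type="TerritoryWithComponent">
<ID>7d432ebb-6199-44c5-b67b-4671718e6e3c</ID>
<TYPE>1</TYPE>
<NAME>PREO</NAME>
<PARENT_ID>88ef0e32-3b6f-467c-a0ec-0733317f6757</PARENT_ID>
<COMPONENT_ID>7d432ebb-6199-44c5-b67b-4671718e6e3c</COMPONENT_ID>
<FEATURE_MASK>0</FEATURE_MASK>
</Territory>
</GetTerritoriesHierarhyResult>
</GetTerritoriesHierarhyResponse>
</soap:Body>
</soap:Envelope>'''
soup = bs(xml, 'xml')
results = [item.select_one('COMPONENT_ID').text for item in soup.select('Territory:has(TYPE:contains("1"))')]
print(results)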
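One caveat with the selector approach: :contains matches substrings, so TYPE:contains("1") would also select a Territory whose TYPE is 10 or 21. As far as I know, newer Soup Sieve releases also prefer the spelling :-soup-contains and deprecate plain :contains. Below is a sketch of an exact-match alternative that passes a function to find_all and so depends on neither spelling. The helper name has_type and the sample data are hypothetical, and the 'xml' parser is assumed to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical sample: TYPE 10 is included to show the substring pitfall.
SAMPLE = """<Result>
  <Territory><TYPE>1</TYPE><COMPONENT_ID>aaa</COMPONENT_ID></Territory>
  <Territory><TYPE>10</TYPE><COMPONENT_ID>bbb</COMPONENT_ID></Territory>
  <Territory><TYPE>0</TYPE><COMPONENT_ID>ccc</COMPONENT_ID></Territory>
</Result>"""


def has_type(wanted):
    """Build a find_all() filter for Territory tags whose TYPE text equals wanted exactly."""
    def matcher(tag):
        if tag.name != "Territory":
            return False
        # Only look at direct children, not TYPE tags nested deeper.
        type_tag = tag.find("TYPE", recursive=False)
        return type_tag is not None and type_tag.get_text(strip=True) == wanted
    return matcher


soup = BeautifulSoup(SAMPLE, "xml")

# Assumes every matched Territory has a COMPONENT_ID child, as in this sample.
exact = [t.find("COMPONENT_ID").get_text(strip=True)
         for t in soup.find_all(has_type("1"))]
print(exact)
```

With this sample, only the first Territory is kept; the one with TYPE 10 is skipped because the comparison is on the whole text rather than a substring.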