Parse a string into another one in PHP?

Tags: php, regex, parsing, simple-html-dom

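I'm working with a web service that runs on an old IIS6 server and only speaks HTML, with no JSON and no XML. Once I get the HTML back, I need to parse the data out of it properly. The only problem is that the HTML is very messy and not well-formed.

This is the service I'm using

It returns HTML like this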

    <html xmlns="http://www.w3.org/1999/xhtml" xmlns:ino="http://namespaces.softwareag.com/tamino/response2" xmlns:xql="http://metalab.unc.edu/xql/" xmlns:xq="http://namespaces.softwareag.com/tamino/XQuery/result">

<head>
    <META http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
    <title>Firmenname=dedal , suche_nach=-, Rechtsform=, Sitz=, Sitz Gemeinde=, Firmennummer=, language=1, phonetisch=no</title>
</head>

<body>
    <font face="arial" size="2">
      <b>Suche nach Firma: <i>dedal </i></b>
      <br />
      <b>(10 Suchresultate am 03.12.2015 um 08:30) [Stand: 03.12.2015 235/2015]</b>
      <br />Zentraler Firmenindex - Eidgenössisches Amt für das Handelsregister<hr /><b>DEDAL FILMS, Albrecht</b><i> in <a target="_top" href="/info/ger/VS626.htm">Lens</a></i>, Einzelunt., <a target="result" href="/WebServices/Zefix/Zefix.asmx/ShowFirm?parId=1058678&amp;parChnr=CH-626.1.014.253-3&amp;language=1">+</a>, <a target="_blank" href="http://vs.powernet.ch/webservices/inet/HRG/HRG.asmx/getHRGHTML?chnr=6261014253&amp;amt=626&amp;toBeModified=0&amp;validOnly=0&amp;lang=1&amp;sort=0">CHE-150.481.375</a><p />DEDAL TRADING SA in liquidazione<i> in <a target="_top" href="/info/ger/TI501.htm">Mendrisio</a></i>, AG, gelöscht: Publ.Dat.  29.07.2005,
         <a target="result" href="/WebServices/Zefix/Zefix.asmx/ShowFirm?parId=537570&amp;parChnr=CH-524.3.009.149-2&amp;language=1">+</a>, <a target="_blank" href="http://ti.powernet.ch/webservices/inet/HRG/HRG.asmx/getHRGHTML?chnr=5243009149&amp;amt=501&amp;toBeModified=0&amp;validOnly=0&amp;lang=1&amp;sort=0">CHE-101.054.476</a>, <a target="_blank" href="http://ti.powernet.ch/webservices/inet/HRG/HRG.asmx/getHRGPDF?chnr=5243009149&amp;amt=501&amp;toBeModified=0&amp;validOnly=0&amp;lang=1&amp;sort=0">PDF</a><p /><b>DEDALE SA</b><i> in <a target="_top" href="/info/ger/VS626.htm">Chermignon</a></i>, AG, <a target="result" href="/WebServices/Zefix/Zefix.asmx/ShowFirm?parId=1139492&amp;parChnr=CH-626.3.014.970-6&amp;language=1">+</a>, <a target="_blank" href="http://vs.powernet.ch/webservices/inet/HRG/HRG.asmx/getHRGHTML?chnr=6263014970&amp;amt=626&amp;toBeModified=0&amp;validOnly=0&amp;lang=1&amp;sort=0">CHE-196.615.628</a><p /><b>Dedale Solutions, Putallaz &amp; Co</b><i> in <a target="_top" href="/info/ger/GE660.htm">Genève</a></i>, Kommanditgesell., <a target="result" href="/WebServices/Zefix/Zefix.asmx/ShowFirm?parId=1049329&amp;parChnr=CH-660.0.412.012-4&amp;language=1">+</a>, <a target="_blank" href="http://ge.ch/hrcintapp/externalCompanyReport.action?companyOfrcId13=CH-660-0412012-4&amp;ofrcLanguage=1">CHE-416.967.677</a><p />Dedalo Promotion Limited Liability Company, Cheyenne, Wyoming USA, succursale di Paradiso<i> in <a target="_top" href="/info/ger/TI501.htm">Paradiso</a></i>, Ausl. ZN, gelöscht: Publ.Dat.  25.05.2010,
         <a target="result" href="/WebServices/Zefix/Zefix.asmx/ShowFirm?parId=349506&amp;parChnr=CH-514.9.009.263-7&amp;language=1">+</a>, <a target="_blank" href="http://ti.powernet.ch/webservices/inet/HRG/HRG.asmx/getHRGHTML?chnr=5149009263&amp;amt=501&amp;toBeModified=0&amp;validOnly=0&amp;lang=1&amp;sort=0">CHE-104.147.677</a>, <a target="_blank" href="http://ti.powernet.ch/webservices/inet/HRG/HRG.asmx/getHRGPDF?chnr=5149009263&amp;amt=501&amp;toBeModified=0&amp;validOnly=0&amp;lang=1&amp;sort=0">PDF</a><p /><b>Dedalo SA</b><i> in <a target="_top" href="/info/ger/TI501.htm">Chiasso</a></i>, AG, <a target="result" href="/WebServices/Zefix/Zefix.asmx/ShowFirm?parId=1144906&amp;parChnr=CH-501.3.017.898-0&amp;language=1">+++</a>, <a target="_blank" href="http://ti.powernet.ch/webservices/inet/HRG/HRG.asmx/getHRGHTML?chnr=5013017898&amp;amt=501&amp;toBeModified=0&amp;validOnly=0&amp;lang=1&amp;sort=0">CHE-226.878.749</a>, <a target="_blank" href="http://ti.powernet.ch/webservices/inet/HRG/HRG.asmx/getHRGPDF?chnr=5013017898&amp;amt=501&amp;toBeModified=0&amp;validOnly=0&amp;lang=1&amp;sort=0">PDF</a><p /><b>Dedalos R&amp;D</b><i> in <a target="_top" href="/info/ger/TI501.htm">Bellinzona</a></i>, Verein, <a target="result" href="/WebServices/Zefix/Zefix.asmx/ShowFirm?parId=431256&amp;parChnr=CH-500.6.004.353-6&amp;language=1">+</a>, <a target="_blank" href="http://ti.powernet.ch/webservices/inet/HRG/HRG.asmx/getHRGHTML?chnr=5006004353&amp;amt=501&amp;toBeModified=0&amp;validOnly=0&amp;lang=1&amp;sort=0">CHE-104.771.605</a>, <a target="_blank" href="http://ti.powernet.ch/webservices/inet/HRG/HRG.asmx/getHRGPDF?chnr=5006004353&amp;amt=501&amp;toBeModified=0&amp;validOnly=0&amp;lang=1&amp;sort=0">PDF</a><p /><b>DEDALUS DIVERS Sagl</b><i> in <a target="_top" href="/info/ger/TI501.htm">Gordola</a></i>, GmbH, <a target="result" href="/WebServices/Zefix/Zefix.asmx/ShowFirm?parId=1107221&amp;parChnr=CH-501.4.016.642-1&amp;language=1">+</a>, <a target="_blank" 
href="http://ti.powernet.ch/webservices/inet/HRG/HRG.asmx/getHRGHTML?chnr=5014016642&amp;amt=501&amp;toBeModified=0&amp;validOnly=0&amp;lang=1&amp;sort=0">CHE-167.108.200</a>, <a target="_blank" href="http://ti.powernet.ch/webservices/inet/HRG/HRG.asmx/getHRGPDF?chnr=5014016642&amp;amt=501&amp;toBeModified=0&amp;validOnly=0&amp;lang=1&amp;sort=0">PDF</a><p /><b>Dedalus SA</b><i> in <a target="_top" href="/info/ger/TI501.htm">Breggia</a></i>, AG, <a target="result" href="/WebServices/Zefix/Zefix.asmx/ShowFirm?parId=404462&amp;parChnr=CH-524.3.006.007-5&amp;language=1">+++</a>, <a target="_blank" href="http://ti.powernet.ch/webservices/inet/HRG/HRG.asmx/getHRGHTML?chnr=5243006007&amp;amt=501&amp;toBeModified=0&amp;validOnly=0&amp;lang=1&amp;sort=0">CHE-106.145.979</a>, <a target="_blank" href="http://ti.powernet.ch/webservices/inet/HRG/HRG.asmx/getHRGPDF?chnr=5243006007&amp;amt=501&amp;toBeModified=0&amp;validOnly=0&amp;lang=1&amp;sort=0">PDF</a><p /><b>EDIL DEDALO S.A.G.L.</b><i> in <a target="_top" href="/info/ger/TI501.htm">Balerna</a></i>, GmbH, <a target="result" href="/WebServices/Zefix/Zefix.asmx/ShowFirm?parId=1150282&amp;parChnr=CH-501.4.017.854-1&amp;language=1">+++</a>, <a target="_blank" href="http://ti.powernet.ch/webservices/inet/HRG/HRG.asmx/getHRGHTML?chnr=5014017854&amp;amt=501&amp;toBeModified=0&amp;validOnly=0&amp;lang=1&amp;sort=0">CHE-232.905.567</a>, <a target="_blank" href="http://ti.powernet.ch/webservices/inet/HRG/HRG.asmx/getHRGPDF?chnr=5014017854&amp;amt=501&amp;toBeModified=0&amp;validOnly=0&amp;lang=1&amp;sort=0">PDF</a><p /><hr size="5" /></font>
    <script type="text/javascript">
        var _paq = _paq || [];
        _paq.push(['trackPageView']);
        _paq.push(['enableLinkTracking']);

        (function() {
            var u = (("https:" == document.location.protocol) ? "https" : "http") + "://www.e-service.admin.ch/analytics/";
            _paq.push(['setTrackerUrl', u + 'piwik.php']);
            _paq.push(['setSiteId', 4]);
            var d = document,
                g = d.createElement('script'),
                s = d.getElementsByTagName('script')[0];
            g.type = 'text/javascript';
            g.defer = true;
            g.async = true;
            g.src = u + 'piwik.js';
            s.parentNode.insertBefore(g, s);
        })();
    </script>
    <noscript>
        <p>
            <img src="http://www.e-service.admin.ch/analytics/piwik.php?idsite=4" style="border:0;" alt="" />
        </p>
    </noscript>
</body>

</html>
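But I don't need all of that data. What I'm using comes from

With it I get a result I can work with. The only remaining problem is parsing that string: I need to end up with HTML that carries the values like this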

<p>COMPANY NAME</p>
<a class="che" href="LINK CHE">CHE</a>
<a class="pdf" href="PDF LINK">PDF</a>

The problem is that sometimes there is no PDF link, and I don't know what to parse or how to parse it :(
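One way to cope with the missing PDF is to treat it as an optional field when the target markup is written out. The sketch below assumes each record has already been collected into an array with the hypothetical keys `name`, `che` and `pdf` (these names are not from the service), where `pdf` is `null` when the record has no PDF link.

```php
<?php
/**
 * Render one record in the target format from the question.
 * Assumed (hypothetical) shape: ['name' => string, 'che' => ?string, 'pdf' => ?string].
 */
function renderCompany(array $company): string
{
    // Escape values so characters such as "&" stay valid inside HTML.
    $esc = static fn (string $s): string => htmlspecialchars($s, ENT_QUOTES, 'UTF-8');

    $lines = ['<p>' . $esc($company['name']) . '</p>'];

    if (!empty($company['che'])) {
        $lines[] = '<a class="che" href="' . $esc($company['che']) . '">CHE</a>';
    }
    if (!empty($company['pdf'])) {
        // Only emitted when a PDF link was actually found for this record.
        $lines[] = '<a class="pdf" href="' . $esc($company['pdf']) . '">PDF</a>';
    }

    return implode("\n", $lines);
}

// Usage with the placeholder values from the question; this record has no PDF.
echo renderCompany([
    'name' => 'COMPANY NAME',
    'che'  => 'LINK CHE',
    'pdf'  => null,
]), "\n";
```

Keeping the PDF as a nullable field means the parsing step never has to guess: it records what it found, and the rendering step simply skips what is absent.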

Take a look. One more thing: whenever you need to parse HTML, you should (almost) never reach for regex. Please show some code.
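Following the advice to avoid regex, here is a minimal sketch that uses PHP's bundled DOM extension (`DOMDocument` and `DOMXPath`) rather than simple_html_dom. It rests on assumptions read off the sample response, not on any documented contract of the service: every record contains an `<i>` that wraps the town link, the company name sits immediately before that `<i>` (either in a `<b>` or as bare text), and the `CHE-...` and optional `PDF` links follow it until the next non-link element. The function name `extractCompanies`, the array keys and the `example.test` URLs are made up for illustration, and the sample is a shortened, ASCII-only version of the real page.

```php
<?php
/**
 * Extract one array per company: ['name' => string, 'che' => ?string, 'pdf' => ?string].
 * Relies on the record layout observed in the sample response (an assumption).
 */
function extractCompanies(string $html): array
{
    $previous = libxml_use_internal_errors(true); // collect markup warnings instead of printing them
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $companies = [];

    // Each record has an <i> that contains the town link.
    foreach ($xpath->query('//i[a]') as $town) {
        // Name: nearest preceding sibling with visible text (<b> or a bare text node).
        $name = '';
        for ($n = $town->previousSibling; $n !== null && $name === ''; $n = $n->previousSibling) {
            $name = trim($n->textContent);
        }

        $company = ['name' => $name, 'che' => null, 'pdf' => null];

        // Links: following siblings until the next non-link element ends the record.
        for ($n = $town->nextSibling; $n !== null; $n = $n->nextSibling) {
            if (!$n instanceof DOMElement) {
                continue; // plain text such as ", AG, "
            }
            if ($n->nodeName !== 'a') {
                break; // <p>, <b>, <i> or <hr>: this record is over
            }
            $label = trim($n->textContent);
            if (strpos($label, 'CHE-') === 0) {
                $company['che'] = $n->getAttribute('href');
            } elseif ($label === 'PDF') {
                $company['pdf'] = $n->getAttribute('href'); // stays null when absent
            }
        }

        $companies[] = $company;
    }

    return $companies;
}

// Shortened sample: the first record has no PDF link, the second has an unbolded name.
$sample = <<<'HTML'
<html><body><font face="arial" size="2">
<hr /><b>DEDAL FILMS, Albrecht</b><i> in <a href="/info/ger/VS626.htm">Lens</a></i>, Einzelunt.,
<a href="/show?parId=1">+</a>, <a href="http://example.test/che/1">CHE-150.481.375</a><p />
DEDAL TRADING SA in liquidazione<i> in <a href="/info/ger/TI501.htm">Mendrisio</a></i>, AG,
<a href="/show?parId=2">+</a>, <a href="http://example.test/che/2">CHE-101.054.476</a>,
<a href="http://example.test/pdf/2">PDF</a><p /><hr size="5" /></font></body></html>
HTML;

print_r(extractCompanies($sample));
```

Because the walk only looks at siblings of each `<i>`, it does not depend on how the parser decides to nest the stray `<p />` separators. The real response declares `charset=iso-8859-1`, so names with accented characters may need an encoding check before they are reused.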