C++ IE 11: ATL COM Browser Helper Object (add-on) for replacing text in the DOM


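I am trying to use a BHO (an Internet Explorer add-on) to remove one line of JavaScript from the DOM in IE11.

This is so poorly documented that it is hard to see the best way forward.

I have managed to write the BHO in C++ ATL/COM and it works fine, but I cannot work out the best way to remove or replace text in the body and then inject the change back into the page. To be honest, I have not had time to read a 1000-page outdated COM book :-)

This is what I currently have for the OnDocumentComplete event: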

void STDMETHODCALLTYPE CMyFooBHO::OnDocumentComplete(IDispatch *pDisp, VARIANT *pvarURL)
{
    BSTR bstrURL = pvarURL->bstrVal;

    if (_wcsicmp(bstrURL, ABOUT_BLANK) == 0)
    {
        return;
    }

    HRESULT hr = S_OK;

    // Query for the IWebBrowser2 interface.
    CComQIPtr<IWebBrowser2> spTempWebBrowser = pDisp;

    // Is this event associated with the top-level browser?
    if (spTempWebBrowser && m_spWebBrowser && m_spWebBrowser.IsEqualObject(spTempWebBrowser))
    {
        // Get the current document object from browser.
        CComPtr<IDispatch> spDispDoc;
        hr = m_spWebBrowser->get_Document(&spDispDoc);

        if (SUCCEEDED(hr))
        {
            // Verify that what we get is a pointer to a IHTMLDocument2 interface. 
            // To be sure, let's query for the IHTMLDocument2 interface (through smart pointers).

            CComQIPtr<IHTMLDocument2, &IID_IHTMLDocument2> spHTML;
            spHTML = spDispDoc;

            // Extract the source of the document if its HTML.
            if (spHTML)
            {
                // Get the BODY object.
                CComPtr<IHTMLElement> m_pBody;
                hr = spHTML->get_body(&m_pBody);

                if (SUCCEEDED(hr))
                {
                    // Get the HTML text.
                    // CComBSTR frees the returned string automatically.
                    CComBSTR bstrHTMLText;
                    hr = m_pBody->get_outerHTML(&bstrHTMLText);

                    if (SUCCEEDED(hr))
                    {
                        // bstrHTMLText now contains the <body> ...whatever... </body> of the html page.

                        // ******** HERE ********

                        // What I want to do here is replace some text contained in bstrHTMLText  
                        // i.e. Replace "ABC" with "DEF" if it exists in bstrHTMLText.

                        // Then replace the body of the original page with the edited bstrHTMLText.

                        // My actual goal is to remove one line of javascript.

                    }
                }
            }
        }
    }
}

Feel free to comment on any improvements to the existing code.

Here is another approach, using JavaScript:

var oCollection = document.getElementsByTagName("script");
var nColCount = oCollection.length;
var nIndex;
for ( nIndex = 0; nIndex < nColCount; ++nIndex ) {
    var oScript = oCollection[ nIndex ];
    var strScriptText = oScript.innerHTML;
    if ( strScriptText.indexOf( "alert(\"hello\");" ) != -1 ) {
        var strNewText = strScriptText.replace( "alert(\"hello\");", "" );
        var oNewScript = document.createElement("script");
        oNewScript.type = "text\/javascript";
        oNewScript.text = strNewText;
        document.getElementsByTagName("head")[0].appendChild(oNewScript);
        console.log ("DONE!");
    }
}

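This is not the normal way of doing it (the way it should be done).

If there is no better answer, then I guess this is the best one and I will go with it.

I would welcome any comments or updates that improve on it, or a working example that is better.

This is IE 11, compiled with C++ ATL/COM in Visual Studio 2015. I have tried iterating over the DOM and changing it, along with every other very poorly documented variant.

Reading the HTML never seems to be a problem, i.e. get_innerText, get_innerHTML and get_outerHTML in their various forms, but changing it mostly never seems to work. Why? Nobody seems able to tell me, or to show me a working example.

I found that get_body > get_innerHTML > put_innerHTML does seem to work.

So, to make use of that, I just wrote a function to search and replace within a CComBSTR.

This works for me, but if your needs are different I suppose you could take what is returned as the body's inner HTML and run some other DOM manipulation code over it (rather than the built-in code).

The main advantage of this approach is that it does not depend on c**p undocumented code that works in some mysterious way whenever MS wants it to.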
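The search-and-replace step described above can be tried out on its own, without any COM types. The following is only a minimal sketch: `ReplaceAll` and `ReplaceAllSelfCheck` are hypothetical names that are not part of the BHO, and the sample markup is a shortened stand-in for the test page's script.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the BHO): replaces every occurrence of
// `from` in `text` with `to` and returns the number of replacements made.
std::size_t ReplaceAll(std::wstring &text, const std::wstring &from, const std::wstring &to)
{
    if (from.empty())
    {
        return 0; // An empty search string would match at every position.
    }

    std::size_t count = 0;
    std::size_t pos = 0;

    while ((pos = text.find(from, pos)) != std::wstring::npos)
    {
        text.replace(pos, from.length(), to);
        pos += to.length(); // Continue after the inserted text.
        ++count;
    }

    return count;
}

// Small self-check using markup similar to the test page's script.
void ReplaceAllSelfCheck()
{
    std::wstring body = L"<script>function foo() { alert(\"hello\"); }</script>";

    const std::size_t replaced = ReplaceAll(body, L"alert(\"hello\");", L"");

    assert(replaced == 1);
    assert(body == L"<script>function foo() {  }</script>");
}
```

Returning the replacement count is a design choice for this sketch: a caller could skip writing the body back with put_innerHTML when the count is zero, so pages that do not contain the text are left untouched.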


This is the test HTML page. I am trying to remove the alert("hello"); that is executed when the page finishes loading.

<!doctype html>

  <head>
    <title>Site</title>

    <meta http-equiv="cache-control" content="max-age=0" />
    <meta http-equiv="cache-control" content="no-cache" />
    <meta http-equiv="expires" content="0" />
    <meta http-equiv="expires" content="Tue, 01 Jan 1980 1:00:00 GMT" />
    <meta http-equiv="pragma" content="no-cache" />

  </head>

  <body>

    <div>If a dialog with hello appears then the BHO failed</div>


    <script type="text/javascript">

      window.onload = function(){
        window.document.body.onload = foo; 
      };

      function foo()
      {
          alert("hello");
      }

    </script>

  </body>
<html>

// FooBHO.h : Declaration of the CFooBHO

#pragma once
#include "resource.h"       // main symbols

#include "FooIEAddOn_i.h"

#include <shlguid.h>        // IID_IWebBrowser2, DIID_DWebBrowserEvents2, etc.

#include <exdispid.h>       // DISPID_DOCUMENTCOMPLETE, etc.

#include <mshtml.h>         // DOM interfaces

#include <string> 

#if defined(_WIN32_WCE) && !defined(_CE_DCOM) && !defined(_CE_ALLOW_SINGLE_THREADED_OBJECTS_IN_MTA)
#error "Single-threaded COM objects are not properly supported on Windows CE platform, such as the Windows Mobile platforms that do not include full DCOM support. Define _CE_ALLOW_SINGLE_THREADED_OBJECTS_IN_MTA to force ATL to support creating single-thread COM object's and allow use of it's single-threaded COM object implementations. The threading model in your rgs file was set to 'Free' as that is the only threading model supported in non DCOM Windows CE platforms."
#endif

#define DISPID_DOCUMENTRELOAD 282

using namespace ATL;

using namespace std;

// CFooBHO
class ATL_NO_VTABLE CFooBHO : public CComObjectRootEx<CComSingleThreadModel>,
                                        public CComCoClass<CFooBHO, &CLSID_FooBHO>,
                                        public IObjectWithSiteImpl<CFooBHO>,
                                        public IDispatchImpl<IFooBHO, &IID_IFooBHO, &LIBID_FooIEAddOnLib, /*wMajor =*/ 1, /*wMinor =*/ 0>,
                                        public IDispEventImpl<1, CFooBHO, &DIID_DWebBrowserEvents2, &LIBID_SHDocVw, 1, 1>
{
    public:

        // Initialize m_fAdvised so SetSite(NULL) never reads an indeterminate value.
        CFooBHO() : m_fAdvised(FALSE)
        {
        }

        // The STDMETHOD macro is an ATL convention that marks the method as virtual and ensures that it has the right calling convention for the public
        // COM interface. It helps to demarcate COM interfaces from other public methods that may exist on the class. The STDMETHODIMP macro is likewise used
        // when implementing the member method.

        STDMETHOD(SetSite)(IUnknown *pUnkSite);

        DECLARE_REGISTRY_RESOURCEID(IDR_FooBHO)

        DECLARE_NOT_AGGREGATABLE(CFooBHO)

        BEGIN_COM_MAP(CFooBHO)
            COM_INTERFACE_ENTRY(IFooBHO)
            COM_INTERFACE_ENTRY(IDispatch)
            COM_INTERFACE_ENTRY(IObjectWithSite)
        END_COM_MAP()

        DECLARE_PROTECT_FINAL_CONSTRUCT()

        BEGIN_SINK_MAP(CFooBHO)
            SINK_ENTRY_EX(1, DIID_DWebBrowserEvents2, DISPID_DOCUMENTCOMPLETE, OnDocumentComplete)
        END_SINK_MAP()

        void STDMETHODCALLTYPE OnDocumentComplete(IDispatch *pDisp, VARIANT *pvarURL);

        HRESULT FinalConstruct()
        {
            return S_OK;
        }

        void FinalRelease()
        {
        }

    private:

        CComPtr<IWebBrowser2>  m_spWebBrowser;

        BOOL m_fAdvised;

        static const wchar_t* ABOUT_BLANK;

        // No class qualifier here: an in-class declaration must use the unqualified name.
        void ReplaceInCComBSTR(CComBSTR &strInput, const wstring &strOld, const wstring &strNew);
};

OBJECT_ENTRY_AUTO(__uuidof(FooBHO), CFooBHO)

// FooBHO.cpp : Implementation of CFooBHO

#include "stdafx.h"
#include "FooBHO.h"
#include "Strsafe.h"

const wchar_t* CFooBHO::ABOUT_BLANK = L"about:blank";

// The SetSite() method is where the BHO is initialized and where you would perform all the tasks that happen only 
// once. When you navigate to a URL with Internet Explorer, you should wait for a couple of events to make sure the
// required document has been completely downloaded and then initialized. Only at this point can you safely access 
// its content through the exposed object model, if any.

STDMETHODIMP CFooBHO::SetSite(IUnknown* pUnkSite)
{
    if (pUnkSite != NULL)
    {
        // Cache the pointer to IWebBrowser2.
        HRESULT hr = pUnkSite->QueryInterface(IID_IWebBrowser2, (void **)&m_spWebBrowser);

        if (SUCCEEDED(hr))
        {
            // Register to sink events from DWebBrowserEvents2.
            hr = DispEventAdvise(m_spWebBrowser);
            if (SUCCEEDED(hr))
            {
                m_fAdvised = TRUE;
            }
        }
    }
    else
    {
        // Unregister event sink.
        if (m_fAdvised)
        {
            DispEventUnadvise(m_spWebBrowser);
            m_fAdvised = FALSE;
        }

        // Release cached pointers and other resources here.
        m_spWebBrowser.Release();
    }

    // Call base class implementation.
    return IObjectWithSiteImpl<CFooBHO>::SetSite(pUnkSite);
}

void STDMETHODCALLTYPE CFooBHO::OnDocumentComplete(IDispatch *pDisp, VARIANT *pvarURL)
{
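    // Guard against a missing or non-string URL before comparing.
    if (pvarURL == NULL || pvarURL->vt != VT_BSTR || pvarURL->bstrVal == NULL)
    {
        return;
    }

    BSTR bstrURL = pvarURL->bstrVal;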

    // Test for any specific URL here. 
    // Currently we are ignoring ABOUT:BLANK but allowing everything else.

    if (_wcsicmp(bstrURL, ABOUT_BLANK) == 0)
    {
        return;
    }

    HRESULT hr = S_OK;

    // Query for the IWebBrowser2 interface.
    CComQIPtr<IWebBrowser2> spTempWebBrowser = pDisp;

    // Is this event associated with the top-level browser?
    if (spTempWebBrowser && m_spWebBrowser && m_spWebBrowser.IsEqualObject(spTempWebBrowser))
    {
        // Get the current document object from browser.
        CComPtr<IDispatch> spDispDoc;

        if (SUCCEEDED(m_spWebBrowser->get_Document(&spDispDoc)))
        {
            // Verify that what we get is a pointer to a IHTMLDocument2 interface. 
            // To be sure, let's query for the IHTMLDocument2 interface (through smart pointers).

            CComQIPtr<IHTMLDocument2, &IID_IHTMLDocument2> spHTMLDocument2 = spDispDoc;

            // Extract the source of the document if its HTML.
            if (spHTMLDocument2)
            {
                // Get the BODY object.
                CComPtr<IHTMLElement> spBody;

                if (SUCCEEDED(spHTMLDocument2->get_body(&spBody)))
                {
                    // Get the Body HTML text.
                    CComBSTR bstrBodyHTMLText;

                    if (SUCCEEDED(spBody->get_innerHTML(&bstrBodyHTMLText)))
                    {
                        ReplaceInCComBSTR(bstrBodyHTMLText, L"alert(\"hello\");", L"");

                        spBody->put_innerHTML(bstrBodyHTMLText);
                    }
                }
            }
        }
    }
}

void CFooBHO::ReplaceInCComBSTR(CComBSTR &bstrInput, const wstring &strOld, const wstring &strNew)
{
    // Nothing to do for a null BSTR or an empty search string.
    if (!bstrInput || strOld.empty())
    {
        return;
    }

    wstring strOutput(bstrInput.m_str, bstrInput.Length());

    size_t iPos = 0;

    while ((iPos = strOutput.find(strOld, iPos)) != wstring::npos)
    {
        strOutput.replace(iPos, strOld.length(), strNew);

        // Skip past the inserted text so it is never re-scanned.
        iPos += strNew.length();
    }

    // Find and replace is complete; the assignment frees the old BSTR and allocates the new one.
    bstrInput = strOutput.c_str();
}