C#: Programmatically get a screenshot of a page (tags: c#, screenshot, cutycapt, iecapt)


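I'm writing a specialized crawler and parser for internal use, and I need to be able to take screenshots of web pages in order to check which colors are used throughout them. The program will take in around ten web addresses and save them as bitmap images.

From there, I plan to use LockBits to build a list of the five most-used colors in each image. As far as I know, this is the easiest way to get at the colors used in a web page, but if there is an easier way, please chime in with your suggestions.

Anyway, I had been planning to use a commercial tool until I saw the price tag. I'm also fairly new to C#, having used it for only a few months. Is there a way to solve my problem of taking a screenshot of a web page in order to extract its color scheme?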
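
For readers following the LockBits plan described in the question, here is a minimal sketch of one way the color counting could look. It is not taken from any of the answers: the class name `DominantColors` and both method names are made up for illustration, and it assumes exact-match counting of 32-bit ARGB values is good enough for the purpose.

```csharp
using System;
using System.Collections.Generic;
using System.Drawing;
using System.Drawing.Imaging;
using System.Linq;
using System.Runtime.InteropServices;

// Hypothetical helper; the names are illustrative, not from the thread.
static class DominantColors
{
    // Copies every pixel of the bitmap into an int[] of 32-bit ARGB values.
    public static int[] ReadPixels(Bitmap bitmap)
    {
        int width = bitmap.Width;
        int height = bitmap.Height;
        Rectangle bounds = new Rectangle(0, 0, width, height);

        // Requesting Format32bppArgb gives one int per pixel regardless of
        // the bitmap's native pixel format.
        BitmapData data = bitmap.LockBits(
            bounds, ImageLockMode.ReadOnly, PixelFormat.Format32bppArgb);
        try
        {
            int[] pixels = new int[width * height];

            // Copy one scan line at a time so that the stride reported by
            // GDI+ is honored rather than assumed.
            for (int y = 0; y < height; y++)
            {
                IntPtr row = IntPtr.Add(data.Scan0, y * data.Stride);
                Marshal.Copy(row, pixels, y * width, width);
            }

            return pixels;
        }
        finally
        {
            bitmap.UnlockBits(data);
        }
    }

    // Returns the `count` most frequent colors, most frequent first.
    public static List<Color> TopColors(int[] argbPixels, int count)
    {
        var histogram = new Dictionary<int, int>();
        foreach (int argb in argbPixels)
        {
            int seen;
            histogram.TryGetValue(argb, out seen); // seen stays 0 if absent
            histogram[argb] = seen + 1;
        }

        return histogram
            .OrderByDescending(entry => entry.Value)
            .ThenBy(entry => entry.Key) // deterministic order for ties
            .Take(count)
            .Select(entry => Color.FromArgb(entry.Key))
            .ToList();
    }
}
```

With a loaded `Bitmap`, the five most common colors would then be `DominantColors.TopColors(DominantColors.ReadPixels(bitmap), 5)`. Anti-aliased text and gradients tend to produce many nearly identical colors, so it may be worth rounding each channel to a coarser value before counting.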

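A quick and dirty way is to use the WinForms WebBrowser control and draw it to a bitmap. Doing this in a standalone console application is slightly tricky, because you have to be aware of the implications of hosting a control while using a fundamentally asynchronous programming pattern. But here is a working proof of concept that captures a web page to an 800x600 BMP file: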

namespace WebBrowserScreenshotSample
{
    using System;
    using System.Drawing;
    using System.Drawing.Imaging;
    using System.Threading;
    using System.Windows.Forms;

    class Program
    {
        [STAThread]
        static void Main()
        {
            int width = 800;
            int height = 600;

            using (WebBrowser browser = new WebBrowser())
            {
                browser.Width = width;
                browser.Height = height;
                browser.ScrollBarsEnabled = true;

                // This will be called when the page finishes loading
                browser.DocumentCompleted += Program.OnDocumentCompleted;

                browser.Navigate("https://stackoverflow.com/");

                // This prevents the application from exiting until
                // Application.Exit is called
                Application.Run();
            }
        }

        static void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
        {
            // Now that the page is loaded, save it to a bitmap
            WebBrowser browser = (WebBrowser)sender;

            using (Graphics graphics = browser.CreateGraphics())
            using (Bitmap bitmap = new Bitmap(browser.Width, browser.Height, graphics))
            {
                Rectangle bounds = new Rectangle(0, 0, bitmap.Width, bitmap.Height);
                browser.DrawToBitmap(bitmap, bounds);
                bitmap.Save("screenshot.bmp", ImageFormat.Bmp);
            }

            // Instruct the application to exit
            Application.Exit();
        }
    }
}

To compile this, create a new console application and make sure to add assembly references for System.Drawing and System.Windows.Forms.

Update: I rewrote the code to avoid the hacky polling WaitOne/DoEvents pattern. This code should be closer to following best practices.

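Update 2: You indicated that you want to use this in a Windows Forms application. In that case, forget about dynamically creating the WebBrowser control. What you want is to create a hidden (Visible=false) instance of WebBrowser on your form and use it the same way as above. Here is another sample that shows the user-code portion of a form with a text box (webAddressTextBox), a button (generateScreenshotButton), and a hidden browser (webBrowser). While working on this, I discovered a quirk I hadn't dealt with before: the DocumentCompleted event can actually be raised multiple times, depending on the nature of the page. This sample should work in general, and you can extend it to do whatever you want: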

namespace WebBrowserScreenshotFormsSample
{
    using System;
    using System.Drawing;
    using System.Drawing.Imaging;
    using System.IO;
    using System.Windows.Forms;

    public partial class MainForm : Form
    {
        public MainForm()
        {
            this.InitializeComponent();

            // Register for this event; we'll save the screenshot when it fires
            this.webBrowser.DocumentCompleted += 
                new WebBrowserDocumentCompletedEventHandler(this.OnDocumentCompleted);
        }

        private void OnClickGenerateScreenshot(object sender, EventArgs e)
        {
            // Disable button to prevent multiple concurrent operations
            this.generateScreenshotButton.Enabled = false;

            string webAddressString = this.webAddressTextBox.Text;

            Uri webAddress;
            if (Uri.TryCreate(webAddressString, UriKind.Absolute, out webAddress))
            {
                this.webBrowser.Navigate(webAddress);
            }
            else
            {
                MessageBox.Show(
                    "Please enter a valid URI.",
                    "WebBrowser Screenshot Forms Sample",
                    MessageBoxButtons.OK,
                    MessageBoxIcon.Exclamation);

                // Re-enable button on error before returning
                this.generateScreenshotButton.Enabled = true;
            }
        }

        private void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
        {
            // This event can be raised multiple times depending on how much of the
            // document has loaded, if there are multiple frames, etc.
            // We only want the final page result, so we do the following check:
            if (this.webBrowser.ReadyState == WebBrowserReadyState.Complete &&
                e.Url == this.webBrowser.Url)
            {
                // Generate the file name here
                string screenshotFileName = Path.GetFullPath(
                    "screenshot_" + DateTime.Now.Ticks + ".png");

                this.SaveScreenshot(screenshotFileName);
                MessageBox.Show(
                    "Screenshot saved to '" + screenshotFileName + "'.",
                    "WebBrowser Screenshot Forms Sample",
                    MessageBoxButtons.OK,
                    MessageBoxIcon.Information);

                // Re-enable button before returning
                this.generateScreenshotButton.Enabled = true;
            }
        }

        private void SaveScreenshot(string fileName)
        {
            int width = this.webBrowser.Width;
            int height = this.webBrowser.Height;
            using (Graphics graphics = this.webBrowser.CreateGraphics())
            using (Bitmap bitmap = new Bitmap(width, height, graphics))
            {
                Rectangle bounds = new Rectangle(0, 0, width, height);
                this.webBrowser.DrawToBitmap(bitmap, bounds);
                bitmap.Save(fileName, ImageFormat.Png);
            }
        }
    }
}

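The linked service is the only free one I have been able to find recently.

You need to download the image's binary data using HttpWebRequest. See the service's URL for details.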

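// Requires: using System.Drawing; using System.IO; using System.Net;
HttpWebRequest request = (HttpWebRequest)WebRequest.Create("https://[url]");
Bitmap bitmap;
using (WebResponse response = request.GetResponse())
using (Stream stream = response.GetResponseStream())
using (Bitmap loaded = new Bitmap(stream))
{
    // A Bitmap created from a stream needs that stream to stay open, so copy
    // the pixels into an independent Bitmap before the stream is disposed
    bitmap = new Bitmap(loaded);
}
// now that you have a bitmap, you can do what you need to do...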

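Check out the linked tool. It seems to do what you need, and technically it approaches the problem in a very similar way, through the web browser control. It appears to accept a range of parameters and has good error handling built in. The only downside is that it is an external process (exe) that you spawn, and it creates a physical file that you read afterwards. From your description, you were even considering web services, so I don't think that is a problem. It would also be a perfect fit for your latest comment about how to process several pages at once: you can spawn 3, 4, 5 or more processes in parallel at any time, or run the color analysis in a thread while another capture process is running.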
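
As a rough illustration of the "spawn several capture processes in parallel" idea, here is a minimal sketch. Everything in it is an assumption rather than part of the tool being recommended: the class name `ParallelCapture`, both method names, and the `exePath` and `arguments` parameters are placeholders, and the real command-line options must come from the documentation of whichever capture tool you use.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper; the names are illustrative, not from the thread.
static class ParallelCapture
{
    // Runs `capture` once per URL, with at most `maxParallel` captures
    // in flight at any moment.
    public static async Task RunAllAsync(
        IEnumerable<string> urls, int maxParallel, Func<string, Task> capture)
    {
        using (var gate = new SemaphoreSlim(maxParallel))
        {
            List<Task> running = urls.Select(async url =>
            {
                await gate.WaitAsync();
                try { await capture(url); }
                finally { gate.Release(); }
            }).ToList();

            await Task.WhenAll(running);
        }
    }

    // Starts an external capture tool and completes with its exit code.
    // exePath and arguments are placeholders for a real tool's command line.
    public static Task<int> RunToolAsync(string exePath, string arguments)
    {
        return Task.Run(() =>
        {
            var startInfo = new ProcessStartInfo(exePath, arguments)
            {
                UseShellExecute = false,
                CreateNoWindow = true
            };

            using (Process process = Process.Start(startInfo))
            {
                process.WaitForExit();
                return process.ExitCode;
            }
        });
    }
}
```

A call might look like `ParallelCapture.RunAllAsync(urls, 4, url => ParallelCapture.RunToolAsync(toolPath, BuildArguments(url)))`, where `toolPath` and `BuildArguments` are again hypothetical. Once a process exits, the image file it wrote can be loaded and handed to the color analysis while the remaining captures are still running.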

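For the image processing, I recently came across the linked library. I haven't used it myself, but it looks appealing. It claims to be fast and has a lot of support for graphics analysis, including reading pixel colors. If I had any graphics-processing project at hand, I would give it a try.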

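You could also take a look at Qt Jambi.

It provides a nice WebKit-based browser implementation for Java, and you can take a screenshot simply by doing the following: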

    QPixmap pixmap;
    pixmap = QPixmap.grabWidget(browser);

    pixmap.save(writeTo, "png");

Take a look at the examples; they include a nice web browser demo.

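There is a great WebKit-based browser, PhantomJS, which lets you execute any JavaScript from the command line.

Install it and run the following sample script from the command line: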

./phantomjs ../examples/rasterize.js http://www.panoramio.com/photo/76188108 test.jpg

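It will create a screenshot of the given page in a JPEG file. The advantage of this approach is that you don't rely on any external provider and can easily automate taking screenshots in large quantities.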

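I used WebBrowser, and it did not work perfectly for me, especially when I needed to wait for JavaScript rendering to finish. I tried a few APIs and settled on Selenium. The most important thing about Selenium is that it does not require STAThread and can run in a simple console application as well as in services.

Give it a try: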

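using System.Drawing.Imaging;
using OpenQA.Selenium.Firefox;

class Program
{
    static void Main()
    {
        var driver = new FirefoxDriver();

        driver.Navigate()
            .GoToUrl("http://stackoverflow.com/");

        // Note: this ImageFormat overload comes from older Selenium releases;
        // newer ones changed the SaveAsFile signature
        driver.GetScreenshot()
            .SaveAsFile("stackoverflow.jpg", ImageFormat.Jpeg);

        driver.Quit();
    }
}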

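This question is quite old, but you can also use the linked NuGet package. It is free, uses a recent Gecko web browser (with HTML5 and CSS3 support), and ships as a single DLL.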

var screenshotJob = ScreenshotJobBuilder.Create("https://google.com")
    .SetBrowserSize(1366, 768)
    .SetCaptureZone(CaptureZone.FullPage)
    .SetTrigger(new WindowLoadTrigger());

System.Drawing.Image screenshot = screenshotJob.Freeze();

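I haven't tried it (which is why this is a comment and not an answer), but the linked project appears to be a C# solution for saving a web page as a bitmap.

How many pages do you crawl per month?

Not many. I'm only using the images as a means of extracting data, so it's no big deal if one or two fail.

So far I haven't run into any problems, other than needing to use Application.Run() to keep things going.

In that case, I've added an answer that I think will work well, since WebBrowser.DrawToBitmap is very unreliable.

I added another code sample to my answer below to demonstrate how to do this in a Windows Forms application.

Sorry for the huge delay. The code seems to work great, but I'm struggling to use it inside an existing form. I'm probably doing something stupid, but I'd appreciate it if you could give me a hand.

DrawToBitmap is not supported and will sometimes fail, leaving a blank black or white bitmap.

@bobbymcr - do you happen to know why the IE browser control renders pages with some styles applied incorrectly?

@Jenea: I can't say without seeing a specific example. It could depend on many factors...

@bobbymcr It turns out the problem was the IE control's security settings. If I disable IE ESC, the page renders correctly. If there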