Download a web page's source code to a file using JavaScript and Grunt

javascript node.js gruntjs

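I have a problem with a task. I am writing an application with GruntJS, and I need to download the source code of a web page from a Grunt task.

For example, I have a page:

example.com/index.html

I would like to give the URL in the Grunt task like this:

src: "example.com/index.html"

Then I need to save that source code to a file, for example:

source.txt

How can I do this?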

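There are a few ways to do this.

The first is a raw http.get, as mentioned in the comments, which comes from the Node.js API. This fetches the raw source that the server delivers on the initial page load. The problem arises when the site relies heavily on JavaScript to build further HTML after AJAX requests, because none of that generated markup is in the raw response.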
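A minimal sketch of this first approach, using only Node's built-in http and fs modules. The helper name downloadSource and its callback signature are illustrative assumptions, not part of Grunt or Node, and the sketch handles plain http:// URLs only (an https:// URL would need the https module instead).

```javascript
const http = require('http');
const fs = require('fs');

// Hypothetical helper: fetch `url` over plain HTTP, write the response
// body to the file `dest`, then call `done(err)`.
function downloadSource(url, dest, done) {
  http.get(url, function (res) {
    if (res.statusCode !== 200) {
      res.resume(); // discard the body so the socket is released
      done(new Error('Unexpected status code: ' + res.statusCode));
      return;
    }
    const file = fs.createWriteStream(dest);
    res.pipe(file); // stream the body straight into the file
    file.on('finish', function () { done(null); });
    file.on('error', done);
  }).on('error', done); // connection-level errors (DNS, refused, ...)
}
```

It could be called as downloadSource('http://example.com/index.html', 'source.txt', callback). Inside a Grunt task the callback matters, because Grunt has to be told to wait for it.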

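The second approach is to load the site in an actual browser engine, which executes any JavaScript that runs on page load and therefore also produces the HTML built afterwards. The most common engine for this is wrapped in a library for Grunt.

Fortunately, someone has built another layer on top of that which does almost exactly what you are asking for.

Example configuration from that project: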

grunt.initConfig({
  htmlSnapshot: {
    all: {
      options: {
        // That's the path where the snapshots should be placed.
        // It's empty by default, which means they will go into the
        // directory where your Gruntfile.js is placed.
        snapshotPath: 'snapshots/',
        // This should be either the base path to your index.html file
        // or your base URL. Currently the task does not use its own
        // webserver, so if your site needs a webserver to be fully
        // functional, configure it here.
        sitePath: 'http://localhost:8888/my-website/',
        // You can choose a prefix for your snapshots.
        // By default it's 'snapshot_'.
        fileNamePrefix: 'sp_',
        // By default the task waits 500 ms before fetching the HTML.
        // This gives the page enough time to assemble itself.
        // If your page needs more time, tweak it here.
        msWaitForPages: 1000,
        // If you would rather not keep the script tags in the HTML
        // snapshots, set `removeScripts` to true. It's false by default.
        removeScripts: true,
        // Here goes the list of all URLs that should be fetched.
        urls: [
          '',
          '#!/en-gb/showcase'
        ]
      }
    }
  }
});

Take a look.

I tried using this code, but nothing happens:

grunt.log.writeln(res);
http.get("http://www.google.com/index.html", function(res) {
  grunt.log.writeln("Got response: " + res.statusCode);
}).on('error', function(e) {
  grunt.log.writeln("Got error: " + e.message);
});

Do you get any errors on http.get?
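One possible reason nothing is printed, offered as a guess since the full Gruntfile is not shown: Grunt treats a task as finished as soon as its function returns, unless the task calls this.async(). Without that call, the process can exit before the http.get callback ever runs. The first line of the snippet also refers to res outside the callback, where it is not defined. The sketch below shows the asynchronous pattern. The task name fetch-source and the option names src and dest are illustrative assumptions, not something from the question.

```javascript
const http = require('http');

// Gruntfile.js sketch. The task name 'fetch-source' and the `src`/`dest`
// options are hypothetical names chosen for this example.
function gruntfile(grunt) {
  grunt.registerTask('fetch-source', "Save a page's raw HTML to a file", function () {
    const options = this.options({
      src: 'http://example.com/index.html',
      dest: 'source.txt'
    });
    // Tell Grunt to wait until done() is called.
    const done = this.async();

    http.get(options.src, function (res) {
      grunt.log.writeln('Got response: ' + res.statusCode);
      let body = '';
      res.setEncoding('utf8');
      res.on('data', function (chunk) { body += chunk; });
      res.on('end', function () {
        grunt.file.write(options.dest, body); // save the page source
        done();
      });
    }).on('error', function (e) {
      grunt.log.error('Got error: ' + e.message);
      done(false); // passing false marks the task as failed
    });
  });
}

module.exports = gruntfile;
```

With this in place, running grunt fetch-source would write the raw response body to source.txt. As with any plain http.get, markup that the page builds later with JavaScript is not included.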