XPath: scraping the post count from an Instagram hashtag page

Tags: xpath, google-apps-script, web-scraping, google-sheets, instagram

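I'm trying to get the number of posts for a given hashtag (#castles) and use ImportXML to populate a Google Sheets cell with it.

I copied the XPath from Chrome and pasted it into the ImportXML call in the cell, like this: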

=ImportXML("https://www.instagram.com/explore/tags/castels/", "//*[@id="react-root"]/section/main/header/div[2]/div/div[2]/span/span")
I noticed that the nested double quotes were a problem, so I also tried:

=ImportXML("https://www.instagram.com/explore/tags/castels/", "//*[@id='react-root']/section/main/header/div[2]/div/div[2]/span/span")
However, both return an error.

What am I doing wrong?

Also, I know the XPath for the meta description tag, "//meta[@name='description']/@content", but I want the exact number of posts rather than the abbreviated figure shown there.

Try this:

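function hashCount() {
  // UrlFetchApp needs an absolute URL, including the scheme.
  var url = 'https://www.instagram.com/explore/tags/cats/';
  var response = UrlFetchApp.fetch(url, {muteHttpExceptions: true}).getContentText();
  // Capture the digits after "edge_hashtag_to_media":{"count": in the embedded JSON.
  var regex = /"edge_hashtag_to_media":\{"count":(\d+),"page_info":/;
  var match = regex.exec(response);
  // exec() returns null if the pattern is not found, so guard before indexing.
  var count = match ? match[1] : null;
  Logger.log(count);
  return count;
}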
I added muteHttpExceptions: true, which was not in the version from my earlier comment. Hope this helps.
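One comment in this thread reports getting an empty value back, so it may help to separate the parsing from the fetching. The following is only a sketch. It assumes the page source still embeds the "edge_hashtag_to_media":{"count":...} JSON that the answer relies on, which Instagram may change or hide behind a login. The names extractHashtagCount and HASHCOUNT are hypothetical, and the sample string is made up for illustration.

```javascript
// Hypothetical helper: pull the post count out of the raw page source.
function extractHashtagCount(html) {
  // Look for "edge_hashtag_to_media":{"count":<digits> in the embedded JSON.
  var match = /"edge_hashtag_to_media":\{"count":(\d+)/.exec(html);
  // Return null instead of throwing when the pattern is absent.
  return match ? parseInt(match[1], 10) : null;
}

// Hypothetical custom function for a sheet cell, e.g. =HASHCOUNT("cats").
// It only runs inside Apps Script, where UrlFetchApp is defined.
function HASHCOUNT(tag) {
  var url = 'https://www.instagram.com/explore/tags/' + encodeURIComponent(tag) + '/';
  var html = UrlFetchApp.fetch(url, {muteHttpExceptions: true}).getContentText();
  var count = extractHashtagCount(html);
  return count === null ? 'count not found' : count;
}

// Offline check against a made-up fragment of page source.
var sample = '{"hashtag":{"name":"cats","edge_hashtag_to_media":{"count":12345,"page_info":{}}}}';
console.log(extractHashtagCount(sample));          // 12345
console.log(extractHashtagCount('<html></html>')); // null
```

If HASHCOUNT returns 'count not found', logging the first few hundred characters of the fetched text is a quick way to see whether a login or error page came back instead of the hashtag page.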

Is an Apps Script based solution acceptable, or do you want to achieve this using only the =IMPORTXML function?

I worked out a formula, but it fails with a "result too large" warning:

=REGEXEXTRACT(ImportXML("https://www.instagram.com/explore/tags/cats/","//body/script[1]"),"edge_hashtag_to_media[[:punct:]][[:punct:]][[:punct:]]count[[:punct:]][[:punct:]](\d+)\,[[:punct:]]page_info[[:punct:]]")
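Edit: the formula above does not work.

I'm curious... how would the Apps Script version work?

Here you go. This is of course just a sample implementation:

function hashCount() { var url = ''; var response = UrlFetchApp.fetch(url).getContentText(); var regex = /(edge_hashtag_to_media":{"count":)(\d+)(,"page_info":)/gm; var count = regex.exec(response)[2]; Logger.log(count); }

It returns an empty value :-(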