
How do I scrape a string from this to that from a web page's source?


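I have looked all over PHP.net, but I can't work out whether PHP has a function, or a set of functions, that can grab everything from one string through to another.

For example, with what I have so far, I want to grab everything from wgCategories to wgMonthNamesShort out of the web page stored in $html.

Finally, note that everything from wgCategories to wgMonthNamesShort is stored between tags. I'm not sure whether that matters, but I was told it was worth mentioning.

Let me know if anything needs clarifying.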

You can use preg_match with the s flag (DOTALL) to get the string between two keywords:

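error_reporting(E_ALL);
// Fetch the page source.
$html = file_get_contents('http://en.wikipedia.org/wiki/Los_Angeles');
// s: "." also matches newlines; i: case-insensitive; ".*?" stops at the first wgMonthNamesShort.
if (preg_match('/wgCategories.*?wgMonthNamesShort/is', $html, $matches)) {
    echo $matches[0];
}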
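
The code above prints:

wgCategories":["All articles with dead external links","Articles with dead external links from March 2013","Articles with dead external links from March 2014","Pages with broken reference names","Articles with dead external links from January 2014","Articles with dead external links from September 2011","Articles with dead external links from October 2011","CS1 errors: dates","Use mdy dates from May 2014","Wikipedia indefinitely semi-protected pages","Wikipedia indefinitely move-protected pages","Coordinates on Wikidata","Articles including recorded pronunciations","Articles containing Spanish-language text","All articles with unsourced statements","Articles with unsourced statements from December 2013","Spoken articles","Articles with hAudio microformats","Los Angeles, California","Cities in Los Angeles County, California","Communities on U.S. Route 66","County seats in California","Incorporated cities and towns in California","Populated coastal places in California","Populated places established in 1781","Port cities and towns of the United States Pacific coast","Butterfield Overland Mail in California","Stockton - Los Angeles Road"],"wgBreakFrames":false,"wgPageContentLanguage":"en","wgPageContentModel":"wikitext","wgSeparatorTransformTable":["",""],"wgDigitTransformTable":["",""],"wgDefaultDateFormat":"dmy","wgMonthNames":["","January","February","March","April","May","June","July","August","September","October","November","December"],"wgMonthNamesShort

You can also avoid regular expressions and do this with PHP string functions such as stristr.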
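
The stristr suggestion in this thread comes without code, so here is a minimal sketch of how it might look. The helper name slice_between and the shortened sample string are assumptions made for illustration and are not from the original answers. stristr only finds the start of the slice, so the sketch pairs it with stripos and substr to cut at the end marker.

```php
<?php
// Hypothetical helper (not from the original answers): returns the part of
// $haystack from the first $start through the first $end that follows it,
// matching case-insensitively, or null if either marker is missing.
function slice_between(string $haystack, string $start, string $end): ?string
{
    // stristr() returns the rest of the string beginning at the first
    // case-insensitive match of $start, or false if there is none.
    $tail = stristr($haystack, $start);
    if ($tail === false) {
        return null;
    }
    // Search for $end only after the start marker so the two cannot overlap.
    $endPos = stripos($tail, $end, strlen($start));
    if ($endPos === false) {
        return null;
    }
    // Keep everything up to and including the end marker.
    return substr($tail, 0, $endPos + strlen($end));
}

// Shortened, made-up stand-in for the page source held in $html.
$html = '<script>mw.config.set({"wgCategories":["Los Angeles, California"],"wgMonthNamesShort":["","Jan"]});</script>';
echo slice_between($html, 'wgCategories', 'wgMonthNamesShort'), "\n";
// prints: wgCategories":["Los Angeles, California"],"wgMonthNamesShort
```

Compared with the regex, this version needs no escaping of the markers, at the cost of a few more lines.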


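Thanks for the help, @anubhava. Slightly off-topic, but do you know whether preg_match can stop at the second occurrence of wgMonthNamesShort instead of the first?

Well, that makes the regex more interesting. To match through the second instance, you would use: '/wgCategories.*?wgMonthNamesShort.*?wgMonthNamesShort/is'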
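
The second-occurrence question from the comments can be tried offline on a small sample. The sketch below generalises the idea to the n-th occurrence by repeating a lazy group. The helper name match_through_nth and the sample string are assumptions for illustration, not part of the original thread.

```php
<?php
// Hypothetical helper (not from the original thread): match from the first
// $start through the $n-th occurrence of $end that follows it.
function match_through_nth(string $subject, string $start, string $end, int $n): ?string
{
    if ($n < 1) {
        return null;
    }
    // preg_quote() escapes regex metacharacters in the literal markers.
    $s = preg_quote($start, '/');
    $e = preg_quote($end, '/');
    // Each lazy (?:.*?END) group consumes up to the next end marker;
    // repeating it $n times stops at the $n-th one.
    $pattern = '/' . $s . '(?:.*?' . $e . '){' . $n . '}/is';
    return preg_match($pattern, $subject, $m) === 1 ? $m[0] : null;
}

// Made-up sample in which the end marker appears twice.
$sample = 'x wgCategories A wgMonthNamesShort B wgMonthNamesShort C';
echo match_through_nth($sample, 'wgCategories', 'wgMonthNamesShort', 2), "\n";
// prints: wgCategories A wgMonthNamesShort B wgMonthNamesShort
```

If the page contains fewer than n end markers after the start marker, the helper returns null rather than a partial match.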
$string = 'wgCategories":["All articles with dead external links","Articles with dead external links from March 2013","Articles with dead external links from March 2014","Pages with broken reference names","Articles with dead external links from January 2014","Articles with dead external links from September 2011","Articles with dead external links from October 2011","CS1 errors: dates","Use mdy dates from May 2014","Wikipedia indefinitely semi-protected pages","Wikipedia indefinitely move-protected pages","Coordinates on Wikidata","Articles including recorded pronunciations","Articles containing Spanish-language text","All articles with unsourced statements","Articles with unsourced statements from December 2013","Spoken articles","Articles with hAudio microformats","Los Angeles, California","Cities in Los Angeles County, California","Communities on U.S. Route 66","County seats in California","Incorporated cities and towns in California","Populated coastal places in California","Populated places established in 1781","Port cities and towns of the United States Pacific coast","Butterfield Overland Mail in California","Stockton - Los Angeles Road"],"wgBreakFrames":false,"wgPageContentLanguage":"en","wgPageContentModel":"wikitext","wgSeparatorTransformTable":["",""],"wgDigitTransformTable":["",""],"wgDefaultDateFormat":"dmy","wgMonthNames":["","January","February","March","April","May","June","July","August","September","October","November","December"],"wgMonthNamesShort';