Regex to extract the first 3 matching instances from HTML using PHP
I have a complete HTML file in a PHP variable and want to extract the first 3 values in the HTML that have the form q?s=XXX, q?s=XX, or q?s=XXXX (where X is a stock symbol). The $html variable contains:
<a name='mkt-movers' class='anchor'><\/a><h2 class='Fz-l Fw-200 Mend-4 D-i'>Market Movers<\/h2><\/div><div class=\"bd\">\t<div class=\"dropdown rapid-nf Fw-200 Bdrs\">\n <form class=\"SelectBox SelectBoxNoBorder\">\n <div class=\"SelectBox-Pick\">\n <span class=\"SelectBox-Text\">U.S. Composite<\/span>\n\t\t <i class='Icon'><\/i>\n <\/div>\n\n <select data-plugin=\"selectbox\" class='Start-0' name='selectBox' >\n\t\t <option value=\"0\" selected=\"selected\" class=\"Selected\">U.S. Composite<\/option><option value=\"1\" >Nasdaq<\/option><option value=\"2\" >NYSE Market<\/option><option value=\"3\" >NYSE<\/option>\n <\/select>\n <noscript>\n <Btn type=\"submit\" class=\"Hidden\">Select<\/Btn>\n <\/noscript>\n <\/form>\n\t<\/div><div class=\"content\"><div class=\"mod-85ac7b2b-640f-323f-a1c1-00b2f4865d18 mod active\"><div id=\"mod_85ac7b2b_640f_323f_a1c1_00b2f4865d18\" class=\"yom-mod yom-app yom-data yfi-table wp yfi-mmovers fin-glass-disabled\">\n\t<a name=\"mkt-movers\" class=\"anchor\"><\/a>\n <div class=\"hd\">\n <h2 class=\"Fw-200 Fz-l M-0\"><\/h2>\n <\/div>\n <div class=\"bd yom-tabview\">\n <ul role=\"tablist\" data-plugin='tabpanel' class='FinTabs Mb-10'>\n <li class=\"Grid-U Mend-8 FinTab-Item Selected rmp-0\" role=\"presentation\">\n <a href=\"#mod_85ac7b2b_640f_323f_a1c1_00b2f4865d18-tab1\" role = \"tab\" class = \"FinTab-Label no-pjax\" data-tabpanel-target = \"#mod_85ac7b2b_640f_323f_a1c1_00b2f4865d18-tab1\" >Most Actives<\/a>\n <\/li>\n <li class=\"Grid-U Mend-8 FinTab-Item rmp-0\" role=\"presentation\">\n <a href=\"#mod_85ac7b2b_640f_323f_a1c1_00b2f4865d18-tab2\" role = \"tab\" class = \"FinTab-Label no-pjax\" data-tabpanel-target = \"#mod_85ac7b2b_640f_323f_a1c1_00b2f4865d18-tab2\" >% Gainers<\/a>\n <\/li>\n <li class=\"Grid-U Mend-8 FinTab-Item rmp-0\" role=\"presentation\">\n <a href=\"#mod_85ac7b2b_640f_323f_a1c1_00b2f4865d18-tab3\" role = \"tab\" class = \"FinTab-Label no-pjax\" data-tabpanel-target = 
\"#mod_85ac7b2b_640f_323f_a1c1_00b2f4865d18-tab3\" >% Losers<\/a>\n <\/li>\n <\/ul>\n\t<div class=\"yfi-panelcontainer yui3-tabview-panel\">\n <div role=\"tabpanel\" id=\"mod_85ac7b2b_640f_323f_a1c1_00b2f4865d18-tab1\" class=\" Selected\" data-start=\"0\" data-count=\"10\" data-content=\"mostactive\" >\n \t<div class=\"original\">\n \n <table summary=\"1\" class=\"yom-data col-8 phatable\" >\n <caption><\/caption>\n <colgroup><col><col><col><col><col><col><col><col><\/colgroup>\n <thead>\n <tr>\n <th id=\"table-31-0-0\" class=\"symbol txt-color\" scope=\"col\"><span>Symbol<\/span><\/th>\n <th id=\"table-31-0-1\" class=\"name txt-color\" scope=\"col\"><span>Company Name<\/span><\/th>\n
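I want to extract the first 3 stock symbols from the complete HTML string above, i.e. output = "BAC", "GE", "MSFT".
Note: stock symbols can be 1, 2, 3, or 4 characters long.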
Any ideas would be greatly appreciated. Thanks!

This should work:
preg_match_all("/q\?s=([A-Za-z\.]{1,5})/", $html, $matches);
// $matches[1] holds the captured symbols; print the first three
for ($i = 0; $i < 3; $i++) {
    if (isset($matches[1][$i])) {
        echo $matches[1][$i], "\n";
    }
}

This should work. Try:
if (preg_match_all('~(?<=q\?s=)[-A-Z.]{1,5}~', $source, $out))
{
    // The matches are in [0] (whole pattern)
    echo "<pre>"; print_r($out[0]); echo "</pre>";
    // If you need first 3
    #$out[0] = array_slice($out[0],0,3);
    #echo "<pre>"; print_r($out[0]); echo "</pre>";
    // If you need them unique:
    $out[0] = array_unique($out[0]);
    echo "<pre>"; print_r($out[0]); echo "</pre>";
} else {
    echo "FAIL";
}
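
The two optional steps in the snippet above (first 3, unique) can also be combined. The following is a minimal sketch rather than the answerer's exact code: the helper name firstUniqueSymbols and the sample string are made up for illustration. It removes duplicates before slicing so that a repeated link does not use up one of the three slots.

```php
<?php
// Hypothetical helper: return the first $limit distinct symbols that follow "q?s=".
function firstUniqueSymbols(string $html, int $limit = 3): array
{
    // The lookbehind keeps "q?s=" out of the match; symbols may contain "." or "-".
    if (!preg_match_all('~(?<=q\?s=)[-A-Z.]{1,5}~', $html, $out)) {
        return [];
    }
    // array_unique keeps the first occurrence of each value;
    // array_slice then takes elements by position and re-indexes them from 0.
    return array_slice(array_unique($out[0]), 0, $limit);
}

// Made-up sample input, not the real page.
$sample = '<a href="/q?s=BAC">BAC</a> <a href="/q?s=BAC">Bank</a> '
        . '<a href="/q?s=GE">GE</a> <a href="/q?s=MSFT">MSFT</a> <a href="/q?s=F">F</a>';

print_r(firstUniqueSymbols($sample));
```

Slicing first and de-duplicating afterwards would return fewer than three symbols whenever one of the first three links is a repeat.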
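
Comments:

The if(preg_match_all("~" part needs to be changed to preg_match_all.

Answer updated.

Thanks, but it doesn't work :( The match value I get is 1, NULL. Also, in case it helps diagnose the problem, I updated the post above with the actual PHP variable contents.

@Robber you need to escape the ?, otherwise it acts as a quantifier and makes the q optional. So it should be /q\?s=(\w{1,4})/ or, better, /q\?s=([-A-Z.]{1,5})/, since a symbol could be AKO.A or AIV-Z, but that is not the problem here.

Tried it, and for some reason it still returns 1, NULL. FYI, the $html string contents are now in the main question.

Great, it works perfectly! Thanks.

@ChicagoDude glad it worked for you!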
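
To illustrate the escaping point raised in the comments: in PCRE an unescaped ? is a quantifier that makes the preceding token optional, so /q?s=/ accepts any "s=" rather than only the literal text "q?s=". The sketch below uses a made-up sample string (not the real page) that contains one symbol link and one unrelated attribute ending in "s=".

```php
<?php
// Made-up sample: a class attribute ending in "s=" plus one real symbol link.
$sample = '<h2 class=Fz-l><a href="/q?s=MSFT">MSFT</a></h2>';

// Unescaped "?" is a quantifier: "q" becomes optional, so any "s=" can match.
preg_match_all('/q?s=([A-Z]+)/', $sample, $loose);

// Escaped "\?" matches a literal question mark, so only "q?s=" links match.
preg_match_all('/q\?s=([A-Z]+)/', $sample, $strict);

print_r($loose[1]);   // also picks up the "F" from class=Fz-l
print_r($strict[1]);  // only the symbol from the link
```

On this input the unescaped pattern still finds the symbol, but it also produces a false positive from the class attribute, which is why the escaped form is the safer choice.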