PHP: find and replace countries in a string


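I have an array of countries where the key is the country code and the value is the country name. Given a string posted by a user, I want to detect whether it contains a country and, if so, wrap that country like this: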

<span class="country">$1</span>
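For example, given the input "canada is a cold place", I want the result to be:

<span class="country">canada</span> is a cold place

Here I am using my country array to drive the find and replace.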

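The reason behind this is that I want to use microformats, so I need to pick specific pieces of text out of the string.

I already have similar preg_replace code: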

$style  = array(
                    '/\[b\](.*)?\[\/b\]/isU'            => '<b>$1</b>',
                    '/\[i\](.*)?\[\/i\]/isU'            => '<i>$1</i>',
                    '/\[u\](.*)?\[\/u\]/isU'            => '<u>$1</u>',
                    '/\[em\](.*)?\[\/em\]/isU'      => '<em>$1</em>',
                    '/\[li\](.*)?\[\/li\]/isU'      => '<li>$1</li>',
                    '/\[code\](.*)?\[\/code\]/isU'  => '<div class="tx_code">$1</div>',
                    '/\[q\](.*)?\[\/q\]/isU'    => '<q>$1</q>',
                    '/[\r\n]{3}+/'              => "\n"
                    );  

$text = preg_replace(array_keys($style),array_values($style),$text);
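This works, and I need something similar for countries.

Keep in mind that matching should be case-insensitive: some users may post "canada" and others "Canada".

Thanks

Try this: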

      function findword($text,array $List){
             foreach($List as $Val)
                $pattern['%([^\da-zA-Z]+)'.$Val.'([^\da-zA-Z]+)%si'] = '<span class="country">'.$Val.'</span>';
             $text = preg_replace(array_keys($pattern), array_values($pattern), ' '.$text.' ');
             return $text;
      }
      echo findword('Canada is a cold place',array('Canada'));
    
    
Output:

    <span class="country">Canada</span>is a cold place
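In the output above the space after the country is lost, because the pattern consumes the neighbouring non-alphanumeric characters and the replacement does not put them back. One possible way around this, sketched below, is to test the neighbours with lookarounds and to reuse the matched text through `$0` so the user's capitalisation is kept. The function name `findwordKeepSpacing` is hypothetical, and `preg_quote` is added on the assumption that country names may contain regex metacharacters.

```php
<?php
// Hypothetical variant of findword(): lookarounds inspect the neighbouring
// characters without consuming them, so surrounding spaces survive.
function findwordKeepSpacing(string $text, array $list): string
{
    foreach ($list as $val) {
        // Escape regex metacharacters and the '%' delimiter in the name.
        $name = preg_quote($val, '%');
        $pattern = '%(?<![\da-zA-Z])' . $name . '(?![\da-zA-Z])%i';
        // $0 is the text exactly as the user typed it.
        $text = preg_replace($pattern, '<span class="country">$0</span>', $text);
    }
    return $text;
}

echo findwordKeepSpacing('canada is a cold place', ['Canada']);
// prints: <span class="country">canada</span> is a cold place
```

With this variant a run-together word such as "Incanadaisacoldplace" is left untouched, since the letters on either side of "canada" fail the lookarounds.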
    
Edit: if you want to replace every occurrence in the text, even inside a longer word, you can use this:

      function findword($text,array $List){
             foreach($List as $Val)
                $pattern['~'.$Val.'~si'] = '<span class="country">'.$Val.'</span>';
             $text = preg_replace(array_keys($pattern), array_values($pattern), ' '.$text.' ');
             return $text;
      }
      echo findword('Canadaisacold place',array('Canada'));
    
    
Output:

    <span class="country">Canada</span>isacold place
    
Edit 2: I wrote this version with DOMDocument, and it works well on HTML:

     class XmlRead{    
      static function Clean($html){
       $html=preg_replace_callback("~<script(.*?)>(.*?)</script>~si",function($m){
          //print_r($m);
         // $m[2]=preg_replace("/\/\*(.*?)\*\/|[\t\r\n]/s"," ", " ".$m[2]." ");
          $m[2]=preg_replace("~//(.*?)\n~si"," ", " ".$m[2]." ");
          //echo $m[2];
          return "<script ".$m[1].">".$m[2]."</script>";
          }, $html);
      $search = array(
          "/\/\*(.*?)\*\/|[\t\r\n]/s" => "",
          "/ +\{ +|\{ +| +\{/" => "{",
          "/ +\} +|\} +| +\}/" => "}",
          "/ +: +|: +| +:/" => ":",
          "/ +; +|; +| +;/" => ";",
          "/ +, +|, +| +,/" => ","
          );
          $html = preg_replace(array_keys($search), array_values($search), $html);
        preg_match_all('!(<(?:code|pre|script).*>[^<]+</(?:code|pre|script)>)!',$html,$pre);
        $html = preg_replace('!<(?:code|pre).*>[^<]+</(?:code|pre)>!', '#pre#', $html);
        $html = preg_replace('#<!--[^\[].+-->#', '', $html);
        $html = preg_replace('/[\r\n\t]+/', ' ', $html);
        $html = preg_replace('/>[\s]+</', '><', $html);
        $html = preg_replace('/\s+/', ' ', $html);
        if (!empty($pre[0])) {
            foreach ($pre[0] as $tag) {
                $html = preg_replace('!#pre#!', $tag, $html,1);
            }
        }
        return($html);
    }
    function loadNprepare($content,$encod='') {
       $content=self::Clean($content);
       //$content=html_entity_decode(html_entity_decode($content));
      // $content=htmlspecialchars_decode($content,ENT_HTML5);
       $DataPage='';
       if(preg_match('~<body(.*?)>(.*?)</body>~si',$content,$M)){
          $DataPage=$M[2];
       }else{
          $DataPage =$content;
       }
       $HTML=$DataPage;
       $HTML="<!doctype html><html><head><meta charset=\"utf-8\"><title>Untitled Document</title></head><body>".$HTML."</body></html>";
       $dom= new DOMDocument; 
       $HTML = str_replace("&", "&amp;", $HTML);  // disguise &s going IN to loadXML() 
      // $dom->substituteEntities = true;  // collapse &s going OUT to transformToXML() 
       $dom->recover = TRUE;
       @$dom->loadHTML('<?xml encoding="UTF-8">' .$HTML); 
       // dirty fix
       foreach ($dom->childNodes as $item)
        if ($item->nodeType == XML_PI_NODE)
          $dom->removeChild($item); // remove hack
        $dom->encoding = 'UTF-8'; // insert proper
        return $dom;
    }
    function GetBYClass($Doc,$ClassName){
        $finder = new DomXPath($Doc);
        return($finder->query("//*[contains(@class, '$ClassName')]"));
    }
    function findword($text,array $List){
         foreach($List as $Val)
            $pattern['%(\#)?([^\da-zA-Z]+)'.$Val.'([^\da-zA-Z]+)%si'] = '<span class="country">'.$Val.'</span>';
         $text = preg_replace(array_keys($pattern), array_values($pattern), ' '.$text.' ');
         return $text;
    }
    function FindAndReplace($node,array $List) {
         if($node==NULL)return false;    
         if (XML_TEXT_NODE === $node->nodeType || XML_CDATA_SECTION_NODE === $node->nodeType) {
             $node->nodeValue=$this->findword($node->nodeValue,$List);
             return;
         }else{
             if(is_object($node->childNodes) or is_array($node->childNodes)) {
               foreach($node->childNodes as $childNode) {
                  $this->FindAndReplace($childNode,$List);
               }
             }
         }
    
    }
    function DOMinnerHTML($element) 
    { 
       $innerHTML = ""; 
       $children = $element->childNodes; 
       foreach ($children as $child) 
       { 
          $tmp_dom = new DOMDocument(); 
          $tmp_dom->appendChild($tmp_dom->importNode($child, true)); 
          $innerHTML.=trim($tmp_dom->saveHTML()); 
       } 
       $innerHTML=html_entity_decode(html_entity_decode($innerHTML));
       return $innerHTML; 
    } 
    function DOMRemove(DOMNode $from) {
    
        $from->parentNode->removeChild($from);    
     }
    
    }
    $XmlRead=new XmlRead();
    $Doc=$XmlRead->loadNprepare('<a href="?Canada">Canada</a> is a cold place');
    $XmlRead->FindAndReplace($Doc,array('Canada'));
    $Body=$Doc->getElementsByTagName('body')->item(0);
    echo $XmlRead->DOMinnerHTML($Body);
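The core idea of the class above is to touch text nodes only, so attribute values such as `href="?Canada"` are never rewritten. A shorter sketch of that idea follows. It is not the answerer's code: the function name `wrapCountriesInHtml`, the wrapper `div` with id `wrap`, and the use of `preg_split` to build real `span` elements are all assumptions made for illustration.

```php
<?php
// Hypothetical helper: wrap country names found in text nodes only,
// leaving tag names and attribute values untouched.
function wrapCountriesInHtml(string $html, array $countries): string
{
    if ($countries === []) {
        return $html;
    }
    $quoted = array_map(function ($name) {
        return preg_quote($name, '~');
    }, $countries);
    $pattern = '~\b(' . implode('|', $quoted) . ')\b~iu';

    $dom = new DOMDocument();
    // The XML declaration hints UTF-8 to libxml; the wrapper div gives one root to serialize.
    @$dom->loadHTML('<?xml encoding="UTF-8"><div id="wrap">' . $html . '</div>');
    $xpath = new DOMXPath($dom);
    $root = $xpath->query('//div[@id="wrap"]')->item(0);

    // Copy the node list first, because the tree is modified while looping.
    $textNodes = iterator_to_array($xpath->query('.//text()', $root));
    foreach ($textNodes as $node) {
        // With DELIM_CAPTURE, odd indexes hold the matched country names.
        $parts = preg_split($pattern, $node->nodeValue, -1, PREG_SPLIT_DELIM_CAPTURE);
        if ($parts === false || count($parts) < 2) {
            continue; // no country in this text node
        }
        $fragment = $dom->createDocumentFragment();
        foreach ($parts as $i => $part) {
            if ($i % 2 === 1) {
                $span = $dom->createElement('span');
                $span->setAttribute('class', 'country');
                $span->appendChild($dom->createTextNode($part));
                $fragment->appendChild($span);
            } elseif ($part !== '') {
                $fragment->appendChild($dom->createTextNode($part));
            }
        }
        $node->parentNode->replaceChild($fragment, $node);
    }

    // Serialize the children of the wrapper, not the wrapper itself.
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

echo wrapCountriesInHtml('<a href="?Canada">Canada</a> is a cold place', ['Canada']);
```

Because the spans are created as DOM elements rather than written into `nodeValue` as a string, no entity decoding is needed when the result is serialized.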
    
I wrote my own version, and it is the best I have so far:

        if($microformat){
            foreach ($this->countries as $co){
            $text = preg_replace('/(\#)?\b'.$co.'\b/isU','<span class="country">$0</span>',$text);
            }
        }
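The loop above runs one `preg_replace` per country. A possible refinement, sketched here under assumptions, is to build a single alternation so the text is scanned once, to escape each name with `preg_quote`, and to sort longer names first so that "Guinea-Bissau" is preferred over "Guinea". The function name `markCountries` and the sample array are hypothetical; the code-to-name array shape follows the question.

```php
<?php
// Hypothetical helper: mark every country name in one pass.
// $countries is keyed by country code, with the country name as value.
function markCountries(string $text, array $countries): string
{
    $names = array_values($countries);
    if ($names === []) {
        return $text; // an empty alternation would match at every word boundary
    }
    // Longest names first, so a longer name wins over a name it starts with.
    usort($names, function ($a, $b) {
        return strlen($b) - strlen($a);
    });
    $quoted = array_map(function ($name) {
        return preg_quote($name, '/');
    }, $names);
    $pattern = '/\b(?:' . implode('|', $quoted) . ')\b/iu';
    // $0 keeps the capitalisation the user typed; fall back to the input on a regex error.
    return preg_replace($pattern, '<span class="country">$0</span>', $text) ?? $text;
}

$countries = ['CA' => 'Canada', 'GN' => 'Guinea', 'GW' => 'Guinea-Bissau'];
echo markCountries('from guinea-bissau to Canada', $countries);
```

A single pass also means that text inserted by an earlier replacement is never scanned again by a later country's pattern.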
    
    

Thanks, everyone.

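Comments:

So you have already written similar code. How have you tried to adapt it so far? It is not clear to me what the problem is.

Thanks for the hints. After struggling with one of the answers I wrote my own, and it works well with all the countries of the world and strings of up to 500 characters.

By the way, if I write "Incadaisacoldplace", with no spaces, it does not come out right.

Fixed: $pattern['%('.$Val.')%si'] = '<span class="country">'.$Val.'</span>';

"incadaisacoldplace" is a word in which nothing can be replaced with "Canada", and that is fine; it will not break the wording.

OK, another problem came up: sometimes the text sits inside a tag, and the regex replaces the text inside the href attribute and makes it #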