PHP: splitting a string on tags when the text between them is UTF-8

In PHP, I want to split a string that contains tags with UTF-8 text between them. For example, given this text:

$content = "<heading>فهرست اول</heading>hi my name is mahdi  whats app <heading>فهرست دوم</heading>how are you";

Because the text between the tags is UTF-8, I would like to end up with a simple array such as:

$arr[0] = "<heading>فهرست اول</heading>hi my name is mahdi  whats app";
$arr[1] = "<heading>فهرست دوم</heading>how are you";

The strings between the tags vary. How can I create this array? The problem is how to split the text on <heading>any text</heading>.

You can use preg_match or, in your case, preg_match_all:

$content = "<heading>فهرست اول</heading>hi my name is mahdi  whats app <heading>فهرست دوم</heading>how are you";

preg_match_all("'<heading>.*?<\/heading>'si", $content, $matches);
print_r($matches[0]);

This gives:

Array
(
    [0] => <heading>فهرست اول</heading>
    [1] => <heading>فهرست دوم</heading>
)
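
The pattern above returns only the heading elements themselves, while the question asks for each heading together with the text that follows it. A minimal sketch of one way to get that with preg_match_all, assuming every section starts with a <heading> element; the variable name $sections and the ~ delimiter are choices made for this illustration, not part of the original answer:

```php
<?php
$content = "<heading>فهرست اول</heading>hi my name is mahdi  whats app <heading>فهرست دوم</heading>how are you";

// Match one <heading>...</heading> element, then lazily take everything up to
// the next opening <heading> tag or the end of the string (\z).
// "s" lets "." match newlines; "u" treats pattern and subject as UTF-8.
preg_match_all('~<heading>.*?</heading>.*?(?=<heading>|\z)~su', $content, $matches);

// $sections is a name chosen for this sketch.
$sections = $matches[0];
print_r($sections);
```

Any text that appears before the first <heading> is not captured by this pattern, so it only fits input shaped like the example in the question.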

You can use preg_split to split the text with a regular expression, then array_filter to remove empty strings:

$arr = array_filter(preg_split('/(?=<heading>.*?<\/heading>)/', $content), 'strlen');

It does not remove the tag, because the tag sits inside a lookahead group, a construct that does not consume what it matches.

For example:

<heading>فهرست اول</heading>hi my name is mahdi  whats app <heading>فهرست دوم</heading>how are you

This should return:

array(
  [0] => "<heading>فهرست اول</heading>hi my name is mahdi  whats app ",
  [1] => "<heading>فهرست دوم</heading>how are you"
)

You can check this regex online.

Or, if you prefer, you can see how PHP parses it (the link does not seem to work, so you will have to paste the values in manually):

https://en.functions-online.com/preg_split.html
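
As a variation on the same lookahead idea, preg_split accepts the PREG_SPLIT_NO_EMPTY flag, which discards empty pieces without a separate array_filter call. This also sidesteps the fact that array_filter preserves the original keys, so its result is not necessarily indexed from 0. A minimal sketch, assuming it is enough to look ahead for the opening tag only; $parts is a name chosen for this illustration:

```php
<?php
$content = "<heading>فهرست اول</heading>hi my name is mahdi  whats app <heading>فهرست دوم</heading>how are you";

// Split at every position that is immediately followed by "<heading>".
// The lookahead is zero-width, so the tag stays at the start of the next piece.
// PREG_SPLIT_NO_EMPTY drops any empty piece (such as one at offset 0) and the
// result is a plain list indexed from 0. "u" treats the input as UTF-8.
$parts = preg_split('~(?=<heading>)~u', $content, -1, PREG_SPLIT_NO_EMPTY);

print_r($parts);
```

Because nothing is consumed by the split, joining the pieces back together reproduces the original string, including the trailing space of the first section.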

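You can try the following function; it should do what you need. Basically, you split the string using <heading> as the delimiter. Each item in the resulting array is what you need, but the heading tag is stripped because it is what you split on, so you need to add it back. The comments explain what the code does.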

function get_what_mahdi_wants($in_string){

  $mahdis_strings_array = array();

  // Split string at occurrences of '<heading>'
  $mahdis_strings = explode('<heading>', $in_string);
  foreach($mahdis_strings as $mahdis_string){

    // if '<heading>' is found at start of string, empty array element will be created. Skip it.
    if($mahdis_string == ''){ continue; }

    // Add back string element with '<heading>' tag prepended since exploding on it stripped it.
    $mahdis_strings_array[] = '<heading>'.$mahdis_string;
  }
  return $mahdis_strings_array;
}
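
The same explode-and-prepend approach can be written more compactly with array_filter and array_map. This is only a sketch: the helper name split_on_heading and its $tag parameter are hypothetical, and the arrow functions assume PHP 7.4 or later.

```php
<?php
// Hypothetical helper; the name and the $tag parameter are assumptions
// made for this illustration and are not part of the original answer.
function split_on_heading(string $text, string $tag = '<heading>'): array
{
    // explode() removes the delimiter, so drop the empty pieces first...
    $pieces = array_filter(explode($tag, $text), fn ($piece) => $piece !== '');

    // ...then put the tag back in front of each piece and reindex from 0.
    return array_values(array_map(fn ($piece) => $tag . $piece, $pieces));
}

$content = "<heading>فهرست اول</heading>hi my name is mahdi  whats app <heading>فهرست دوم</heading>how are you";
$result = split_on_heading($content);
print_r($result);
```

Like the function above, this prepends the tag to every non-empty piece, so text that appears before the first <heading> would also get a tag it never had.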

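If UTF is causing problems, you can do the same thing with strpos and substr.

This loops until no more headings are found, then adds the last substr after the loop.

$content = "<heading>فهرست اول</heading>hi my name is mahdi  whats app <heading>فهرست دوم</heading>how are you";

$arr = array();
$oldpos = 0;
$pos = strpos($content, "<heading>", 1); // offset 1 to exclude the first heading

while ($pos !== false) {
    $arr[] = substr($content, $oldpos, $pos - $oldpos);
    $oldpos = $pos;
    $pos = strpos($content, "<heading>", $oldpos + 1); // offset previous position + 1 so it does not catch the same heading again
}
$arr[] = substr($content, $oldpos); // add the last one, since it has no heading tag after it
var_dump($arr);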
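
The strpos/substr loop can be wrapped in a reusable function. strpos and substr both work in byte offsets, and since the tag being searched for is plain ASCII, it can never match in the middle of a multibyte UTF-8 character, so the cuts always fall on character boundaries. A minimal sketch; the function name split_by_strpos and its parameters are hypothetical:

```php
<?php
// Hypothetical helper; name and signature are assumptions for illustration.
function split_by_strpos(string $text, string $tag = '<heading>'): array
{
    if ($text === '') {
        return []; // nothing to split, and avoids an out-of-range offset below
    }

    $parts = [];
    $start = 0;

    // Search from one byte past the current section start so that the
    // section's own opening tag is not found again.
    while (($next = strpos($text, $tag, $start + 1)) !== false) {
        $parts[] = substr($text, $start, $next - $start);
        $start = $next;
    }

    $parts[] = substr($text, $start); // last section has no tag after it
    return $parts;
}

$content = "<heading>فهرست اول</heading>hi my name is mahdi  whats app <heading>فهرست دوم</heading>how are you";
$arr = split_by_strpos($content);
var_dump($arr);
```

Unlike the explode-based approach, text that precedes the first tag is returned unchanged as the first element.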