PHP - Export names and email addresses from a file

php regex

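I have a file containing a list of people with their phone numbers and email addresses.

For example:

Coulthard
Sally Coulthard
Location: Surrey
Expertise Covered: Horse, Dog, Horse and Rider
Website: www.veterinaryphysio.co.uk
Tel: 07865095005
Email: sally@veterinaryphysio.co.uk

Kate Haynes
Location: Kent, Sussex, Surrey
Expertise Covered: Horse, Performance, Horse and Rider
Tel: 07957344688
Email: katehaynesphysio@yahoo.co.uk

The list contains hundreds of entries like the ones above. How can I create a regular expression that reads the file from top to bottom, extracts the first name and surname line together with the email address, and puts them together like this:

Name, Email address

Any help would be great.

I have the code below, but it only reads the email addresses: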

$string = file_get_contents("physio.txt"); // Load text file contents

// don't need to preassign $matches, it's created dynamically

// this regex handles more email address formats like a+b@google.com.sg, and the i makes it case insensitive
$pattern = '/[a-z0-9_\-\+]+@[a-z0-9\-]+\.([a-z]{2,3})(?:\.[a-z]{2})?/i';

// preg_match_all returns the number of full matches and fills $matches with the results
preg_match_all($pattern, $string, $matches);

// the data you want is in $matches[0], dump it with var_export() to see it
echo "<pre>";
$input = $matches[0];
echo count($input);
echo "<br>";
$result = array_unique($input);
echo count($result);
echo "<br>";
//print_r($result);
echo "</pre>";


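You can split the content on double line breaks and then process each block. To get the first and last name, take the last line in the block that does not contain a colon. For the sample data this yields: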

Sally Coulthard, sally@veterinaryphysio.co.uk
Kate Haynes, katehaynesphysio@yahoo.co.uk
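The block-splitting idea above can be sketched as follows. This is only an illustration under stated assumptions: the helper name extractByBlocks is hypothetical, entries are separated by at least one blank line, and the last line of every block is the "Email:" line.

```php
<?php
// Hypothetical helper (not from the original answer): returns "Name, email" strings.
function extractByBlocks(string $text): array
{
    $results = [];

    // Normalise line endings, then split into blocks on one or more blank lines.
    $text = str_replace(["\r\n", "\r"], "\n", trim($text));
    foreach (preg_split('/\n{2,}/', $text) as $block) {
        $lines = explode("\n", $block);

        // Assumption: the last line of each block is "Email: address".
        $last = end($lines);
        $mail = trim(substr($last, strlen('Email:')));

        // The name is the last line in the block that contains no colon.
        $name = '';
        foreach (array_reverse($lines) as $line) {
            if (strpos($line, ':') === false) {
                $name = trim($line);
                break;
            }
        }

        $results[] = $name . ', ' . $mail;
    }

    return $results;
}

$sample = "Coulthard\nSally Coulthard\nLocation: Surrey\nTel: 07865095005\n"
        . "Email: sally@veterinaryphysio.co.uk\n\n"
        . "Kate Haynes\nLocation: Surrey, Sussex, Kent\nTel: 07957 344688\n"
        . "Email: katehaynesphysio@yahoo.co.uk";

echo implode("\n", extractByBlocks($sample)), "\n";
```

For a real file, the string passed in would come from file_get_contents("physio.txt") as in the question. If a block does not end with the email line, this sketch returns a wrong address, which is why scanning each block for the "Email:" prefix is the more robust variant.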

Alternatively, if the email is not always the last line of the block:

$blocks = explode("\n\n", $string);
foreach ($blocks as $block) {
    $lines = explode("\n", $block);
    // Walk the block from the bottom up
    $lines = array_reverse($lines);
    $fnln = '';
    $mail = ''; // reset so a value from the previous block is never reused
    foreach ($lines as $line) {
        if (substr($line, 0, 6) == 'Email:') {
            $mail = substr($line, 7);
        }
        // The first line without a colon (from the bottom) is the name line
        if (strpos($line, ':') === false) {
            $fnln = $line;
            break;
        }
    }
    echo $fnln . ", " . $mail . "<br>";
}

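Regex seems like a reasonable way to parse this data. The important thing is to include enough components in the pattern to keep the matches accurate.

I suggest the following:

Pattern:

~^(.+)\RLocation:[\s\S]*?^Email: (\S*)~m

The neighbouring substrings "Location:" and "Email:" are used to ensure that the targeted substrings are the correct ones.

The m pattern modifier improves the accuracy of the pattern by letting the ^ character match at the start of every line (not just at the start of the string).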
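To see what the m modifier changes, here is a small sketch that runs the same "^Email:" sub-pattern with and without it. The two-line $text value is made up for the demonstration.

```php
<?php
$text = "Sally Coulthard\nEmail: sally@veterinaryphysio.co.uk";

// Without "m", ^ only matches at the very start of the string,
// so "^Email:" cannot match on the second line.
$withoutM = preg_match('~^Email: (\S*)~', $text, $a);

// With "m", ^ also matches immediately after each newline.
$withM = preg_match('~^Email: (\S*)~m', $text, $b);

var_dump($withoutM); // int(0)
var_dump($withM);    // int(1)
echo $b[1], "\n";    // sally@veterinaryphysio.co.uk
```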

Breakdown:

~          #pattern delimiter
^          #match start of a line
(.+)       #capture one or more non-newline characters (Capture Group #1)
\R         #match a newline character (\r, \n, \r\n)
Location:  #match literal: "Location" followed by colon
[\s\S]*?   #match (lazily) zero or more of any character
^Email:    #match start of a line, literal: "Email", colon, space
(\S*)      #capture zero or more visible characters (Capture Group #2 -- quantifier means the email value can be blank and still valid)
~          #pattern delimiter
m          #pattern modifier tells regex engine that ^ means start of a line instead of start of the string

Code:

$input = "Coulthard
Sally Coulthard
Location: Surrey
Expertise Covered: Horse, Dog, Horse and Rider
Website: www.veterinaryphysio.co.uk
Tel: 07865095005
Email: sally@veterinaryphysio.co.uk

Kate Haynes
Location: Surrey, Sussex, Kent
Expertise Covered: Horse, Performance, Horse and Rider
Tel: 07957 344688
Email: katehaynesphysio@yahoo.co.uk";

if (preg_match_all("~^(.+)\RLocation:[\s\S]*?^Email: (\S*)~m", $input, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $data) {
        echo "{$data[1]}, {$data[2]}\n";
    }
}

Output:

Sally Coulthard, sally@veterinaryphysio.co.uk
Kate Haynes, katehaynesphysio@yahoo.co.uk

Comments:

What have you tried? Why did it fail? What resources have you read? But seriously, have you considered using a parser? Read the file line by line, check that each line is what you expect, and handle it accordingly. That seems much simpler here than a regular expression.

I have added the PHP code that gets the email addresses. There is always one line between each listing and there is always a first name and surname; what I need is to get that part and the email address from each section.

Will every "entry" have these lines? Is there a separator line between entries?

As always, simple and effective :)

Thank you for the explanation, very enlightening!
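The line-by-line approach suggested in the comments could look roughly like the sketch below. It is not code from the thread: the function name parseContacts is hypothetical, and it assumes that entries are separated by blank lines, that the name is the last colon-free line before the "Email:" line, and that entries without an email can be skipped.

```php
<?php
// Hypothetical helper illustrating the line-by-line idea.
function parseContacts(iterable $lines): array
{
    $contacts = [];
    $name = null; // most recent colon-free line of the current entry

    foreach ($lines as $line) {
        $line = trim($line);

        if ($line === '') {
            $name = null;                 // blank line: a new entry starts
        } elseif (stripos($line, 'Email:') === 0) {
            $email = trim(substr($line, strlen('Email:')));
            if ($name !== null && $email !== '') {
                $contacts[] = [$name, $email];
            }
        } elseif (strpos($line, ':') === false) {
            $name = $line;                // a later name line overwrites an earlier one
        }
    }

    return $contacts;
}

$sample = [
    'Coulthard',
    'Sally Coulthard',
    'Location: Surrey',
    'Email: sally@veterinaryphysio.co.uk',
    '',
    'Kate Haynes',
    'Location: Surrey, Sussex, Kent',
    'Email: katehaynesphysio@yahoo.co.uk',
];

foreach (parseContacts($sample) as [$name, $email]) {
    echo $name, ', ', $email, "\n";
}
```

Because the function accepts any iterable, a large file can be streamed instead of loaded whole, for example by passing new SplFileObject('physio.txt') or the array returned by file('physio.txt', FILE_IGNORE_NEW_LINES).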