Filter out a set of bad words from a PHP array

I have a PHP array of about 20,000 names that I need to filter, removing any name that contains the word job, freelance, or project.

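Below is what I have started so far. It loops through the array and adds the clean items to a new array. I need help matching the "bad" words. Please help if you can.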

$data1 = array('Phillyfreelance' , 'PhillyWebJobs', 'web2project', 'cleanname');

// freelance
// job
// project

$cleanArray = array();
foreach ($data1 as $name) {
    # if a term is matched, we remove it from our array
    if(preg_match('~\b(freelance|job|project)\b~i',$name)){
        echo 'word removed';

    }else{
        $cleanArray[] = $name;
    }

}

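Right now it matches a whole word, so if a name in the array is exactly "freelance" the item is removed, but something like ImaFreelancer is not matched. I need to remove everything that contains one of the bad words anywhere in the name.

This should be what you want: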

if (!preg_match('/(freelance|job|project)/i', $name)) {
    $cleanArray[] = $name;
}
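
To see why the pattern in the question misses these names: \b only matches at a boundary between a word character and a non-word character, and the bad words here sit in the middle of longer runs of letters and digits. The following is a minimal sketch contrasting the two patterns on the sample data from the question; the variable names are illustrative only.

```php
<?php
$names = ['Phillyfreelance', 'PhillyWebJobs', 'web2project', 'cleanname'];

// With \b the bad word must stand alone as a whole word.
$withBoundary = '~\b(freelance|job|project)\b~i';
// Without \b the bad word may appear anywhere inside the name.
$substring = '~(freelance|job|project)~i';

$results = [];
foreach ($names as $name) {
    $results[$name] = [
        'boundary'  => preg_match($withBoundary, $name),
        'substring' => preg_match($substring, $name),
    ];
}

print_r($results);
```

On this data the boundary pattern matches none of the names, while the substring pattern matches every name except cleanname.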

Using the preg_match() function with a regular expression should do the trick. This is what I came up with, and it works fine on my end:

<?php
    $data1=array('JoomlaFreelance','PhillyWebJobs','web2project','cleanname');
    $cleanArray=array();
    $badWords='/(job|freelance|project)/i';
    foreach($data1 as $name) {
        if(!preg_match($badWords,$name)) {
            $cleanArray[]=$name;
        }
    }
    echo implode(',', $cleanArray); // separator first; the reversed order was removed in PHP 8
?>

Personally, I would do it like this:

$badWords = ['job', 'freelance', 'project'];
$names = ['JoomlaFreelance', 'PhillyWebJobs', 'web2project', 'cleanname'];

// Escape characters with special meaning in regular expressions.
$quotedBadWords = array_map(function($word) {
    return preg_quote($word, '/');
}, $badWords);

// Create the regular expression.
$badWordsRegex = implode('|', $quotedBadWords);

// Filter out any names that match the bad words.
$cleanNames = array_filter($names, function($name) use ($badWordsRegex) {
    // preg_match() returns 0 when nothing matches; FALSE only signals an error.
    return preg_match('/' . $badWordsRegex . '/i', $name) === 0;
});
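
When comparing the result of preg_match() strictly, it helps to keep its three possible return values apart: 1 for a match, 0 for no match, and false only when the call itself fails. A small sketch, assuming the same bad words as above; the deliberately invalid pattern and the variable names are just for illustration.

```php
<?php
$pattern = '/job|freelance|project/i';

$matched    = preg_match($pattern, 'PhillyWebJobs'); // a bad word is present
$notMatched = preg_match($pattern, 'cleanname');     // no bad word present
// An unbalanced parenthesis makes the pattern invalid; @ hides the warning.
$failed     = @preg_match('/(/', 'cleanname');

var_dump($matched, $notMatched, $failed);
```

A filter callback that should keep clean names therefore needs to test for 0 (or simply negate the result), not for false.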

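Regular expressions are not really necessary here; a few stripos() calls would likely be faster. (Performance at this level matters, because the search runs against each of the 20,000 names.)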

With array_filter(), which keeps only those elements of the array for which the callback returns true:

$data1 = array_filter($data1, function($el) {
        return stripos($el, 'job') === FALSE
            && stripos($el, 'freelance') === FALSE
            && stripos($el, 'project') === FALSE;
});

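Here is a more extensible and maintainable version, where the list of bad words can be loaded from an array instead of being spelled out in the code: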

$data1 = array_filter($data1, function($el) {
        $bad_words = array('job', 'freelance', 'project');
        $word_okay = true;

        foreach ( $bad_words as $bad_word ) {
            if ( stripos($el, $bad_word) !== FALSE ) {
                $word_okay = false;
                break;
            }
        }

        return $word_okay;
});
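
If the list of bad words comes from somewhere else (a config file, a database), it can be handed to the callback with a `use` clause instead of being defined inside it. This is only a sketch of that variation; `$badWords` and `$clean` are names introduced here for illustration.

```php
<?php
// Assumed to be loaded from elsewhere; hard-coded here to keep the sketch runnable.
$badWords = ['job', 'freelance', 'project'];
$data1 = ['Phillyfreelance', 'PhillyWebJobs', 'web2project', 'cleanname'];

$clean = array_filter($data1, function ($name) use ($badWords) {
    foreach ($badWords as $badWord) {
        // stripos() returns false only when the substring is absent.
        if (stripos($name, $badWord) !== false) {
            return false; // drop this name
        }
    }
    return true; // no bad word found, keep it
});

// array_filter() preserves the original keys; reindex from 0 if that matters.
$clean = array_values($clean);

print_r($clean);
```

Returning early from the loop replaces the `$word_okay` flag and the `break`, with the same result.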

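I would be inclined to use the array_filter() function and change the regular expression so that it does not match on word boundaries: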

$data1 = array('Phillyfreelance' , 'PhillyWebJobs', 'web2project', 'cleanname');

$cleanArray = array_filter($data1, function($w) { 
     return !preg_match('~(freelance|project|job)~i', $w); 
});
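
As a variation that is not from the answer above, PHP's preg_grep() can do the same filtering without a callback: with the PREG_GREP_INVERT flag it returns the elements that do not match the pattern. A minimal sketch using the same sample data:

```php
<?php
$data1 = ['Phillyfreelance', 'PhillyWebJobs', 'web2project', 'cleanname'];

// PREG_GREP_INVERT keeps the entries that do NOT match the pattern.
$cleanArray = preg_grep('~(freelance|project|job)~i', $data1, PREG_GREP_INVERT);

// preg_grep() preserves the original keys; reindex from 0 if that matters.
$cleanArray = array_values($cleanArray);

print_r($cleanArray);
```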

You are matching whole words with \b, whereas you need to match substrings. Is that right?

This is probably the sexiest solution listed here. But I think the ~ should be replaced with slashes, i.e. '/(freelance|project|job)/i'.