Warning: file_get_contents(/data/phpspider/zhask/data//catemap/9/java/329.json): failed to open stream: No such file or directory in /data/phpspider/zhask/libs/function.php on line 167

Warning: Invalid argument supplied for foreach() in /data/phpspider/zhask/libs/tag.function.php on line 1116

Notice: Undefined index: in /data/phpspider/zhask/libs/function.php on line 180

Warning: array_chunk() expects parameter 1 to be array, null given in /data/phpspider/zhask/libs/function.php on line 181
Java 查找“所有”的高效算法;“字符相等”;串?_Java_Php_Algorithm_Language Agnostic_Homoglyph - Fatal编程技术网

Java 查找“所有”的高效算法;“字符相等”;串?

Java 查找“所有”的高效算法;“字符相等”;串?,java,php,algorithm,language-agnostic,homoglyph,Java,Php,Algorithm,Language Agnostic,Homoglyph,我们如何编写一个输出输入字符串“”的有效函数 示例1(伪代码): 示例2: homoglyphs_list = [ ["rn", "m", "nn"], ] input_string = "rnn" output = ["rnn", "rm", "mn", "rrn", "nnn", "nm", "nrn"] homoglyphs_list = [

我们如何编写一个输出输入字符串“”的有效函数

示例1(伪代码):

示例2

homoglyphs_list = [
                     ["rn", "m", "nn"],
                  ]

input_string    = "rnn"
output          = ["rnn", "rm", "mn", "rrn", "nnn", "nm", "nrn"]
homoglyphs_list = [
                     ["d", "ci", "a"], // "o" and "0" are homoglyphs
                     ["i", "l", "1"] // "i" and "l" and "1" are homoglyphs
                  ]
                  /*
                     notice that with the two rules above,
                     we can infer "d" = "ci" = "a" = "cl" = "c1"
                  */

input_string    = "pacerier"
output          = [
                   "pacerier",  "pacerler",  "pacer1er",  "pcicerier",
                   "pcicerler", "pcicer1er", "pclcerier", "pc1cerier",
                   "pclcerler", "pc1cerler", "pclcer1er", "pc1cer1er",
                   "pdcerier",  "pdcerler",  "pdcer1er"
                  ]
示例3

homoglyphs_list = [
                     ["rn", "m", "nn"],
                  ]

input_string    = "rnn"
output          = ["rnn", "rm", "mn", "rrn", "nnn", "nm", "nrn"]
homoglyphs_list = [
                     ["d", "ci", "a"], // "o" and "0" are homoglyphs
                     ["i", "l", "1"] // "i" and "l" and "1" are homoglyphs
                  ]
                  /*
                     notice that with the two rules above,
                     we can infer "d" = "ci" = "a" = "cl" = "c1"
                  */

input_string    = "pacerier"
output          = [
                   "pacerier",  "pacerler",  "pacer1er",  "pcicerier",
                   "pcicerler", "pcicer1er", "pclcerier", "pc1cerier",
                   "pclcerler", "pc1cerler", "pclcer1er", "pc1cer1er",
                   "pdcerier",  "pdcerler",  "pdcer1er"
                  ]
注意:输出数组中成员的顺序并不重要,我们可以假设给定的同形符映射是正确的(输入不会给我们一个“无限循环”)

我目前的算法是可行的,但它使用的是原始的暴力,性能非常糟糕。例如,输入带有同形符的
[“rn”,“m”,“nn”]
需要38秒才能运行:

// This is php code (non-pseudo just so we could test the running time), 
// but the question remains language-agnostic

public function Func($in, Array $mappings){
    $out_table = array();
    $out_table[$in] = null;
    while(true){
        $number_of_entries_so_far = count($out_table);
        foreach(array_keys($out_table) as $key){
            foreach($mappings as $mapping){
                foreach($mapping as $value){
                    for($a=0, $aa=strlen($key); $a<$aa; ++$a){
                        $pos = strpos($key, $value, $a);
                        if($pos === false){
                            continue;
                        }
                        foreach($mapping as $equal_value){
                            if($value === $equal_value){
                                continue;
                            }
                            $out_table[substr($key, 0, $pos) . $equal_value . substr($key, $pos + strlen($value))] = null;
                        }
                    }

                }
            }
        }
        if($number_of_entries_so_far === count($out_table)){
            // it means we have tried bruteforcing but added no additional entries,
            // so we can quit now
            break;
        }
    }
    return array_keys($out_table);
}
//这是php代码(非伪代码,只是为了测试运行时间),
//但这个问题仍然与语言无关
公共函数Func($in,数组$mappings){
$out_table=array();
$out_table[$in]=null;
while(true){
$number\u of\u entries\u so\u far=计数($out\u table);
foreach(数组\u键($out\u表)作为$key){
foreach($mappings作为$mapping){
foreach($映射为$value){

对于($a=0,$aa=strlen($key);$a您应该从递归实现中获得一些性能,但我没有对实际性能进行太多考虑。这样可以避免多次循环输出字符串并在每次迭代中计算输出。此外,我还添加了一点“缓存”为了防止计算同一个同形符两次,为简单起见,它不会检查映射,但您可以将其实现到,或者在映射更改的每次调用之前清除缓存(但这很难看,而且很容易引入错误)

代码如下所示:

cache = {}
def homoglyph(input_string, mappings):
    output = []
    character = input_string[0]
    if input_string in cache:
        return cache[input_string]
    elif not input_string:
        return []
    for charset in mappings:
        if character in charset:
            temp = input_string
            sub_homoglyphs = homoglyph(input_string[1:], mappings)
            for char in charset:
                temp[0] = char
                output.append(temp)
                #adding the homoglyph character to the rest of the string
                for subhomoglyph in sub_homoglyphs:
                    output.append(char+subhomoglyph)
    cache[input_string] = output
    return output

(这是用python编写的,因为我对php语法不太熟悉,我实际上还没有运行过,所以可能会有错误,但逻辑要点就在那里)

如果我像
$homogyph\u mappings[0]=array(“n”,“nn”,“nnn”);
?@Rajan那样编写,我们可以假设输入是正确的(即,如果输入不正确,我们会得到未定义的行为)你的例子是一个输入错误的例子,因为它会给我们一个无限循环。如果
n=nn
,那么
nn=nnnn
,然后
nnnnnn=nnnnnnnnnn
,然后
nnnnnnnnnnnn=nnnnnnnnnnnn
,等等,无限…好吧,这是一个例外情况。@Rajan,是的,很容易发现侵权者,一个候选“x”不能声明等同于包含“x”的另一个候选项。