
JavaScript: creating a regular expression to parse HTML into MXML syntax


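I searched Stack Overflow a lot and found some very interesting posts on the subject.

But it turns out I still can't achieve my actual goal: replacing each div with the value of its data-type attribute and removing the data-type attribute from the string.

This is what I did: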

// Doesn't work across multiple lines; it only gets the first occurrence and nothing more.
// Regex: /\s?data\-type\=(?:['"])?(\d+)(?:['"])?/

var source_code = $("body").html();

var rdiv = /div/gm; // remove divs
var mxml = source_code.match(/\S?data\-type\=(?:['"])?(\w+)(?:['"])?/);
var rattr =source_code.match(/\S?data\-type\=(?:['"])?(\w+)(?:['"])/gm);
var outra = source_code.replace(rdiv,'s:'+mxml[1]);
var nestr = outra.replace(rattr[0],'');// worked with only first element
console.log(nestr);
console.log(mxml);
console.log(rattr);
against this sample HTML page:

<div id="app" data-type="Application">
    <div data-type="Label"></div>
     <div data-type="Button"></div>
     <div data-type="VBox"></div>
     <div data-type="Group"></div>
</div>

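Any clues about this? I'm probably missing something, but I really have no idea what, and I've run out of things to try.

I've created a JSFiddle to demonstrate it; just open your browser's console to see the results I'm getting.

Feel free to answer with a JSFiddle or, better, to explain my regex and why it fails.

Until I get some feedback, I'll keep trying to see whether I can get the text replaced.

Thanks in advance.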
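
On why the attempt above only handles the first element: `match()` without the `g` flag returns just the first match, so `mxml[1]` is always the first type and every `div` is renamed to it. With the `g` flag, `match()` returns all full matches but no capture groups, and `replace(rattr[0], '')` with a string pattern removes only the first occurrence. One possible workaround, shown here as a rough sketch rather than a robust HTML parser, is a single `replace()` with a callback and a stack that remembers which type each open tag was given. The function name `divsToMxml`, the `prefix` parameter and the shortened sample markup are assumptions made for illustration.

```javascript
// Shortened sample markup, mirroring the page in the question (assumed).
var html =
    '<div id="app" data-type="Application">\n' +
    '    <div data-type="Label"></div>\n' +
    '    <div data-type="Button"></div>\n' +
    '</div>';

// Without g: only the first match, with its capture group.
var first = html.match(/data-type=["'](\w+)["']/);
// With g: every full match, but the capture groups are not returned.
var all = html.match(/data-type=["'](\w+)["']/g);

// Rename every <div data-type="X"> to <prefixX> and its closing tag to
// </prefixX>. A stack keeps the type of each open element so that the
// matching closing tag can reuse it. Divs without data-type are left as-is.
function divsToMxml(source, prefix) {
    var stack = [];
    var tagRx = /<div\b([^>]*)>|<\/div\s*>/gi;       // opening or closing div
    var typeRx = /\s*\bdata-type=(["'])(\w+)\1/i;    // the data-type attribute

    return source.replace(tagRx, function (tag, attrs) {
        var type, m;
        if (tag.charAt(1) === '/') {                 // closing tag
            type = stack.pop();
            return type ? '</' + prefix + type + '>' : tag;
        }
        m = typeRx.exec(attrs);
        type = m ? m[2] : null;
        stack.push(type);
        // Drop the data-type attribute, keep every other attribute.
        return type ? '<' + prefix + type + attrs.replace(typeRx, '') + '>' : tag;
    });
}

console.log(first[1]);   // first type only
console.log(all);        // full matches only
console.log(divsToMxml(html, 's:'));
```

This relies on the markup being well-formed and on `>` never appearing inside an attribute value; anything less regular is better handled by a real parser.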

It would probably be easier to parse the markup into an object tree and then convert that tree to MXML.

Something like this:

var source_code = $("body").html();

var openStartTagRx = /^\s*<div/i;
var closeStartTagRx = /^\s*>/i;
var closeTagRx = /^\s*<\/div>/i;
var attrsRx = new RegExp(
    '^\\s+' +
    '(?:(data-type)|([a-z-]+))' +    // group 1 is "data-type" group 2 is any attribute
    '\\=' +
    '(?:\'|")' +
    '(.*?)' +                        // group 3 is the data-type or attribute value
    '(?:\'|")',
    'mi');


function Thing() {
    this.type = undefined;
    this.attrs = undefined;
    this.children = undefined;
}

Thing.prototype.addAttr = function(key, value) {
    this.attrs = this.attrs || {};
    this.attrs[key] = value;
};

Thing.prototype.addChild = function(child) {
    this.children = this.children || [];
    this.children.push(child);
};


function getErrMsg(expected, str) {
    return 'Malformed source, expected: ' + expected + '\n"' + str.slice(0,20) + '"';
}


function parseElm(str) {

    var result,
        elm,
        childResult;

    if (!openStartTagRx.test(str)) {
        return;
    }
    elm = new Thing();
    str = str.replace(openStartTagRx, '');

    // parse attributes
    result = attrsRx.exec(str);
    while (result) {
        if (result[1]) {
            elm.type = result[3];
        } else {
            elm.addAttr(result[2], result[3]);
        }
        str = str.replace(attrsRx, '');
        result = attrsRx.exec(str);
    }

    // close off that tag
    if (!closeStartTagRx.test(str)) {
        throw new Error(getErrMsg('end of opening tag', str));
    }
    str = str.replace(closeStartTagRx, '');

    // if it has child tags
    childResult = parseElm(str);
    while (childResult) {
        str = childResult.str;
        elm.addChild(childResult.elm);
        childResult = parseElm(str);
    }

    // the tag should have a closing tag
    if (!closeTagRx.test(str)) {
        throw new Error(getErrMsg('closing tag for the element', str));
    }
    str = str.replace(closeTagRx, '');
    return {
        str: str,
        elm: elm
    };
}


console.log(parseElm(source_code).elm); 

It is recursive, so nested groups are parsed as well.
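
The code above stops at the object tree; the conversion to MXML is left open. A minimal sketch of that second step could look like the following. The names `toMxml` and `escapeAttr`, the `prefix` argument (for example `s:`, as used in the question) and the two-space indentation are assumptions, and the sketch expects every node to have a `type`.

```javascript
// Escape characters that are special inside a double-quoted XML attribute.
function escapeAttr(value) {
    return String(value)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/"/g, '&quot;');
}

// Serialize one node of the tree ({ type, attrs, children }) as MXML text.
// Childless nodes become self-closing tags; children are indented one level.
function toMxml(elm, prefix, indent) {
    prefix = prefix || '';
    indent = indent || '';

    var name = prefix + elm.type;
    var attrs = elm.attrs || {};
    var children = elm.children || [];

    var attrText = Object.keys(attrs).map(function (key) {
        return ' ' + key + '="' + escapeAttr(attrs[key]) + '"';
    }).join('');

    if (children.length === 0) {
        return indent + '<' + name + attrText + '/>';
    }

    var inner = children.map(function (child) {
        return toMxml(child, prefix, indent + '  ');
    }).join('\n');

    return indent + '<' + name + attrText + '>\n' +
        inner + '\n' +
        indent + '</' + name + '>';
}

// Hand-written tree with the same shape as the parser's result.
var tree = {
    type: 'Application',
    attrs: { id: 'app' },
    children: [{ type: 'Label' }, { type: 'Button' }]
};

console.log(toMxml(tree, 's:'));
```

With the parser above, the call would be `toMxml(parseElm(source_code).elm, 's:')`. A real MXML file also needs namespace declarations on the root element, which this sketch leaves out.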

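It isn't clear to me what your goal is.

My goal is to change the HTML tag syntax into MXML tag syntax. I just got lost escaping the data-type="" inner value.

var x = document.getElementsByTagName('body')[0].getElementsByTagName('*'); for (var i=0;

Have you considered using XSLT? (Assuming the HTML markup is well-formed.)

For the sample page above, the parsed tree has roughly this shape: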
{
  "type" : "Application",
  "attrs" : { "id" : "app" },
  "children" : [
    { "type" : "Label" },
    { "type" : "Button" },
    { "type" : "VBox" },
    { "type" : "Group" }
  ]
}