
JavaScript: How do I use XPath to get list tags as an array of objects?


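I am trying to extract an ordered list and return an array containing each list item's tag and content. These are the XPath expressions I have tried so far:

  • //li[div/@class="business info"]
  • //li[div[@class="business info"]
  • //li[descendant::div[@class="business info"]
  • //li[div[@class="business info"]/h2/a]

Is this the right approach, or should I use RegExp? I am sharing what I have tried so far so that the problem is easier to dig into.

Code: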

    const IGNORE = ['style', 'script'];
    const NONWHITESPACE_RE = /\S/;
    const result = document.evaluate(
        '//*[child::text()]',
        document,
        null,
        XPathResult.ORDERED_NODE_SNAPSHOT_TYPE,
        null
    );
    const businessInfo = [];
    for (let i = 0, j = result.snapshotLength; i < j; i++) {
        const element = result.snapshotItem(i);
        if (IGNORE.includes(element.tagName.toLowerCase())) {
            continue;
        }
        const nodes = [...element.childNodes];
        for (const node of nodes) {
            if (node.nodeType !== document.TEXT_NODE) {
                continue;
            }
            if (node.nodeValue.search(NONWHITESPACE_RE) === -1) {
                continue;
            }
            businessInfo.push({
                tag: element.tagName.toLowerCase(),
                text: node.nodeValue.trim()
            });
        }
    }
    console.log(businessInfo);
    
These are the resources I used for reference.


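  • It looks like my case does not need XPath after all. I solved it by using a selector to find the innerText for each property, and then I assembled an array of objects from those properties. The output is now what I actually expected: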

    const businessInfo = [];
    const elements = document.querySelectorAll('ol > li > div.business-info');
    elements.forEach((element) => {
        // Collect the fields of one list entry in its own object.
        const companyInfo = {};
        try {
            companyInfo.name = element.querySelector('h2 > a').innerText;
            companyInfo.phone = element.querySelector('a > span.phone').innerText;
            companyInfo.address = element.querySelector('div > span.address-main > span:nth-child(1)').innerText;
            companyInfo.locality = element.querySelector('div > span.address-main > span:nth-child(2)').innerText;
            companyInfo.postalCode = element.querySelector('div > span.postcode').innerText;
        } catch (exception) {
            // A selector matched nothing; keep the fields read before the failure.
        }
        businessInfo.push(companyInfo);
    });
    console.log(businessInfo);
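
With a single try block, the first missing element skips every field after it. As a minimal sketch (not the answerer's code), the same selectors can be looked up one field at a time, so a missing element yields null for that field only. The names `FIELD_SELECTORS` and `extractBusinessInfo` are hypothetical, and the function takes the root node as a parameter so it can run against any object that offers `querySelectorAll`.

```javascript
// Hypothetical field-to-selector map; the selectors are the ones used in the answer.
const FIELD_SELECTORS = {
    name: 'h2 > a',
    phone: 'a > span.phone',
    address: 'div > span.address-main > span:nth-child(1)',
    locality: 'div > span.address-main > span:nth-child(2)',
    postalCode: 'div > span.postcode',
};

// Build one plain object per matched list entry.
// A field whose selector matches nothing becomes null instead of throwing.
function extractBusinessInfo(root, itemSelector = 'ol > li > div.business-info') {
    return Array.from(root.querySelectorAll(itemSelector), (element) => {
        const company = {};
        for (const [field, selector] of Object.entries(FIELD_SELECTORS)) {
            const match = element.querySelector(selector);
            company[field] = match ? match.innerText.trim() : null;
        }
        return company;
    });
}

// In a browser, run it against the current page.
if (typeof document !== 'undefined') {
    console.log(extractBusinessInfo(document));
}
```

Keeping the selectors in one map also means that adding a field is a one-line change and does not need another statement inside a try block.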
    

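What exactly is the problem? The code in your JSFiddle seems to work.

@JackFleeting I want the output to be something like the array of objects shared here, produced automatically. I don't know how, for example, the attribute itemprop='streetAddress' would map to the desired object property address.

    const businessInfo = [
        {
            name: 'Company Ltd',
            phone: '0123456789',
            address: '21 Largo Road',
            locality: 'Focus',
            postal: 'KY168NH'
        },
        {
            name: 'Company Ltd1',
            phone: '0123456789',
            address: 'ECR Road',
            locality: 'St Andrews',
            postal: '800826'
        },
    ];

An XPath expression selects a node set or produces a scalar value (string, number, or boolean). An array is not the same concept as a set. In addition, a node-set result has to be exchanged through an API powerful enough to convert to and from the XPath data model, as the DOM API does. For the DOM and the problem at hand, using CSS selectors or XPath expressions comes to the same thing, because you have already built your own result object property by property.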
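
To illustrate how an attribute such as itemprop='streetAddress' could be mapped to an object property like address, here is a minimal XPath-based sketch. It rests on several assumptions: only itemprop='streetAddress' appears in the discussion, so the other itemprop values, the item expression, and the names `FIELD_XPATHS` and `extractWithXPath` are hypothetical. The two numeric constants are the standard values of `XPathResult.STRING_TYPE` and `XPathResult.ORDERED_NODE_SNAPSHOT_TYPE`.

```javascript
// Standard numeric values of XPathResult.STRING_TYPE and
// XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, spelled out so the function
// does not depend on a global XPathResult object.
const STRING_TYPE = 2;
const ORDERED_NODE_SNAPSHOT_TYPE = 7;

// Hypothetical mapping from output property to an XPath relative to each <li>.
// Only itemprop='streetAddress' comes from the thread; the rest are assumed.
const FIELD_XPATHS = {
    name: "string(.//h2/a)",
    phone: "string(.//*[@itemprop='telephone'])",
    address: "string(.//*[@itemprop='streetAddress'])",
    locality: "string(.//*[@itemprop='addressLocality'])",
    postal: "string(.//*[@itemprop='postalCode'])",
};

function extractWithXPath(doc, itemXPath = "//ol/li[div[@class='business-info']]") {
    // The snapshot is a node set; the loop below turns it into an array.
    const items = doc.evaluate(itemXPath, doc, null, ORDERED_NODE_SNAPSHOT_TYPE, null);
    const businessInfo = [];
    for (let i = 0; i < items.snapshotLength; i++) {
        const li = items.snapshotItem(i);
        const company = {};
        for (const [field, xpath] of Object.entries(FIELD_XPATHS)) {
            // string(...) yields '' when nothing matches, so no try/catch is needed.
            company[field] = doc.evaluate(xpath, li, null, STRING_TYPE, null).stringValue.trim();
        }
        businessInfo.push(company);
    }
    return businessInfo;
}

// In a browser, run it against the current page.
if (typeof document !== 'undefined') {
    console.log(extractWithXPath(document));
}
```

The node set is selected once, and each property is then read with an expression evaluated relative to the current list item. This is the same property-by-property construction as the selector version, only written in XPath.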
    