Using YQL multi-query & XPath to parse HTML: how do I escape nested quotes?

The title is more complicated than it needs to be. Here is the problem:

SELECT * 
FROM query.multi 
WHERE queries="
    SELECT * 
        FROM html 
        WHERE url='http://www.stumbleupon.com/url/http://www.guildwars2.com' 
        AND xpath='//li[@class=\"listLi\"]/div[@class=\"views\"]/a/span';
    SELECT * 
        FROM xml 
        WHERE url='http://services.digg.com/1.0/endpoint?method=story.getAll&link=http://www.guildwars2.com';
    SELECT * 
        FROM json 
        WHERE url='http://api.tweetmeme.com/url_info.json?url=http://www.guildwars2.com';
    SELECT * 
        FROM xml 
        WHERE url='http://api.facebook.com/restserver.php?method=links.getStats&urls=http://www.guildwars2.com';
    SELECT * 
        FROM json 
        WHERE url='http://www.reddit.com/button_info.json?url=http://www.guildwars2.com'"
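Specifically, this line,

xpath='//li[@class=\"listLi\"]/div[@class=\"views\"]/a/span'
is the problem. Because of the quoting, I have to nest quotes three levels deep, and I have run out of quote characters. I have tried the following variations without success: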

//no attribute quoting
xpath='//li[@class=listLi]/div[@class=views]/a/span' 

//try to quote attribute w/ backslash & single quote
xpath='//li[@class=\'listLi\']/div[@class=\'views\']/a/span'

//try to quote attribute w/ backslash & double quote
xpath='//li[@class=\"listLi\"]/div[@class=\"views\"]/a/span'

//try to quote attribute with double single quotes, like SQL
xpath='//li[@class=''listLi'']/div[@class=''views'']/a/span'

//try to quote attribute with double double quotes, like SQL
xpath='//li[@class=""listLi""]/div[@class=""views""]/a/span'

//try to quote attribute with quote entities
xpath='//li[@class=&quot;listLi&quot;]/div[@class=&quot;views&quot;]/a/span'

//try to surround XPath with backslash & double quote
xpath=\"//li[@class='listLi']/div[@class='views']/a/span\"

//try to surround XPath with double double quote
xpath=""//li[@class='listLi']/div[@class='views']/a/span""
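None of them worked.

I have not found much about escaping XPath strings, and everything I did find seems to be a variation on either using concat (which does not help here, because neither ' nor " is available) or HTML entities. Leaving the attribute values unquoted does not raise an error, but the query fails because that is not the XPath string I actually need.

I do not see anything in the YQL documentation about how to handle escaping. I realize this is an edge case, but I was hoping they would provide some kind of escaping guide.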

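I came up with a solution that does not really answer my original question, but it does solve the problem.

The data.html.cssselect table takes a CSS selector and parses it into XPath, which avoids the nasty escaping problem.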

SELECT *
FROM query.multi 
    WHERE queries="
        SELECT * 
            FROM data.html.cssselect 
            WHERE url='http://www.stumbleupon.com/url/http://www.guildwars2.com' 
            AND css='li.listLi div.views a span';
        SELECT * 
            FROM xml 
            WHERE url='http://services.digg.com/1.0/endpoint?method=story.getAll&link=http://www.guildwars2.com';
        SELECT * 
            FROM json 
            WHERE url='http://api.tweetmeme.com/url_info.json?url=http://www.guildwars2.com';
        SELECT * 
            FROM xml 
            WHERE url='http://api.facebook.com/restserver.php?method=links.getStats&urls=http://www.guildwars2.com';
        SELECT * 
            FROM json 
            WHERE url='http://www.reddit.com/button_info.json?url=http://www.guildwars2.com'"
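
To illustrate what turning a CSS selector into XPath involves, here is a minimal TypeScript sketch. `cssToXPath` is a hypothetical helper that handles only tag names, `.class` suffixes, and descendant combinators; it is an assumption for illustration and not the implementation behind data.html.cssselect, which this thread does not show.

```typescript
// Hypothetical helper (not part of YQL): converts a tiny subset of CSS
// (tag names, .class suffixes, descendant combinators) into XPath.
function cssToXPath(selector: string): string {
  const steps = selector
    .trim()
    .split(/\s+/)
    .map((part: string): string => {
      // "li.listLi" -> tag "li", classes ["listLi"]
      const [tag, ...classes] = part.split(".");
      // Match one class token inside a space-separated class attribute.
      const predicates = classes
        .map(
          (c: string): string =>
            `[contains(concat(' ', normalize-space(@class), ' '), ' ${c} ')]`
        )
        .join("");
      // A bare ".foo" has an empty tag name, so match any element.
      return (tag || "*") + predicates;
    });
  // The CSS descendant combinator (whitespace) maps to "//" in XPath.
  return "//" + steps.join("//");
}

const convertedXPath = cssToXPath("li.listLi div.views a span");
console.log(convertedXPath);
```

The generated XPath still contains single quotes. The workaround presumably helps because that conversion happens on the table's side, so those quotes never have to be written inside the nested query string.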

You need to escape any quote characters inside your XPath query with a double backslash. In other words:

SELECT * FROM query.multi 
WHERE queries="
    SELECT * 
        FROM html 
        WHERE url='http://www.stumbleupon.com/url/http://www.guildwars2.com' 
        AND xpath='//li[@class=\\'listLi\\']/div[@class=\\'views\\']/a/span';
    SELECT * 
        FROM xml 
        WHERE url='http://services.digg.com/1.0/endpoint?method=story.getAll&link=http://www.guildwars2.com';
    SELECT * 
        FROM json 
        WHERE url='http://api.tweetmeme.com/url_info.json?url=http://www.guildwars2.com';
    SELECT * 
        FROM xml 
        WHERE url='http://api.facebook.com/restserver.php?method=links.getStats&urls=http://www.guildwars2.com';
    SELECT * 
        FROM json 
        WHERE url='http://www.reddit.com/button_info.json?url=http://www.guildwars2.com'"
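
When the query is assembled in JavaScript or TypeScript, the string literal syntax consumes one level of backslashes, so each backslash that should reach YQL has to be written twice in the source. The sketch below applies the double-backslash rule from this answer programmatically. `xpathClause` and `buildMultiQuery` are hypothetical helper names introduced for illustration, and the sketch assumes the XPath uses only single quotes internally.

```typescript
// Hypothetical helpers for illustration; not part of YQL or any library.

// Wrap an XPath expression in a single-quoted YQL literal, prefixing every
// single quote inside it with two backslash characters.
function xpathClause(xpath: string): string {
  // The literal "\\\\'" is three characters at runtime: \ \ '
  const escaped = xpath.replace(/'/g, "\\\\'");
  return `xpath='${escaped}'`;
}

// Join sub-queries with ";" and wrap them in the double-quoted queries value.
function buildMultiQuery(subQueries: string[]): string {
  return `SELECT * FROM query.multi WHERE queries="${subQueries.join("; ")}"`;
}

const pageUrl = "http://www.guildwars2.com";

const htmlQuery =
  `SELECT * FROM html WHERE url='http://www.stumbleupon.com/url/${pageUrl}' AND ` +
  xpathClause("//li[@class='listLi']/div[@class='views']/a/span");

const redditQuery =
  `SELECT * FROM json WHERE url='http://www.reddit.com/button_info.json?url=${pageUrl}'`;

const multiQuery = buildMultiQuery([htmlQuery, redditQuery]);
console.log(multiQuery);
```

Doing the replacement in one helper keeps the hand-written XPath readable and puts the backslash counting in a single place.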

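By default, JS was eating the \\ when I tried to use this in my page. I had to do this nonsense to get it to work.

Ah, got it. It is crude, though: "SELECT * FROM html WHERE url='stumbleupon.com/url/%url%' AND xpath='//li[@class=\\"+"\\'listLi\\\\"+"\']/div[@class=\\\"+"\\\'views\\\\\\\+']/a/span"

Oddly, data.html.cssselect appears to be faster than selecting from html with xpath, even though data.html.cssselect just converts to a SELECT from html with xpath. Weird.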