Python: how to get the required data from a string

For example, I have this string:

s = '\r\n<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> \r\n\r\n<p>\r\n\t\r\n\t\t<A HREF="../temp/Table 32012419252223.xls">Click to download</A>\r\n\r\n\t\r\n\t</P>'
link = "www.example.com/flow/hardway/joshing/high"

Now I need to replace "joshing/high" in the link above with the result from the first link (/temp/Table 32012419252223.xls).

If you want to parse HTML or XML documents, use a proper library. An example using lxml and XPath:

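from lxml.html.soupparser import fromstring  # requires BeautifulSoup
from urllib.parse import urljoin  # Python 3; on Python 2 use: from urlparse import urljoin

s = 'yourhtml'  # the HTML string from the question
h = fromstring(s)
# link is the base URL defined in the question
print(urljoin(link, h.xpath('//a[1]/@href')[0]))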

This gets the first link on the page and resolves it against link. If the HTML is more complex, you can use a more complex XPath expression.
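
If installing lxml is not an option, the same idea can be sketched with the standard library alone. This is a swapped-in technique, not the approach from the answer above: it uses html.parser instead of XPath, and the names FirstLinkParser and first_link are hypothetical helpers introduced only for illustration. It assumes Python 3 and reuses s and link from the question.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FirstLinkParser(HTMLParser):
    """Hypothetical helper: remembers the href of the first <a> tag seen."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names,
        # so <A HREF="..."> is reported as tag "a" with attribute "href".
        if tag == "a" and self.href is None:
            self.href = dict(attrs).get("href")


def first_link(html):
    """Return the href of the first link in html, or None if there is none."""
    parser = FirstLinkParser()
    parser.feed(html)
    return parser.href


# Values taken from the question.
s = '\r\n<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> \r\n\r\n<p>\r\n\t\r\n\t\t<A HREF="../temp/Table 32012419252223.xls">Click to download</A>\r\n\r\n\t\r\n\t</P>'
link = "www.example.com/flow/hardway/joshing/high"

href = first_link(s)
# urljoin drops the last path segment of link ("high") and lets ".."
# remove one more ("joshing") before appending the rest of href.
result = urljoin(link, href)
print(result)
```

Because urljoin follows the usual relative-URL rules, this only replaces "joshing/high" here because the href starts with a single "..". A link with a different relative path would resolve differently.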

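Please clarify: for part 1, if I have a different example, how do I know what to extract from it? For part 2, if I have a different example, how do I know which part of the link should be replaced?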