Parsing an href tag from a website using the bash shell

bash shell curl


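I have a website that contains a URL. The URL is in an href attribute.

I need to parse the page and extract the "href" value.

The page contains only one "href" attribute, and the tag that carries it has no class name.

I am using the bash shell with curl.

So far, I have tried the following: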

curl | grep "href=" | cut -d'>' -f4 | cut -d'

If you want to keep the href= part:

curl -s http://MyWebsite | grep -E -io 'href="[^\"]+"'

If you only want the value, without href=:

curl -s http://MyWebsite | grep -E -io 'href="[^\"]+"' | awk -F\" '{print$2}'
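To try the pipeline above without touching the network, here is a minimal sketch that runs the same grep and awk steps against a sample string. The function name extract_href, the variable names, and the example.com URL are hypothetical and used only for illustration; the sample string stands in for the output of curl -s http://MyWebsite. The bracket expression is written as [^"] here, on the assumption that a backslash should not end the match.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value of every double-quoted href
# attribute found in the HTML read from standard input.
extract_href() {
  # -E extended regex, -i ignore case, -o print only the matched text
  grep -Eio 'href="[^"]+"' |
    # Split on double quotes; field 2 is the text between them
    awk -F'"' '{print $2}'
}

# Sample page (an assumption) standing in for: curl -s http://MyWebsite
sample_html='<html><body><p>Download: <a href="http://example.com/file.zip">here</a></p></body></html>'

result=$(printf '%s\n' "$sample_html" | extract_href)
printf '%s\n' "$result"
```

With a real page, the same function could be fed from curl instead: curl -s http://MyWebsite | extract_href. This sketch only handles href values wrapped in double quotes.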

I know there is only one href, but just in case, here are the URLs of all anchors in an HTML document, using sed and grep:

curl -s http://MyWebsite  | grep -o '<a .*href=.*>' | sed -e 's/<a /\n<a /g' | sed -e 's/<a .*href=['"'"'"]//' -e 's/["'"'"'].*$//' -e '/^$/ d'
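As a possible alternative to the sed pipeline above, the sketch below lets grep -o emit one match per href and then strips the attribute name and the surrounding quotes. It is not the answer's original method. The function name extract_all_hrefs and the sample HTML are hypothetical, and the sketch assumes every href value is quoted (single or double) and sits on one line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every quoted href value read from stdin,
# one per line, accepting either single or double quotes.
extract_all_hrefs() {
  # grep -o prints each match on its own line, even several per input line
  grep -Eio "href=[\"'][^\"']+[\"']" |
    # Drop the leading href= plus opening quote, then the closing quote
    sed -E "s/^[Hh][Rr][Ee][Ff]=[\"']//; s/[\"']\$//"
}

# Sample page (an assumption) standing in for: curl -s http://MyWebsite
sample_html=$(cat <<'EOF'
<p><a href="http://example.com/one">one</a> <a class="x" href='/two'>two</a></p>
<p><A HREF="/three">three</A></p>
EOF
)

all=$(printf '%s\n' "$sample_html" | extract_all_hrefs)
printf '%s\n' "$all"
```

Because the pattern stops at the first quote character of either kind, a double-quoted URL that contains an apostrophe would be cut short. For pages where that matters, a real HTML parser is likely the safer choice.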
