Regex: get the URL scheme, host, and path, but not the filename (PCRE target)


Tags: regex, pcre, url-scheme


Goal: replace the host and path (the location), but keep the filenames (they stay unchanged).

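URL without a subdomain: does not work. The pattern below works when the host has at least one subdomain (e.g. "www.somedomain.com"), but with only a domain + TLD (e.g. "somedomain.com") it fails to capture the path after the host (domain).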

(http[s]?:\/\/([^:\/\s]+)(\/\w+*\/)+

in the following HTML snippet:

junk before tag <img src="https://somedomain.com/wp-content/uploads/2017/10/someimage.jpg" alt="" />Random text after

URL with a subdomain: works, in an HTML snippet where the domain has a subdomain.

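Question: how can I adjust the regex so that, for img src="", it captures the full scheme, domain, and path (but not the filename), both for URLs that include a subdomain and for those that do not?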

https?:\/\/(?:[^\/ ]*\/)*
Explanation:

http       // must start with http
s?         // the s is optional
:\/\/      // followed by ://
(?:        // start of non-capturing group
  [^\/ ]*  // any run of characters other than / or a space
  \/       // ending with /
)          // end of non-capturing group
*          // repeat the non-capturing group zero or more times
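To see what the pattern matches in practice, here is a minimal sketch in Python. It assumes Python's re module rather than PCRE itself (this particular pattern uses only syntax both engines share). The second sample, with a www subdomain, is a hypothetical variant of the question's snippet, since the original subdomain example is not shown.

```python
import re

# The answer's pattern, unchanged. Python's re module is used here as a
# stand-in for PCRE (an assumption of this sketch).
URL_PREFIX = re.compile(r"https?:\/\/(?:[^\/ ]*\/)*")

samples = [
    # Snippet from the question (no subdomain).
    'junk before tag <img src="https://somedomain.com/wp-content/uploads/'
    '2017/10/someimage.jpg" alt="" />Random text after',
    # Hypothetical variant with a subdomain.
    'junk before tag <img src="https://www.somedomain.com/wp-content/uploads/'
    '2017/10/someimage.jpg" alt="" />Random text after',
]

# Each match stops after the last "/" that precedes the filename, because the
# repeated group must end with "/" and cannot cross a space.
prefixes = [URL_PREFIX.search(html).group(0) for html in samples]

for prefix in prefixes:
    print(prefix)
```

Note that the match relies on a space following the closing quote of the src attribute. With markup such as src="..."/> (no space before the slash), the greedy group would run past the filename.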

In the second example, do you want to return www.somedomain.com? I'm not quite sure what the desired output is.

In the first example I want https://somedomain/wp-content/uploads/2017/10/, but I only get https://somedomain/. The second example works as expected.
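Since the stated goal is to replace the location while keeping the filename, the following sketch shows one way the pattern https?:\/\/(?:[^\/ ]*\/)* could be used for that. It again assumes Python's re module instead of PCRE. The relocate helper and the cdn.example.org base URL are hypothetical names introduced only for illustration.

```python
import re

# Matches scheme, host, and directory path, stopping before the filename.
URL_PREFIX = re.compile(r"https?:\/\/(?:[^\/ ]*\/)*")


def relocate(html: str, new_base: str) -> str:
    """Swap the scheme/host/path of every matched URL for new_base.

    The filename is not part of the match, so it is left untouched.
    A function is passed as the replacement so that new_base is inserted
    literally, without backslash-escape processing.
    """
    return URL_PREFIX.sub(lambda match: new_base, html)


html = (
    'junk before tag <img src="https://somedomain.com/wp-content/uploads/'
    '2017/10/someimage.jpg" alt="" />Random text after'
)

# Hypothetical new location for the images.
result = relocate(html, "https://cdn.example.org/media/")
print(result)
```

The new base should end with a slash, because the matched prefix includes the final "/" before the filename.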