PHP RegExp: match all links except YouTube links with a video

Tags: php, regex, youtube, regex-negation, negative-lookahead

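What I have so far:

(?(DEFINE)
# URL
(?<proto>(https?:)?\/\/)
(?<port>:[0-9]{2,5})
(?<tld>(?:com|net|info|biz|us|org))
(?<path>(\/([a-z0-9+%-]\.?)+)*\/?)
(?<query>\?[a-z+&$_.-][a-z0-9;:@&%=+\/.-]*)
(?<hash>\#[a-z_.-][a-z0-9+$%_.-]*)
(?<subdomain>([a-z0-9\-\.]+)\.)
# Exceptions
(?<yt_domain>(www\.)?(youtube\.com|youtu\.be)\/)
(?<yt_hash>([\w-]{10,12})+)
(?<yt_video>\g<proto>?\g<yt_domain>+(watch)?(\/embed\/|\?v=)+\g<yt_hash>+)
)
# Capture
((?!\g<yt_video>+.*)
(\g<proto>?
\g<subdomain>
\g<tld>
\g<port>?
\g<path>?
\g<query>?
\g<hash>?
))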
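I managed to capture links in any format, but for some reason my negative lookahead (see \g<yt_video>) does not exclude YouTube video links from the list of matches.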

Lines that should be matched, partially or completely:

http:www.google.com/
http://www.google.com/

://www.google.com/
www.google.com/
www.google.com:8000
www.google.com/?key=value
github.io
www.google.com/abc/def/ijk#123
www.google.com/abc/def/ijk?v=123123
www.google.com/abc/def/watch?v=1231


However, it should skip (not match) lines that contain a YouTube video ID:



music.youtube.com/embed/y19EaW2X7ac




Thanks in advance for any help or hints on why the negative lookahead does not reject these lines.

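After experimenting with it a bit: the way to diagnose what is wrong with the yt_video function is to comment out the rest of the pattern and see what that function matches on its own.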

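What you have to understand about negative lookahead assertions is that all they tell the engine is that, at the current position, the thing ahead must not be there.
When the assertion fails, all the engine does is advance the position by 1 and retry, until it reaches a position where the assertion passes.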
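
To make the retry behaviour concrete, here is a minimal PHP sketch. The pattern is a simplified stand-in of my own, not the regex from the question, and the helper name first_match is hypothetical.

```php
<?php
// Hypothetical helper: return the first match of $pattern in $subject, or null.
function first_match(string $pattern, string $subject): ?string
{
    return preg_match($pattern, $subject, $m) === 1 ? $m[0] : null;
}

// Simplified stand-in: "something ending in .com that is not youtube.com".
$pattern = '/(?!youtube\.com)[a-z.]+\.com/';

// At offset 0 the lookahead rejects "youtube.com", so the engine moves on to
// offset 1, where the assertion passes, and matches from there.
echo first_match($pattern, 'youtube.com'), PHP_EOL; // outube.com
```

The link is not excluded at all; it is merely matched one character later, which is the partial match described next.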

Because you have no anchors, it then matches part of the line with the rest of the pattern.

So you have to move past that text to avoid partial matches.

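There are a couple of ways to solve this, but by far the easiest is to match it and then use (*SKIP)(*FAIL) to skip past it.
The engine does not actually report it as a match, but it moves the current position to just past it and then tries again.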
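
A small PHP sketch of the skip/fail idiom follows. It uses a deliberately simplified pattern (an assumption of this example, not the full regex from this answer), and the helper name non_youtube_links is hypothetical.

```php
<?php
// Hypothetical helper built on a simplified stand-in pattern.
function non_youtube_links(string $text): array
{
    $pattern = '~
        (?: www\. )? youtube\.com / \S*   # consume the whole unwanted link
        (*SKIP) (*FAIL)                   # discard it and resume right after it
      |
        [a-z0-9.-]+ \.com (?: / \S* )?    # the links we actually want to keep
    ~xi';
    preg_match_all($pattern, $text, $m);
    return $m[0];
}

$text = 'www.google.com/abc www.youtube.com/watch?v=y19EaW2X7ac github.com';
print_r(non_youtube_links($text)); // www.google.com/abc and github.com
```

Because the next attempt starts after the consumed YouTube link, no leftover fragment of it can be picked up by the second alternative.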

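I got rid of the unnecessary capture groups (or converted them to clusters), added the skip/fail, converted the TLDs into a ternary trie, and formatted the regex for readability.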

A regex formatting tool can do this for you and also gives you a built-in engine for testing the regex.

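Also note that (?:[\w-]{10,12})+ has a granularity of 10 to 12 characters at a time, whereas [\w-]{10,} matches 10 or more characters.
It lives in the yt_hash function, so when you call it with (?&yt_hash)+ the extra quantifier is redundant.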
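
The difference in granularity can be seen with a short PHP sketch. The anchors are an assumption added for this illustration (the patterns in the answer are unanchored), and the helper name fits is hypothetical.

```php
<?php
// Hypothetical helper: does a string of $len word characters fit $pattern?
function fits(string $pattern, int $len): bool
{
    return preg_match($pattern, str_repeat('a', $len)) === 1;
}

// Anchors are added here so that the difference becomes visible.
$chunked = '/^(?:[\w-]{10,12})+$/'; // whole chunks of 10 to 12 characters
$open    = '/^[\w-]{10,}$/';        // any length of 10 or more

foreach ([10, 13, 20] as $len) {
    printf("%d chars: chunked=%s open=%s\n", $len,
        fits($chunked, $len) ? 'yes' : 'no',
        fits($open, $len) ? 'yes' : 'no');
}
// 10 chars: chunked=yes open=yes
// 13 chars: chunked=no open=yes
// 20 chars: chunked=yes open=yes
```

A 13-character string cannot be split into chunks of 10 to 12 characters, so the chunked form rejects it while the open-ended form accepts it.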

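Since it now successfully skips some of the lines via the call to (?&yt_video).* , you will have to investigate the individual parts of that function to see why it does not match, and therefore does not skip, the others.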
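
One way to carry out that investigation is to run the yt_video function on its own, as in this PHP sketch. The sub-patterns are copied from the expanded regex in this answer; the helper name matches_yt_video and the first two sample lines are hypothetical.

```php
<?php
// Hypothetical helper: test the yt_video function in isolation.
function matches_yt_video(string $line): bool
{
    $pattern = '~
        (?(DEFINE)
            (?<proto>     (?: https?: )? // )
            (?<yt_domain> (?: www\. )? (?: youtube\.com | youtu\.be ) / )
            (?<yt_hash>   (?: [\w-]{10,12} )+ )
            (?<yt_video>  (?&proto)? (?&yt_domain)+ (?: watch )?
                          (?: /embed/ | \?v= )+ (?&yt_hash)+ )
        )
        (?&yt_video)   # the rest of the pattern is simply left off
    ~xi';
    return preg_match($pattern, $line) === 1;
}

$samples = [
    'https://www.youtube.com/watch?v=y19EaW2X7ac', // hypothetical sample
    'youtube.com/embed/y19EaW2X7ac',               // hypothetical sample
    'music.youtube.com/embed/y19EaW2X7ac',         // from the question
];
foreach ($samples as $line) {
    printf("%-45s %s\n", $line, matches_yt_video($line) ? 'matched' : 'not matched');
}
```

With these samples only the watch?v= form is matched. One reading of that result (mine, not a statement from the answer) is that yt_domain already consumes the trailing /, so the /embed/ alternative never finds its leading slash.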

Here it is, expanded first and then compressed:

 (?i)
 (?(DEFINE)
      # URL
      (?<proto>                                          # (1 start)
           (?: https?: )?
           //
      )                                                  # (1 end)
      (?<port> : [0-9]{2,5} )                            # (2)
      (?<tld>                                            # (3 start)
           (?:
                a
                (?:
                     [cd] 
                  |  e
                     (?: ro )?
                  |  [fgil-oqr] 
                  |  s
                     (?: ia )?
                  |  [tuwxz] 
                )
             |  b
                (?: [abd-h] | iz? | [jl-oq-tvwyz] )
             |  c
                (?:
                     at?
                  |  [cdf-ik-n] 
                  |  o
                     (?: m | op )?
                  |  [ru-z] 
                )
             |  d [ejkmoz] 
             |  e [ceghr-u] 
             |  f [i-kmor] 
             |  g [abd-il-np-uwy] 
             |  h [kmnrtu] 
             |  i
                (?:
                     [delm] 
                  |  n
                     (?: fo | t )?
                  |  [oq-t] 
                )
             |  j
                (?:
                     [em] 
                  |  o
                     (?: bs )?
                  |  p
                )
             |  k [eg-imnprwyz] 
             |  l [a-cikr-vy] 
             |  m
                (?:
                     [ac-hk] 
                  |  lc?
                  |  [mn] 
                  |  o
                     (?: bi )?
                  |  [p-t] 
                  |  u
                     (?: seum )?
                  |  [v-z] 
                )
             |  n
                (?:
                     a
                     (?: me )?
                  |  c
                  |  et?
                  |  [fgilopruz] 
                )
             |  o
                (?: m | rg )
             |  p
                (?: [ae-hk-n] | ost | ro? | [stwy] )
             |  qa
             |  r [eosuw] 
             |  s
                (?:
                     [a-eg-or] 
                  |  t
                     (?: udio )?
                  |  [uvx-z] 
                )
             |  t
                (?:
                     [cd] 
                  |  el
                  |  [f-hj-p] 
                  |  r
                     (?: avel )?
                  |  [tvwz] 
                )
             |  u [agkmsyz] 
             |  v [aceginu] 
             |  w [fs] 
             |  y [et] 
             |  z [amw] 
           )

      )                                                  # (3 end)
      (?<path>                                           # (4 start)
           (                                                  # (5 start)
                /
                (?: [a-z0-9+%-] \.? )+
           )*                                                 # (5 end)
           /?
      )                                                  # (4 end)
      (?<query> \? [a-z+&$_.-] [a-z0-9;:@&%=+/.-]* )     # (6)
      (?<hash> \# [a-z_.-] [a-z0-9+$%_.-]* )             # (7)
      (?<subdomain>                                      # (8 start)
           [a-z0-9\-\.]+ 
           \.
      )                                                  # (8 end)

      # Exceptions
      (?<yt_domain>                                      # (9 start)
           (?: www\. )?
           (?: youtube\.com | youtu\.be )
           /
      )                                                  # (9 end)
      (?<yt_hash>                                        # (10 start)
           (?: [\w-]{10,12} )+
      )                                                  # (10 end)
      (?<yt_video>                                       # (11 start)
           (?&proto)? (?&yt_domain)+ 
           (?: watch )?
           (?: /embed/ | \?v= )+
           (?&yt_hash)+ 
      )                                                  # (11 end)
 )

 # Capture
 (                                                  # (12 start)
      (?&yt_video) .* 
      (*SKIP) (*FAIL) 
   |  
      (?&proto)? 
      (?&subdomain) 
      (?&tld) 
      (?&port)? 
      (?&path)? 
      (?&query)? 
      (?&hash)? 
 )                                                  # (12 end)

Compressed:

/(?i)(?(DEFINE)(?<proto>(?:https?:)?\/\/)(?<port>:[0-9]{2,5})(?<tld>(?:a(?:[cd]|e(?:ro)?|[fgil-oqr]|s(?:ia)?|[tuwxz])|b(?:[abd-h]|iz?|[jl-oq-tvwyz])|c(?:at?|[cdf-ik-n]|o(?:m|op)?|[ru-z])|d[ejkmoz]|e[ceghr-u]|f[i-kmor]|g[abd-il-np-uwy]|h[kmnrtu]|i(?:[delm]|n(?:fo|t)?|[oq-t])|j(?:[em]|o(?:bs)?|p)|k[eg-imnprwyz]|l[a-cikr-vy]|m(?:[ac-hk]|lc?|[mn]|o(?:bi)?|[p-t]|u(?:seum)?|[v-z])|n(?:a(?:me)?|c|et?|[fgilopruz])|o(?:m|rg)|p(?:[ae-hk-n]|ost|ro?|[stwy])|qa|r[eosuw]|s(?:[a-eg-or]|t(?:udio)?|[uvx-z])|t(?:[cd]|el|[f-hj-p]|r(?:avel)?|[tvwz])|u[agkmsyz]|v[aceginu]|w[fs]|y[et]|z[amw]))(?<path>(\/(?:[a-z0-9+%-]\.?)+)*\/?)(?<query>\?[a-z+&$_.-][a-z0-9;:@&%=+\/.-]*)(?<hash>\#[a-z_.-][a-z0-9+$%_.-]*)(?<subdomain>[a-z0-9\-\.]+\.)(?<yt_domain>(?:www\.)?(?:youtube\.com|youtu\.be)\/)(?<yt_hash>(?:[\w-]{10,12})+)(?<yt_video>(?&proto)?(?&yt_domain)+(?:watch)?(?:\/embed\/|\?v=)+(?&yt_hash)+))((?&yt_video).*(*SKIP)(*FAIL)|(?&proto)?(?&subdomain)(?&tld)(?&port)?(?&path)?(?&query)?(?&hash)?)/
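
For completeness, here is a hedged sketch of how a pattern of this shape might be applied from PHP with preg_match_all. To keep it readable it assumes the short TLD list from the question instead of the full TLD trie, and the helper name extract_links is hypothetical.

```php
<?php
// Hypothetical helper. Assumption: the short TLD list from the question is
// used in place of the full TLD trie; the other functions follow the answer.
function extract_links(string $text): array
{
    $pattern = '~
        (?(DEFINE)
            (?<proto>     (?: https?: )? // )
            (?<port>      : [0-9]{2,5} )
            (?<tld>       (?: com | net | info | biz | us | org ) )
            (?<path>      (?: / (?: [a-z0-9+%-] \.? )+ )* /? )
            (?<query>     \? [a-z+&$_.-] [a-z0-9;:@&%=+/.-]* )
            (?<hash>      \# [a-z_.-] [a-z0-9+$%_.-]* )
            (?<subdomain> [a-z0-9\-\.]+ \. )
            (?<yt_domain> (?: www\. )? (?: youtube\.com | youtu\.be ) / )
            (?<yt_hash>   (?: [\w-]{10,12} )+ )
            (?<yt_video>  (?&proto)? (?&yt_domain)+ (?: watch )?
                          (?: /embed/ | \?v= )+ (?&yt_hash)+ )
        )
        (?:
            (?&yt_video) .* (*SKIP) (*FAIL)
          |
            (?&proto)? (?&subdomain) (?&tld) (?&port)?
            (?&path)? (?&query)? (?&hash)?
        )
    ~xi';
    preg_match_all($pattern, $text, $m);
    return $m[0];
}

$text = "http://www.google.com/\n"
      . "https://www.youtube.com/watch?v=y19EaW2X7ac\n"
      . "www.google.com:8000";
print_r(extract_links($text)); // http://www.google.com/ and www.google.com:8000
```

The input is split across lines on purpose: the .* after (?&yt_video) discards the remainder of the line that contains the video link, so any other link on that same line would be skipped as well.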