CSS: Trying to remove hex codes from regular expression results

css regex filter

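This is my first question here, so straight to the point:

I'm pretty new to regular expressions. To learn them a bit better and to create something I can actually use, I'm trying to write a regexp that finds all the CSS tags (ID and class selectors) in a CSS file.

So far, I'm using:

[#.]([a-zA-Z0-9_\-])*

It works pretty well and finds `#TB_window` as well as `#TB_window img#TB_Image` and `.TB_Image#TB_window`.

The problem is that it also finds the hex color codes in the CSS file, e.g. `#FFF` or `#eaeaea`. It also picks up `.png`, `.jpg` and the `.75` in `0.75`.

It is perfectly logical that these are found, but isn't there a smart workaround?
Something like excluding anything between the braces `{..}`?
(I'm pretty sure that's possible, but I don't have much regexp experience yet.)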
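
To make the problem concrete, here is a minimal Python sketch of what the pattern picks up. The sample CSS is made up for illustration, and the group is written as non-capturing so that the full match is returned; neither detail comes from the original post.

```python
import re

# The pattern from the question, with a non-capturing group so that
# group(0) is the whole match.
NAIVE = re.compile(r"[#.](?:[a-zA-Z0-9_\-])*")

# Hypothetical sample CSS, made up for this illustration.
css = """
#TB_window img#TB_Image {
  color: #FFF;
  background: #eaeaea url(loading.png);
  opacity: 0.75;
}
"""

matches = [m.group(0) for m in NAIVE.finditer(css)]
print(matches)
# ['#TB_window', '#TB_Image', '#FFF', '#eaeaea', '.png', '.75']
```

The last four hits are the unwanted ones: two hex colors, a file extension and the fractional part of a number.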

Thanks in advance!

Cheers,

Mike

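First off, I don't see how the RE you posted would find `.TB_Image#TB_window`. You could do something like this:

/^[#\.]([a-zA-Z0-9_\-]*)\s*{?\s*$/

This finds any occurrence of `#` or `.` at the beginning of a line, followed by the tag, optionally followed by a `{`, and then the end of the line.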

Note that this does NOT work for `.TB_Image { something: 0; }` (all on one line) or for lines like `div.mydivclass`, because the `.` is not at the beginning of the line.
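
A rough Python sketch of this line-anchored approach, under a couple of assumptions: the sample CSS is invented, `re.MULTILINE` is used so that `^` and `$` apply per line, and `[ \t]*` stands in for `\s*` so a match cannot spill over a newline.

```python
import re

# Line-anchored variant of the pattern above. re.MULTILINE makes ^ and $
# match at every line boundary; [ \t]* replaces \s* so that a match never
# runs across a newline (a small deviation from the answer's pattern).
LINE_SELECTOR = re.compile(
    r"^[#.]([a-zA-Z0-9_\-]*)[ \t]*\{?[ \t]*$", re.MULTILINE
)

# Hypothetical sample input.
css = """#TB_window {
  color: #FFF;
}
.TB_Image { margin: 0; }
div.mydivclass {
  opacity: 0.75;
}
.TB_caption
{
  background: url(bg.png);
}
"""

names = LINE_SELECTOR.findall(css)
print(names)  # ['TB_window', 'TB_caption']
```

As described above, the one-line rule and `div.mydivclass` are skipped, but so are the hex color, `.png` and `0.75`, since none of them starts a line.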

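Edit: I don't think nested braces are allowed in ordinary CSS rules (at-rules such as @media do nest blocks and would need separate handling), so if you read in all the data and get rid of the newlines, you could do something like this:

/([a-zA-Z0-9_\-]*([#\.][a-zA-Z0-9_\-]+)+\s*,?\s*)+{.*?}/

There is also a way to tell the regex to ignore newlines (usually a "dotall" or /s flag), but I never seem to get that right.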
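
One way to avoid flattening the file is the "dotall" flag, which is presumably the option alluded to above. The sketch below is only an adaptation of the second pattern, not the answer's own code: the braces are escaped, a lazy `.*?` makes each match stop at the first closing brace, and the sample CSS is made up.

```python
import re

# Adapted form of the second pattern (assumptions of this sketch):
#  - "{" and "}" are escaped,
#  - ".*?" is lazy, so a match ends at the first closing brace,
#  - re.DOTALL lets "." match newlines, so the input need not be flattened.
RULE = re.compile(
    r"((?:[a-zA-Z0-9_\-]*(?:[#.][a-zA-Z0-9_\-]+)+\s*,?\s*)+)\{.*?\}",
    re.DOTALL,
)

# Hypothetical sample input.
css = """#TB_window img#TB_Image {
  color: #FFF;
}
.TB_Image#TB_window, #TB_caption {
  background: #eaeaea url(loading.png);
}
"""

selectors = [m.group(1).strip() for m in RULE.finditer(css)]
print(selectors)
# ['#TB_window img#TB_Image', '.TB_Image#TB_window, #TB_caption']
```

Because each declaration block is consumed as part of the match, the hex colors and `.png` inside it are never tested as selectors. Selectors made only of element names (for example `div p`) are still not matched by this pattern.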

How about this:

([#.]\S+\s*,?)+(?=\{)
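
A quick way to try this pattern in Python is sketched below. The sample CSS is invented, and the group is made non-capturing so that `group(0)` returns the full match. The `(?=\{)` part is a lookahead: a `{` must follow, but it is not included in the match.

```python
import re

# The suggested pattern with a non-capturing group. (?=\{) only checks
# that an opening brace follows; it does not consume it.
BEFORE_BRACE = re.compile(r"(?:[#.]\S+\s*,?)+(?=\{)")

# Hypothetical sample input.
css = """#TB_window {
  color: #FFF;
}
.TB_Image, #TB_caption {
  background: #eaeaea url(loading.png);
}
"""

found = [m.group(0).strip() for m in BEFORE_BRACE.finditer(css)]
print(found)  # ['#TB_window', '.TB_Image, #TB_caption']
```

Note that every whitespace-separated piece has to start with `#` or `.`, so a rule such as `#TB_window img { ... }` would not be matched at all, and minified CSS without spaces may behave differently.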

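Actually, this is not an easy problem to solve with regular expressions, because there are a lot of possibilities. Consider:

- descendant selectors like `#someid ul img`: these are all valid tags, separated by spaces
- tags that do not start with `.` or `#` (i.e. HTML tag names): you would have to provide a list of them in order to match them, since nothing else distinguishes them from property names
- comments
- more that I can't think of right now

I think you should instead consider a CSS parsing library suitable for your preferred language.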

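CSS is a very simple, regular language (as long as you stick to plain rule sets and leave out nested at-rules such as @media), which means it can be parsed with regexes. All there is to it are groups of selectors, each followed by a block of declarations separated by semicolons.

Note that all the regexes in this post should have the verbose and dotall flags set (/x and /s in some languages, re.VERBOSE and re.DOTALL in Python).

To get (selectors, rules) pairs: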
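
A minimal sketch of this step, assuming the simplest possible pattern: everything up to a `{` is the selector group, and everything up to the next `}` is the rule group. The name `PAIR` and the sample CSS are illustrative only.

```python
import re

PAIR = re.compile(
    r"""
    \s*         # Leading whitespace.
    ([^{}]+?)   # Selector group: lazily, anything but braces.
    \s*         # Whitespace before the block.
    \{          # Opening brace.
      \s*
      (.*?)     # Rule group: lazily, anything up to ...
      \s*
    \}          # ... the first closing brace.
    """,
    re.VERBOSE | re.DOTALL,
)

# Hypothetical sample input.
css = """
body, p#abc { font-size: normal; color: #FFF }
span {display: block;} .nothing{}
"""

pairs = PAIR.findall(css)
print(pairs)
```

With this split, a hex color such as `#FFF` ends up in the rule half of a pair and can no longer be mistaken for an ID selector.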

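This will not work in the rare case of a quoted curly brace inside an attribute selector (e.g. `img[src~='{abc}']`) or inside a rule (e.g. `background: url('images/ab{c}.jpg')`). That can be fixed by complicating the regex a bit: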

\s*        # Match any initial space
((?:       # Start the selectors capture group.
  [^{}\"\']            # Any character other than braces or quotes.
  |                    # OR
  \"                   # An opening double quote.
    (?:[^\"\\]|\\.)*   # Either a neither-quote-not-backslash, or an escaped character.
  \"                   # And a closing double quote.
  |                    # OR
  \'(?:[^\'\\]|\\.)*\' # Same as above, but for single quotes.
)+?)       # Ungreedily match all that once or more.
\s*        # Arbitrary spacing again.
\{         # Opening brace.
  \s*      # Arbitrary spacing again.
  ((?:[^{}\"\']|\"(?:[^\"\\]|\\.)*\"|\'(?:[^\'\\]|\\.)*\')*?)
           # The above line is the same as the one in the selector capture group.
  \s*      # Arbitrary spacing again.
\}         # Closing brace.
# This will even correctly identify escaped quotes.

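Whew, that's a handful. But if you approach it in a modular fashion, you'll notice it's not as complex as it seems at first glance.

Now, to split the selectors and the rules, we have to match runs of characters that are either non-separators (where the separator is a comma for selectors and a semicolon for rules) or quoted strings with anything inside them. We'll use the same pattern we used above.

For selectors: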

\s*        # Match any initial space
((?:       # Start the selectors capture group.
  [^,\"\']             # Any character other than commas or quotes.
  |                    # OR
  \"                   # An opening double quote.
    (?:[^\"\\]|\\.)*   # Either a neither-quote-not-backslash, or an escaped character.
  \"                   # And a closing double quote.
  |                    # OR
  \'(?:[^\'\\]|\\.)*\' # Same as above, but for single quotes.
)+?)       # Ungreedily match all that.
\s*        # Arbitrary spacing.
(?:,|$)      # Followed by a comma or the end of a string.

For rules:

\s*        # Match any initial space
((?:       # Start the rule capture group.
  [^;\"\']             # Any character other than semicolons or quotes.
  |                    # OR
  \"                   # An opening double quote.
    (?:[^\"\\]|\\.)*   # Either a neither-quote-not-backslash, or an escaped character.
  \"                   # And a closing double quote.
  |                    # OR
  \'(?:[^\'\\]|\\.)*\' # Same as above, but for single quotes.
)+?)       # Ungreedily match all that.
\s*        # Arbitrary spacing.
(?:;|$)      # Followed by a semicolon or the end of a string.

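Finally, for each rule, we can split (once!) on the colon to get a property-value pair.

Putting all this together in a Python program (the regexes are the same as above, but non-verbose to save space):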
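
What follows is a sketch of how such a program might look, not the answer's original code. The names `unit`, `PAIR`, `SELECTOR`, `RULE` and `parse_css` are made up for this illustration, as is the short demo input; the patterns are compact forms of the three verbose regexes above.

```python
import re
from pprint import pprint

# Quoted strings with backslash escapes (compact forms of the verbose parts).
DQ = r'"(?:[^"\\]|\\.)*"'
SQ = r"'(?:[^'\\]|\\.)*'"


def unit(excluded):
    """One character outside `excluded` and quotes, or one whole quoted string."""
    return "(?:[^%s\"']|%s|%s)" % (excluded, DQ, SQ)


# selectors { rules }
PAIR = re.compile(
    r"\s*(%s+?)\s*\{\s*(%s*?)\s*\}" % (unit("{}"), unit("{}")), re.DOTALL
)
# One selector, terminated by a comma or the end of the string.
SELECTOR = re.compile(r"\s*(%s+?)\s*(?:,|$)" % unit(","), re.DOTALL)
# One rule, terminated by a semicolon or the end of the string.
RULE = re.compile(r"\s*(%s+?)\s*(?:;|$)" % unit(";"), re.DOTALL)


def parse_css(css):
    """Return a list of (selectors, [[property, value], ...]) tuples."""
    result = []
    for selector_text, rule_text in PAIR.findall(css):
        selectors = SELECTOR.findall(selector_text)
        declarations = [
            # Split once on the colon, so values containing ":" stay intact.
            [part.strip() for part in rule.split(":", 1)]
            for rule in RULE.findall(rule_text)
        ]
        result.append((selectors, declarations))
    return result


# Short demo input (hypothetical); note the braces inside quoted strings.
demo = r"""
#test[src~='{a\'bc}'], .tester { opacity: 0; background: url("ab{c}.jpg") }
"""
pprint(parse_css(demo))
```

Comments and at-rules are not handled by this sketch; a real CSS parsing library is still the safer choice for arbitrary stylesheets.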

For this sample CSS:

body, p#abc, #cde, a img .fgh, * {
  font-size: normal; background-color: white !important;

  -webkit-box-shadow: none
}

#test[src~='{a\'bc}'], .tester {
  -webkit-transition: opacity 0.35s linear;
  background: white !important url("abc\"cd'{e}.jpg");
  border-radius: 20px;
  opacity: 0;
  -webkit-box-shadow: rgba(0, 0, 0, 0.6) 0px 0px 18px;
}

span {display: block;} .nothing{}

…we get (spaced for clarity):

[(['body',
   'p#abc',
   '#cde',
   'a img .fgh',
   '*'],
  [['font-size', 'normal'],
   ['background-color', 'white !important'],
   ['-webkit-box-shadow', 'none']]),
 (["#test[src~='{a\\'bc}']",
   '.tester'],
  [['-webkit-transition', 'opacity 0.35s linear'],
   ['background', 'white !important url("abc\\"cd\'{e}.jpg")'],
   ['border-radius', '20px'],
   ['opacity', '0'],
   ['-webkit-box-shadow', 'rgba(0, 0, 0, 0.6) 0px 0px 18px']]),
 (['span'],
  [['display', 'block']]),
 (['.nothing'],
  [])]

As a simple exercise for the reader: write a regex to remove CSS comments (`/*…*/`).
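
One possible take on the comment-stripping exercise, as a sketch only: a lazy match between `/*` and `*/` with the dotall flag, so that comments may span lines. The function name `strip_comments` is made up here, and the sketch does not protect `/*` sequences that appear inside quoted strings.

```python
import re

# Lazy ".*?" plus re.DOTALL: a comment may span several lines, and each
# match stops at the first "*/".
COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)


def strip_comments(css):
    """Remove /* ... */ comments (quoted strings are not special-cased)."""
    return COMMENT.sub("", css)


css = "a { color: red; /* note */ }\n/* multi\n   line */\nb { }"
cleaned = strip_comments(css)
print(cleaned)
```

Running this before the parsing step keeps comment text from being picked up as part of a selector or a rule.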