Removing blank lines from a string in Ruby

ruby regex string


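I have looked at other similar questions, but they don't seem to address my problem.

My output currently looks like this, and I want to remove the blank lines from the string in Ruby: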

#    

CIRRUS LADIES NIGHT with DJ ROHIT

4th of JULY Party ft. DJ JASMEET @ I-Bar

Submerge Deep @ Pebble | Brute Force (Tuhin Mehta) | DJ Arpan (Opening)

Champagne Showers - DJs Panic & Nyth @ Blue Waves

THURSDAY PAST AND PRESENT @ Hint

I want my output to look like this:

CIRRUS LADIES NIGHT with DJ ROHIT
4th of JULY Party ft. DJ JASMEET @ I-Bar
Submerge Deep @ Pebble | Brute Force (Tuhin Mehta) | DJ Arpan (Opening)
Champagne Showers - DJs Panic & Nyth @ Blue Waves
THURSDAY PAST AND PRESENT @ Hint

I tried gsub(/^$\n/, ''), gsub(/\n/, ''), squeeze("\n") and delete!("\n"), but none of them worked.

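Also, I forgot to mention that my string starts with a blank line, i.e. there is an empty line before the first line of text, in case that changes anything.

Here is my string.inspect, as requested. The content of the string has changed, but the problem is still the same.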

string.inspect:

"\n\n\t\t\t\t\t\t\t\t\t"
"Tricky Tuesdays with DJ John @ Blend"
"\n\n\t\t\t\t\t\t\t\t\t"
"Bladder Buster Challenge with DJ Sean @ Star Rock"
"\n\n\t\t\t\t\t\t\t\t\t"
"Classic Rock Tuesday @ 10D - Chennai"
"\n\n\t\t\t\t\t\t\t\t\t"
"Vodka Night with DJ John @ Blend"
"\n\n\t\t\t\t\t\t\t\t\t"
"\"BOLLYWOOD WEDNESDAYS\" with DJ D Nash @ Candy Club"
"\n\n\t\t\t\t\t\t\t\t\t"
"RE - LAUNCH WEDNESDAY LADIES NIGHT @ ZODIAC"
"\n\n\t\t\t\t\t\t\t\t\t"
"Ladies Night @ 10 D - Chennai"
"\n\n\t\t\t\t\t\t\t\t\t"
"Wednesday Mayhem @ Dublin"
"\n\n\t\t\t\t\t\t\t\t\t"

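Try a regex that matches the empty lines and replace each match with an empty string.

Are you sure the line breaks are only \n? If not, try

/^\r?\n/

which also allows the \r\n line-break sequence.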
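As a rough illustration of the /^\r?\n/ suggestion, here is a minimal sketch. The helper name strip_empty_lines and the sample text are made up for this example. Note that this pattern only catches lines that are completely empty; lines holding spaces or tabs are left alone.

```ruby
# Hypothetical helper: drop lines that contain nothing but a line terminator.
def strip_empty_lines(text)
  # In Ruby, ^ matches at the start of every line, not only at the
  # start of the string, so leading empty lines are removed as well.
  text.gsub(/^\r?\n/, '')
end

# Made-up sample mixing Unix (\n) and Windows (\r\n) line endings.
sample = "\nfirst\n\nsecond\r\n\r\nthird\n"
p strip_empty_lines(sample)  # => "first\nsecond\r\nthird\n"
```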

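First, your code removes all newlines, not blank lines, which doesn't sound like what you want. Second, operating systems have historically disagreed on how to represent a newline: old Macs used \r, Linux and OS X use \n, and Windows uses the combination \r\n. So what you really want is to replace every run of consecutive \r and \n characters with a single \n.

Here is my solution: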

text.gsub(/\n+|\r+/, "\n").squeeze("\n").strip
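A small sketch of how the one-liner above behaves for each line-ending style. The wrapper name collapse_line_breaks and the sample strings are assumptions made for this illustration. Two side effects are worth knowing: the result always uses \n, and strip also removes any leading or trailing whitespace from the whole text.

```ruby
# Hypothetical wrapper around the one-liner from the answer.
def collapse_line_breaks(text)
  # Turn every run of \n or \r into "\n", squeeze the resulting
  # repeats into one, then trim the ends of the whole string.
  text.gsub(/\n+|\r+/, "\n").squeeze("\n").strip
end

["\r\n", "\n", "\r"].each do |marker|
  sample = "#{marker}a#{marker * 3}b#{marker}"
  puts "%-28s -> %s" % [sample.inspect, collapse_line_breaks(sample).inspect]
end
```

Each of the three samples should come out as "a\nb".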

This removes all consecutive blank lines:

result = s.squeeze("\r\n").gsub(/(\r\n)+/, "\r\n")

Or a command-line option that doesn't use Ruby:

grep -v "^$" <file>


Building on @Tom's answer, here is an ugly hack:

result = s.squeeze("\r\n").tap{ |s2| :go while s2.gsub!("\r\n\r\n","\r\n") }

It supports DOS (\r\n), Unix (\n) and Mac OS 9 and earlier (\r) line endings. Test:

[ "\r\n", "\n", "\r" ].each do |marker|
  1.upto(5) do |lines|
    s = "a#{marker*lines}b"
    tight = s.squeeze("\r\n").tap{ |s2| :go while s2.gsub!("\r\n\r\n","\r\n") }
    puts "%24s -> %s" % [s.inspect, tight.inspect]
  end
end
#=>                 "a\r\nb" -> "a\r\nb"
#=>             "a\r\n\r\nb" -> "a\r\nb"
#=>         "a\r\n\r\n\r\nb" -> "a\r\nb"
#=>     "a\r\n\r\n\r\n\r\nb" -> "a\r\nb"
#=> "a\r\n\r\n\r\n\r\n\r\nb" -> "a\r\nb"
#=>                   "a\nb" -> "a\nb"
#=>                 "a\n\nb" -> "a\nb"
#=>               "a\n\n\nb" -> "a\nb"
#=>             "a\n\n\n\nb" -> "a\nb"
#=>           "a\n\n\n\n\nb" -> "a\nb"
#=>                   "a\rb" -> "a\rb"
#=>                 "a\r\rb" -> "a\rb"
#=>               "a\r\r\rb" -> "a\rb"
#=>             "a\r\r\r\rb" -> "a\rb"
#=>           "a\r\r\r\r\rb" -> "a\rb"

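Note that this assumes your blank lines are truly empty and contain no whitespace. If they do contain whitespace, you can run a pre-pass first:

s.gsub(/^[ \t]+$/, '')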
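To show how the whitespace pre-pass combines with squeezing, here is a minimal sketch. It assumes Unix \n line endings only, because Ruby's $ anchor matches before \n but not before \r. The helper name tighten is hypothetical.

```ruby
# Hypothetical helper, assuming "\n" line endings only.
def tighten(text)
  # Pre-pass: empty out lines that hold only spaces or tabs,
  # then collapse the resulting runs of "\n" into a single "\n".
  text.gsub(/^[ \t]+$/, '').squeeze("\n")
end

p tighten("a\n \t\n\nb\n")  # => "a\nb\n"
```

Indented lines that carry text are untouched, since the pattern only matches lines made up entirely of spaces and tabs.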

.split(/\n/).reject { |l| l.chomp.empty? }.join("\n")

Unix-style line endings only:

.split(/\n/).reject(&:empty?).join("\n")

Also removes whitespace-only lines (Unix, Rails method):

.split(/\n/).reject(&:blank?).join("\n")
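Since blank? comes from ActiveSupport and is not available in plain Ruby, a rough equivalent for this purpose is strip.empty?. The sketch below assumes Unix line endings, and the helper name drop_blank_lines is made up for this example.

```ruby
# Hypothetical helper: plain-Ruby stand-in for reject(&:blank?).
def drop_blank_lines(text)
  text.split(/\n/)
      .reject { |line| line.strip.empty? }  # drops "" and whitespace-only lines
      .join("\n")
end

p drop_blank_lines("\n\na\n \t\nb\n\n")  # => "a\nb"
```

Because of the split and join, the result carries no trailing newline even when the input had one.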

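The remove_blank_lines method listed at the end of this page uses a single regex that removes all blank lines, including those at the beginning or end of the file and those containing only spaces or tabs, and it allows all three forms of line-ending marker (\r\n, \n and \r).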

This will do it:

.gsub(/(\n\s*\n)+/, "\n")

If needed, replace \n in the regex with [\n\r].
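A short sketch of both variants of this regex, under the assumption that \n is the desired line ending in the output. The helper names are hypothetical. Be aware that the [\n\r] variant also rewrites an ordinary \r\n line break as \n, because the pair itself matches the pattern.

```ruby
# Hypothetical helper: collapse runs of blank lines (Unix endings).
def collapse_blank_runs(text)
  text.gsub(/(\n\s*\n)+/, "\n")
end

# Hypothetical helper: same idea with [\n\r] substituted for \n.
def collapse_blank_runs_any_eol(text)
  text.gsub(/([\n\r]\s*[\n\r])+/, "\n")
end

p collapse_blank_runs("a\n\n \t\n\nb\nc")    # => "a\nb\nc"
p collapse_blank_runs_any_eol("a\r\n\r\nb")  # => "a\nb"
```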

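Could you replace "\n\n" with "\n"? Or better, "\n+" with "\n"?

Yes, I tried gsub("\n+", "") and gsub(/\n\n/, "\n"); neither of them worked.

@arvindravi Please post the result of calling .inspect on your string.

The .inspect result you posted doesn't look like what you usually get from inspecting a single string. Could you post the result of calling .class on your object?

@ebeland Thanks for pointing that out. Sorry for the trouble, everyone, I was being ignorant and stupid. I can't believe I overlooked that they were different strings! My bad!

OS: Linux, Fedora 17. I tried gsub /\r+/, "\r" and gsub /\n+/, "\n", but still no luck.

Aren't line endings automatically converted to \n when a file is opened in text mode?

@MatheusMoreira It's not a file. I'm building a scraper that produces strings depending on the page, so I just want to get rid of those blank/empty lines.

@arvindravi, are you sure you are dealing with \n newlines and not markup tags?

@arvindravi Windows uses \r\n for every new line, I believe, so the pattern has to accommodate any of these possibilities: gsub /[\r\n]+/, "\n" would work, since it covers all three ways a newline can be represented.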
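The comments above suggest that the posted inspect output came from several separate strings rather than one string. If that is the case, no newline regex on a single string can help, and the blank fragments have to be filtered out instead. The following is only a sketch under that assumption; the array name fragments and the helper join_non_blank are made up, and the sample values are taken from the inspect output in the question.

```ruby
# Hypothetical input: separate strings, as the inspect output suggests.
fragments = [
  "\n\n\t\t\t\t\t\t\t\t\t",
  "Tricky Tuesdays with DJ John @ Blend",
  "\n\n\t\t\t\t\t\t\t\t\t",
  "Bladder Buster Challenge with DJ Sean @ Star Rock",
  "\n\n\t\t\t\t\t\t\t\t\t"
]

# Hypothetical helper: trim each fragment, drop the ones that end up
# empty, and join what is left with single newlines.
def join_non_blank(strings)
  strings.map(&:strip).reject(&:empty?).join("\n")
end

puts join_non_blank(fragments)
```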

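"a\r\n\r\n\r\nb".squeeze("\r\n").gsub("\r\n\r\n", "\r\n") #=> "a\r\n\r\nb"

This is because gsub does not include the replacement results when it continues searching. You need something like

result = s.squeeze(...).tap{ |s2| :go while s2.gsub!("\r\n\r\n","\r\n") }

OK, interesting, I didn't know that. So you mean my solution doesn't handle Windows line endings. Whoops! Yes, logic inversion :)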
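The point about gsub not rescanning its own replacements can be seen in a small sketch. The helper names collapse_once and collapse_fully are hypothetical; the loop relies on gsub! returning nil when it made no substitution.

```ruby
# Hypothetical helper: a single pass, which never revisits replaced text.
def collapse_once(str)
  str.gsub("\r\n\r\n", "\r\n")
end

# Hypothetical helper: repeat until gsub! reports no change (returns nil).
def collapse_fully(str)
  result = str.dup
  nil while result.gsub!("\r\n\r\n", "\r\n")
  result
end

s = "a\r\n\r\n\r\nb"
p collapse_once(s)   # => "a\r\n\r\nb"
p collapse_fully(s)  # => "a\r\nb"
```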

def remove_blank_lines( str, line_ending="\n" )
  str.gsub(/(?<=\A|#{line_ending})[ \t]*(?:#{line_ending}|\z)/,'')
end

[ "\r\n", "\n", "\r" ].each do |marker|
  puts '='*70, "Lines ending with: #{marker.inspect}", '='*70
  [ "", " ", "\t", " \t", "\t " ].each do |whitespace|
    0.upto(2) do |lines|
      blank_lines = "#{whitespace}#{marker*lines}"
      s = "#{marker*lines}a#{marker*lines}b#{blank_lines}c#{blank_lines}"
      tight = remove_blank_lines(s, marker)
      puts "%43s -> %s" % [s.inspect, tight.inspect]
    end
  end
end

#=> ======================================================================
#=> Lines ending with: "\r\n"
#=> ======================================================================
#=>                                       "abc" -> "abc"
#=>                       "\r\na\r\nb\r\nc\r\n" -> "a\r\nb\r\nc\r\n"
#=>       "\r\n\r\na\r\n\r\nb\r\n\r\nc\r\n\r\n" -> "a\r\nb\r\nc\r\n"
#=>                                     "ab c " -> "ab c "
#=>                     "\r\na\r\nb \r\nc \r\n" -> "a\r\nb \r\nc \r\n"
#=>     "\r\n\r\na\r\n\r\nb \r\n\r\nc \r\n\r\n" -> "a\r\nb \r\nc \r\n"
#=>                                   "ab\tc\t" -> "ab\tc\t"
#=>                   "\r\na\r\nb\t\r\nc\t\r\n" -> "a\r\nb\t\r\nc\t\r\n"
#=>   "\r\n\r\na\r\n\r\nb\t\r\n\r\nc\t\r\n\r\n" -> "a\r\nb\t\r\nc\t\r\n"
#=>                                 "ab \tc \t" -> "ab \tc \t"
#=>                 "\r\na\r\nb \t\r\nc \t\r\n" -> "a\r\nb \t\r\nc \t\r\n"
#=> "\r\n\r\na\r\n\r\nb \t\r\n\r\nc \t\r\n\r\n" -> "a\r\nb \t\r\nc \t\r\n"
#=>                                 "ab\t c\t " -> "ab\t c\t "
#=>                 "\r\na\r\nb\t \r\nc\t \r\n" -> "a\r\nb\t \r\nc\t \r\n"
#=> "\r\n\r\na\r\n\r\nb\t \r\n\r\nc\t \r\n\r\n" -> "a\r\nb\t \r\nc\t \r\n"
#=> ======================================================================
#=> Lines ending with: "\n"
#=> ======================================================================
#=>                                       "abc" -> "abc"
#=>                               "\na\nb\nc\n" -> "a\nb\nc\n"
#=>                       "\n\na\n\nb\n\nc\n\n" -> "a\nb\nc\n"
#=>                                     "ab c " -> "ab c "
#=>                             "\na\nb \nc \n" -> "a\nb \nc \n"
#=>                     "\n\na\n\nb \n\nc \n\n" -> "a\nb \nc \n"
#=>                                   "ab\tc\t" -> "ab\tc\t"
#=>                           "\na\nb\t\nc\t\n" -> "a\nb\t\nc\t\n"
#=>                   "\n\na\n\nb\t\n\nc\t\n\n" -> "a\nb\t\nc\t\n"
#=>                                 "ab \tc \t" -> "ab \tc \t"
#=>                         "\na\nb \t\nc \t\n" -> "a\nb \t\nc \t\n"
#=>                 "\n\na\n\nb \t\n\nc \t\n\n" -> "a\nb \t\nc \t\n"
#=>                                 "ab\t c\t " -> "ab\t c\t "
#=>                         "\na\nb\t \nc\t \n" -> "a\nb\t \nc\t \n"
#=>                 "\n\na\n\nb\t \n\nc\t \n\n" -> "a\nb\t \nc\t \n"
#=> ======================================================================
#=> Lines ending with: "\r"
#=> ======================================================================
#=>                                       "abc" -> "abc"
#=>                               "\ra\rb\rc\r" -> "a\rb\rc\r"
#=>                       "\r\ra\r\rb\r\rc\r\r" -> "a\rb\rc\r"
#=>                                     "ab c " -> "ab c "
#=>                             "\ra\rb \rc \r" -> "a\rb \rc \r"
#=>                     "\r\ra\r\rb \r\rc \r\r" -> "a\rb \rc \r"
#=>                                   "ab\tc\t" -> "ab\tc\t"
#=>                           "\ra\rb\t\rc\t\r" -> "a\rb\t\rc\t\r"
#=>                   "\r\ra\r\rb\t\r\rc\t\r\r" -> "a\rb\t\rc\t\r"
#=>                                 "ab \tc \t" -> "ab \tc \t"
#=>                         "\ra\rb \t\rc \t\r" -> "a\rb \t\rc \t\r"
#=>                 "\r\ra\r\rb \t\r\rc \t\r\r" -> "a\rb \t\rc \t\r"
#=>                                 "ab\t c\t " -> "ab\t c\t "
#=>                         "\ra\rb\t \rc\t \r" -> "a\rb\t \rc\t \r"
#=>                 "\r\ra\r\rb\t \r\rc\t \r\r" -> "a\rb\t \rc\t \r"