Ruby has no built-in "substring between (start, end)" function for strings; what should I use?

I have a very complex string, for example:

<p>aaa <font style="color:red">ABCD@@@EFG^&*))*T*^[][][]</p>
<p>bbb <font style="color:red">ABCD@@@EFG^&*))*T*^[][][]</p>
<p>ccc <font style="color:red">ABCD@@@EFG^&*))*T*^[][][]</p>
....

Is there a method for this? If not, what is the best way to do it?

Use strip_tags (a Rails view helper):

string = '<p>Hi</p>'
strip_tags(string) # will return "Hi"

I think you will have to build the function yourself. For example:

def substrings_between str, opening, ending
  i_opening = str.index opening
  i_ending = str.index ending
  res = []
  while i_opening && i_ending
    res << str[i_opening+opening.length .. i_ending]
    str = str[i_ending+ending.length .. -1]
    i_opening = str.index opening
    i_ending = str.index ending
  end
  res
end
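
A comment on this answer reports that calling the method above with "abcba", "b", "a" returns ["", "cba"]. The ending delimiter is searched from the start of the string rather than after the opening one, and the inclusive range `..` also keeps the first character of the ending delimiter. Below is a minimal corrected sketch, assuming the intended result is the text strictly between each opening/ending pair. The name `substrings_between_fixed` is made up here to keep it apart from the original.

```ruby
# Hypothetical corrected variant of the answer's method (the name is illustrative).
def substrings_between_fixed(str, opening, ending)
  res = []
  pos = 0
  # Find each opening delimiter, then the first ending delimiter after it.
  while (i_opening = str.index(opening, pos))
    content_start = i_opening + opening.length
    i_ending = str.index(ending, content_start)
    break unless i_ending
    # Exclusive range: stop just before the ending delimiter.
    res << str[content_start...i_ending]
    pos = i_ending + ending.length
  end
  res
end

p substrings_between_fixed("abcba", "b", "a") # prints ["cb"]
```

Unlike the original, this version does not reassign `str`; it only moves a search position forward.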

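I think the function you are looking for may be too specific to be part of the Ruby distribution.

We could perhaps use

String#index(string, offset)

and then write something like this (extending String):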

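class String
  # Returns every substring found between start_delim and end_delim.
  def delimited_strings(start_delim, end_delim)
    strings = []
    starts_at = index(start_delim)
    return strings unless starts_at
    ends_at = index(end_delim, starts_at + start_delim.size)
    while starts_at && ends_at
      strings << self[(starts_at + start_delim.size)...ends_at]
      starts_at = index(start_delim, ends_at + end_delim.size)
      ends_at = starts_at && index(end_delim, starts_at + start_delim.size)
    end
    strings
  end
end

Result for the question's example: ["aaa", "bbb", "ccc"]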
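
To illustrate why the second argument of `String#index` matters here: it makes the search start at the given position, so the ending delimiter is only looked for after the opening one. A small sketch (the sample string is borrowed from a comment in this thread):

```ruby
s = "abcba"

p s.index("a")    # 0, the first "a" anywhere in the string
p s.index("a", 2) # 4, the first "a" at or after position 2
p s.index("b", 2) # 3
p s.index("z")    # nil when there is no match
```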

The following method will do the job:

def substring_between(target, match1, match2)
  start_match1 = target.index(match1)
  if start_match1 && start_match2 = target.index(match2, start_match1 + match1.length)
    start_idx = start_match1 + match1.length
    target[start_idx, start_match2 - start_idx]
  else
    nil
  end
end

If you would rather have an instance method on the String class, this should work for you:

class String
  def substring_between(sub1, sub2)
    match1 = self.index(sub1)
    if match1 && match2 = self.index(sub2, match1 + sub1.length)
      idx = match1 + sub1.length
      self[idx, match2 - idx]
    else
      nil
    end
  end
end

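Both implementations return nil if the start or end tag is missing, or if the tags are in the wrong order. The following test script exercises both versions: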

strings = [
'No tags at all',
'<font End tag before start tag <p>',
'<p>End tag at end <font',
'No start tag <font',
'<p>No end tag',
'<p>aaa <font style="color:red">ABCD@@@EFG^&*))*T*^[][][]</p>',
'    <p>bbb <font style="color:red">ABCD@@@EFG^&*))*T*^[][][]</p>',
'<p>ccc     cccc<font style="color:red">ABCD@@@EFG^&*))*T*^[][][]</p>'
]

strings.each do |s|
  puts "Method Test = #{s} Result: |#{substring_between(s, '<p>', '<font')}|"
  puts "String Test = #{s} Result: |#{s.substring_between('<p>', '<font')}|"
end
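
The script's output is not reproduced in this thread. As a sketch of what the standalone method returns, here is the same logic (restated slightly more compactly) checked against a few of the test strings. The expected values in the comments were worked out by hand from the code, not copied from the original post.

```ruby
# Same behaviour as the answer's substring_between, restated compactly.
def substring_between(target, match1, match2)
  start_match1 = target.index(match1)
  return nil unless start_match1
  start_idx = start_match1 + match1.length
  start_match2 = target.index(match2, start_idx)
  start_match2 && target[start_idx, start_match2 - start_idx]
end

p substring_between('No tags at all', '<p>', '<font')                     # nil
p substring_between('<font End tag before start tag <p>', '<p>', '<font') # nil
p substring_between('<p>No end tag', '<p>', '<font')                      # nil
p substring_between('<p>aaa <font style="color:red">', '<p>', '<font')    # "aaa "
```

Note that only the first match is returned, and that whitespace before the end marker is kept (hence "aaa " with a trailing space).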

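Ideally, you should parse the HTML with a proper HTML parser.

That said, if you are sure the content you want sits between two hard-coded strings, you can use scan with a regular expression:
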
string = '<p>aaa <font style="color:red">ABCD@@@EFG^&*))*T*^[][][]</p>
          <p>bbb <font style="color:red">ABCD@@@EFG^&*))*T*^[][][]</p>
          <p>ccc <font style="color:red">ABCD@@@EFG^&*))*T*^[][][]</p>'

before = Regexp.escape '<p>'
after  = Regexp.escape ' <font style="color:red">ABCD@@@EFG^&*))*T*^[][][]</p>'

substrings = string.scan(/#{before}(.*?)#{after}/).flatten
 => ["aaa", "bbb", "ccc"] 
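
The same idea can be wrapped in a small helper that takes the two delimiters as arguments. This is only a sketch and not part of the original answer: the name `scan_between` is invented, and the `m` flag is added on the assumption that the wanted text may span several lines (without it, `.` does not match a newline).

```ruby
# Hypothetical helper: all substrings found between `before` and `after`.
def scan_between(string, before, after)
  # Escape both delimiters so characters like ^ [ ] * are matched literally.
  pattern = /#{Regexp.escape(before)}(.*?)#{Regexp.escape(after)}/m
  string.scan(pattern).flatten
end

html = "<p>aaa <font style=\"color:red\">X</p>\n<p>bbb <font style=\"color:red\">Y</p>"
p scan_between(html, '<p>', ' <font') # prints ["aaa", "bbb"]
```

The non-greedy `.*?` is what keeps each match from running on to the last occurrence of the ending delimiter.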

You are using the font tag, which should have been dead and buried long ago, inline styles instead of CSS classes, and a misspelled style attribute, all on the same line?

What I am trying to do is get some information from a web page, and that is what the page contains. @Mike, thanks for your edit. My typo :)

Thanks. My task is much harder than this, so what I want is not just to strip the tags but to get the substrings between some keywords.

Oh! I forgot about Regexp#escape!

Thanks for the detailed answer; you taught me some useful techniques. What I wanted here was "substrings between", but thanks all the same.

Thanks. There is a small bug: i_ending = str.index ending should be str.index ending, i_opening+opening.length

@Freewind, what do you mean? The code seems to work for me, and changing i_ending = str.index ending to str.index ending, i_opening+opening.length produces an error (I don't understand your intent).

Please try substrings_between "abcba", "b", "a". The result is ["", "cba"]. I think the correct result should be ["cb"].