Javascript (jQuery): remove the last sentence of a long text

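I am looking for a JavaScript function that is smart enough to remove the last sentence of a long chunk of text (really a single paragraph). Some example text to show the complexity: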

<p>Blabla, some more text here. Sometimes <span>basic</span> html code is used but that should not make the "selection" of the sentence any harder! I looked up the window and I saw a plane flying over. I asked the first thing that came to mind: "What is it doing up there?" She did not know, "I think we should move past the fence!", she quickly said. He later described it as: "Something insane."</p>
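How can this be done? What would be the right algorithm?

Edit: by "long text" I mean all the content of my paragraph, and by "sentence" I mean an actual sentence (not a line), so in my example the last sentence is:
He later described it as: "Something insane."
Once that one is removed, the next one is:
She did not know, "I think we should move past the fence!", she quickly said.

That is a good example. Why not create a temporary variable, convert every "!" and "?" into ".", split that temporary variable, remove the last sentence, join the temporary array back into a string, and take its length? Then take the substring of the original paragraph up to that length.

Define the rules:
// 1. A sentence starts with a capital letter
// 2. A sentence is preceded by nothing or by [.!?], but not by [,:;]
// 3. A sentence may be preceded by a quote if the text is not properly formatted, e.g. ["]
// 4. In that case the sentence may be detected wrongly if the word after the quote is a name

Are there any other rules?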
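The temporary-variable idea from the comment above (turn "!" and "?" into ".", split, drop the last piece, measure, then cut the original) could look roughly like the sketch below. It assumes plain text without markup, and `removeLastSentenceNaive` is a hypothetical name used only for illustration. Because "!" and "?" are replaced one-for-one, the normalized copy has the same length as the original, so a length measured on the copy can be used directly on the original.

```javascript
// Naive sketch: every '.', '!' or '?' is treated as a sentence end.
function removeLastSentenceNaive(text) {
    // same length as the original, only the terminators are unified
    var normalized = text.replace(/[!?]/g, '.');
    var parts = normalized.split('.');

    // drop trailing fragments that hold no letters or digits
    // (the empty string after the final '.', or a lone closing quote)
    while (parts.length > 0 && !/[A-Za-z0-9]/.test(parts[parts.length - 1])) {
        parts.pop();
    }

    parts.pop(); // this fragment is the last sentence
    if (parts.length === 0) {
        return '';
    }

    // +1 keeps the terminator that closes the remaining text
    var keep = parts.join('.').length + 1;
    return text.substring(0, keep);
}

console.log(removeLastSentenceNaive('One. Two! Three?')); // One. Two!
```

This also shows why the question is harder than it looks: a terminator inside a quotation, such as the "!" in "fence!", is treated as a sentence end by this approach.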

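Determine your purpose:
// 1. Remove the last sentence

Assumptions: if you start at the last character of the text string and work backwards, you can identify the start of a sentence as follows:
1. The text right before it ends in [.?!], or
2. The text right before it ends in ["] and is followed by a capital letter
3. Every [.] is followed by a space
4. We are not correcting for html tags
5. These assumptions are not robust and will need to be adjusted regularly

Possible solution: read in your string and split it on the space character, so that we can look at the chunks of the string in reverse order.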

var originalString  = $('#this-paragraph').html();
var characterGroups = originalString.split(' ').reverse();
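If your string is:

Blabla, some more text here. Sometimes <span>basic</span> html code is used but that should not make the "selection" of the sentence any harder! I looked up the window and I saw a plane flying over. I asked the first thing that came to mind: "What is it doing up there?" She did not know, "I think we should move past the fence!", she quickly said. He later described it as: "Something insane."

Note: tags such as <span> and other markup can be stripped by using jQuery's .text() method instead of .html()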

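Every chunk is followed by a space, so once we have identified where the last sentence starts (by array index), we also know which space separates it from the text before it, and we can cut the original string at that space.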

console.log(characterGroups[index]); // He at index=6
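To make the reversed array concrete, here is a small sketch on a shortened sample (the variable names `sample` and `groups` are illustrative only). After reversing, index 0 holds the final word of the text, and the first word of the last sentence sits right before the chunk that closes the previous sentence.

```javascript
var sample = 'She quickly said. He later described it as: "Something insane."';
var groups = sample.split(' ').reverse();

// groups[0] is the final word of the text,
// groups[6] is the first word of the last sentence,
// groups[7] is the chunk that ends the previous sentence
var index = 6;
console.log(groups[index]);     // He
console.log(groups[index + 1]); // said.
```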
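Give ourselves a variable to flag whether we have found it, and a variable to hold the index of the array element that we identify as holding the start of the last sentence: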

var found = false;
var index = null;
Loop through the array looking for elements that end in [.!?], or that end in " where the previous element started with a capital letter.

var position     = 1, // skip the first one since we know that's the end anyway
    elements     = characterGroups.length,
    element      = null,
    prevHadUpper = false,
    last         = null,
    lookFor      = ''; // declared here so it does not leak as an implicit global

while(!found && position < elements) {
    element = characterGroups[position].split('');

    if(element.length > 0) {
       last = element[element.length-1];

       // test last character rule
       if(
          last=='.'                      // ends in '.'
          || last=='!'                   // ends in '!'
          || last=='?'                   // ends in '?'
          || (last=='"' && prevHadUpper) // ends in '"' and previous started [A-Z]
       ) {
          found = true;
          index = position-1;
          lookFor = last+' '+characterGroups[position-1];
       } else {
          if(element[0] == element[0].toUpperCase()) {
             prevHadUpper = true;
          } else {
             prevHadUpper = false;
          }
       }
    } else {
       prevHadUpper = false;
    }
    position++;
}
Now you can run your earlier string through it:

var trimPosition = originalString.lastIndexOf(lookFor)+1;
var updatedString = originalString.substr(0,trimPosition);
console.log(updatedString);

// Blabla, some more text here. Sometimes <span>basic</span> html code is used but that should not make the "selection" of the sentence any harder! I looked up the window and I saw a plane flying over. I asked the first thing that came to mind: "What is it doing up there?" She did not know, "I think we should move past the fence!", she quickly said.
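Run it again and get: Blabla, some more text here. Sometimes <span>basic</span> html code is used but that should not make the "selection" of the sentence any harder! I looked up the window and I saw a plane flying over. I asked the first thing that came to mind: "What is it doing up there?"

Run it again and get: Blabla, some more text here. Sometimes <span>basic</span> html code is used but that should not make the "selection" of the sentence any harder! I looked up the window and I saw a plane flying over.

Run it again and get: Blabla, some more text here. Sometimes <span>basic</span> html code is used but that should not make the "selection" of the sentence any harder!

Run it again and get: Blabla, some more text here.

Run it again and get: Blabla, some more text here. (Unchanged, because no earlier sentence terminator is left to cut at.)

So, I think this matches what you are looking for.

As a function: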

function trimSentence(string){
    var found = false;
    var index = null;

    var characterGroups = string.split(' ').reverse();

    var position     = 1,//skip the first one since we know that's the end anyway
        elements     = characterGroups.length,
        element      = null,
        prevHadUpper = false,
        last         = null,
        lookFor      = '';

    while(!found && position < elements) {
        element = characterGroups[position].split('');

        if(element.length > 0) {
           last = element[element.length-1];

           // test last character rule
           if(
              last=='.' ||                // ends in '.'
              last=='!' ||                // ends in '!'
              last=='?' ||                // ends in '?'
              (last=='"' && prevHadUpper) // ends in '"' and previous started [A-Z]
           ) {
              found = true;
              index = position-1;
              lookFor = last+' '+characterGroups[position-1];
           } else {
              if(element[0] == element[0].toUpperCase()) {
                 prevHadUpper = true;
              } else {
                 prevHadUpper = false;
              }
           }
        } else {
           prevHadUpper = false;
        }
        position++;
    }


    var trimPosition = string.lastIndexOf(lookFor)+1;
    return string.substr(0,trimPosition);
}
Making a plugin out of it would be simple if needed, but beware of the ASSUMPTIONS! :)
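As a rough idea of what such a plugin might look like, here is a minimal sketch. The plugin name `trimLastSentence` is hypothetical, and `trimSentence` is restated compactly (same algorithm as the function above) only so that the snippet runs on its own. The plugin is registered only when jQuery is present, so the snippet can also be loaded outside a browser.

```javascript
// Compact restatement of trimSentence so the sketch is self-contained.
function trimSentence(string) {
    var groups = string.split(' ').reverse();
    var prevHadUpper = false;
    var lookFor = '';

    // skip index 0: it is the end of the text anyway
    for (var position = 1; position < groups.length; position++) {
        var word = groups[position];
        if (word.length === 0) {
            prevHadUpper = false;
            continue;
        }
        var last = word.charAt(word.length - 1);
        if (last === '.' || last === '!' || last === '?' ||
                (last === '"' && prevHadUpper)) {
            // terminator + space + first word of the last sentence
            lookFor = last + ' ' + groups[position - 1];
            break;
        }
        prevHadUpper = word.charAt(0) === word.charAt(0).toUpperCase();
    }

    // if nothing was found, lookFor is '' and the whole string is returned
    var trimPosition = string.lastIndexOf(lookFor) + 1;
    return string.substring(0, trimPosition);
}

// Hypothetical plugin wrapper around trimSentence.
if (typeof jQuery !== 'undefined') {
    jQuery.fn.trimLastSentence = function () {
        return this.each(function () {
            var $el = jQuery(this);
            $el.html(trimSentence($el.html()));
        });
    };
}
```

In a page that loads jQuery, it would be called as `$('#this-paragraph').trimLastSentence();`, and the same assumptions about terminators and spaces apply.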

Does this help?

Thanks, AE

This should do it.

/*
Assumptions:
- Sentence separators are a combination of terminators (.!?) + doublequote (optional) + spaces + capital letter. 
- I haven't preserved tags if it gets down to removing the last sentence. 
*/
function removeLastSentence(text) {

    // index of the last sentence terminator in the text
    var lastSeparator = Math.max(
        text.lastIndexOf("."),
        text.lastIndexOf("!"),
        text.lastIndexOf("?")
    );

    // search the reversed text, so the first match is the last sentence boundary
    var revtext = text.split('').reverse().join('');
    var sep = revtext.search(/[A-Z]\s+(\")?[\.\!\?]/);
    // start of the last closing tag ("</" reads "/<" in the reversed text)
    var lastTag = text.length - revtext.search(/\/\</) - 2;

    var lastPtr = (lastTag > lastSeparator) ? lastTag : text.length;
    var sWithoutLastSentence;

    if (sep > -1) {
        var text1 = revtext.substring(sep+1, revtext.length).trim().split('').reverse().join('');
        var text2 = text.substring(lastPtr, text.length).replace(/['"]/g,'').trim();

        sWithoutLastSentence = text1 + text2;
    } else {
        sWithoutLastSentence = '';
    }
    return sWithoutLastSentence;
}

/*
TESTS: 

var text = '<p>Blabla, some more text here. Sometimes <span>basic</span> html code is used but that should not make the "selection" of the text any harder! I looked up the window and I saw a plane flying over. I asked the first thing that came to mind: "What is it doing up there?" She did not know, "I think we should move past the fence!", she quickly said. He later described it as: "Something insane. "</p>';

alert(text + '\n\n' + removeLastSentence(text));
alert(text + '\n\n' + removeLastSentence(removeLastSentence(text)));
alert(text + '\n\n' + removeLastSentence(removeLastSentence(removeLastSentence(text))));
alert(text + '\n\n' + removeLastSentence(removeLastSentence(removeLastSentence(removeLastSentence(text)))));
alert(text + '\n\n' + removeLastSentence(removeLastSentence(removeLastSentence(removeLastSentence(removeLastSentence(text))))));
alert(text + '\n\n' + removeLastSentence(removeLastSentence(removeLastSentence(removeLastSentence(removeLastSentence(removeLastSentence(text)))))));
alert(text + '\n\n' + removeLastSentence('<p>Blabla, some more text here. Sometimes <span>basic</span> html code is used but that should not make the "selection" of the text any harder! I looked up the '));
*/
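The separator assumption stated at the top of this answer (terminator, optional double quote, spaces, capital letter) can also be applied going forward with a single regular expression, without reversing the string. The following is a minimal sketch under that same assumption. `cutAtLastSeparator` is a hypothetical name, not part of the answer. Like the answer, it returns an empty string when no separator is found, and it makes no attempt to keep trailing tags.

```javascript
function cutAtLastSeparator(text) {
    // terminator, optional closing double quote, then (not consumed)
    // whitespace followed by a capital letter
    var separator = /[.!?]"?(?=\s+[A-Z])/g;
    var cut = -1;
    var match;

    // remember the end of the last separator found
    while ((match = separator.exec(text)) !== null) {
        cut = match.index + match[0].length;
    }

    return cut === -1 ? '' : text.substring(0, cut);
}

var sample = 'She quickly said. He later described it as: "Something insane."';
console.log(cutAtLastSeparator(sample)); // She quickly said.
```

The lookahead keeps the whitespace and the capital letter out of the match, so the cut lands directly after the terminator or the closing quote.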