A PHP substr() function that lets you set a start and stop point and keeps the HTML formatting?


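With the ordinary substr() function in PHP you can decide where to start cutting the string as well as set a length. The length is probably what gets used most, but in this case I need to cut about 120 characters off the beginning. The problem is that I need to keep the HTML in the string intact and cut only the actual text inside the tags.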

I have found a few custom functions for this, but none that lets you set a starting point, i.e. where in the string the cut should begin.

Here is one I found:

So I basically need a substr() function that works exactly like the original one, except that it preserves the formatting.

Any suggestions?

Example content to modify:

<p>Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old. Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur, from a Lorem Ipsum passage, and going <a href="#">through the cites</a> of the word in classical literature, discovered the undoubtable source. Lorem Ipsum comes from sections 1.10.32 and 1.10.33 of "de Finibus</p> <p>Bonorum et Malorum" (The Extremes of Good and Evil) by Cicero, written in 45 BC. This book is a treatise on the theory of ethics, very popular during the <strong>Renaissance</strong>. The first line of Lorem Ipsum, "Lorem ipsum dolor sit amet..", comes from a line in section 1.10.32.</p>

After cutting 5 characters off the beginning:

<p>ary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old. Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur, from a Lorem Ipsum passage, and going <a href="#">through the cites</a> of the word in classical literature, discovered the undoubtable source. Lorem Ipsum comes from sections 1.10.32 and 1.10.33 of "de Finibus</p> <p>Bonorum et Malorum" (The Extremes of Good and Evil) by Cicero, written in 45 BC. This book is a treatise on the theory of ethics, very popular during the <strong>Renaissance</strong>. The first line of Lorem Ipsum, "Lorem ipsum dolor sit amet..", comes from a line in section 1.10.32.</p>

With 5 characters cut off the beginning and 5 off the end:

<p>ary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old. Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur, from a Lorem Ipsum passage, and going <a href="#">through the cites</a> of the word in classical literature, discovered the undoubtable source. Lorem Ipsum comes from sections 1.10.32 and 1.10.33 of "de Finibus</p> <p>Bonorum et Malorum" (The Extremes of Good and Evil) by Cicero, written in 45 BC. This book is a treatise on the theory of ethics, very popular during the <strong>Renaissance</strong>. The first line of Lorem Ipsum, "Lorem ipsum dolor sit amet..", comes from a line in section 1.1</p>

Yeah, you get my drift?

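Ideally it would cut off the whole word when the cut point lands in the middle of one, but that isn't super important.
**Edit:** Fixed the quotes.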

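If the text isn't very long (because of the running time), you can try this. To quote the question:

"But in this case I need to cut about 120 characters off the beginning"

This does exactly that. Enter your text or grab it from somewhere, then enter the number of characters to remove from the beginning of the text.

Please note: this is a solution for short strings and not the best approach, but it is a complete working code example.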

<?php
$text = "<a href='blablabla'>m</a>ylinks...<b>not this code is working</b>......";
$newtext = "";
$delete = 13;       // number of text characters to drop from the beginning
$tagopen = false;   // true while we are inside a <...> tag

while ($text != "") {
    // take the first character off the input
    $checktag = $text[0];
    $text = substr($text, 1);

    if ($checktag == "<" || $tagopen == true) {
        // characters that belong to a tag are always copied through
        $newtext .= $checktag;
        $tagopen = ($checktag != ">");
    } elseif ($delete > 0) {
        // plain text that still has to be dropped
        $delete = $delete - 1;
    } else {
        $newtext .= $checktag;
    }
}
echo $newtext;
?>

It returns:

<a href='blablabla'></a><b> this code is working</b>......
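The same idea can be wrapped in a function that walks the string once by index instead of calling substr() on every character, which avoids the running-time problem on longer input. This is only a sketch: the name stripLeadingText and its parameters are made up for illustration, it counts bytes rather than multibyte characters, it counts an HTML entity such as &amp; as several characters, and it assumes no ">" appears inside an attribute value.

```php
<?php
// Hypothetical helper (not part of the answer above): drops $skip text
// characters from the start of $html and copies every tag through untouched.
function stripLeadingText(string $html, int $skip): string
{
    $out   = '';
    $inTag = false;
    $len   = strlen($html);

    for ($i = 0; $i < $len; $i++) {
        $ch = $html[$i];

        if ($inTag) {
            // Inside a tag: always keep the character.
            $out .= $ch;
            if ($ch === '>') {
                $inTag = false;
            }
        } elseif ($ch === '<') {
            // A tag starts here.
            $inTag = true;
            $out  .= $ch;
        } elseif ($skip > 0) {
            // Plain text that still has to be dropped.
            $skip--;
        } else {
            $out .= $ch;
        }
    }

    return $out;
}

echo stripLeadingText("<a href='blablabla'>m</a>ylinks...<b>not this code is working</b>......", 13);
```

Called with the same input and count as the script above, it produces the same string.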

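Here's a start, utilizing DOMDocument (an XML/HTML parser), RecursiveIteratorIterator (for easy traversal of recursive structures) and a custom iterator implementation for DOMNodeList, so that it plays nicely with RecursiveIteratorIterator.

It's still pretty sloppy (it doesn't return a copy, but acts on the DOMNode/DOMDocument it is given), and it lacks the fancy features of the regular substr(), such as negative values for $start and/or $length, but so far it seems to do the job. I'm sure there are bugs, though. It should give you an idea of how to go about this with DOMDocument.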
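To make the idea concrete: the parser splits the markup into element nodes and text nodes, and only the text nodes carry characters that count toward the offset. A minimal sketch, assuming an invented HTML snippet and using DOMXPath (which this answer itself does not use) to list the text nodes with their running offsets:

```php
<?php
$dom = new DOMDocument();
// Sample markup invented for this illustration.
$dom->loadHTML('<p>Just <a href="#">a link</a> and <strong>strong text</strong></p>');

$xpath  = new DOMXPath($dom);
$offset = 0;
$rows   = [];

// Every text node below <body>, in document order.
foreach ($xpath->query('//body//text()') as $textNode) {
    $rows[] = [$offset, $textNode->length, $textNode->data];
    $offset += $textNode->length;
}

// Print: start offset, length, content of each text node.
foreach ($rows as [$nodeStart, $nodeLength, $data]) {
    printf("%3d %3d  %s\n", $nodeStart, $nodeLength, var_export($data, true));
}
```

With this snippet, a start offset of 8 lands inside the second text node (the link text), so the first node would be emptied entirely and only the second one needs a partial deleteData() call.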

The custom iterators:

class DOMNodeListIterator
    implements Iterator
{
    protected $domNodeList;

    protected $position;

    public function __construct( DOMNodeList $domNodeList )
    {
        $this->domNodeList = $domNodeList;
        $this->rewind();
    }

    public function valid()
    {
        return $this->position < $this->domNodeList->length;
    }

    public function next()
    {
        $this->position++;
    }

    public function key()
    {
        return $this->position;
    }

    public function rewind()
    {
        $this->position = 0;
    }

    public function current()
    {
        return $this->domNodeList->item( $this->position );
    }
}

class RecursiveDOMNodeListIterator
    extends DOMNodeListIterator
    implements RecursiveIterator
{
    public function hasChildren()
    {
        return $this->current()->hasChildNodes();
    }

    public function getChildren()
    {
        return new self( $this->current()->childNodes );
    }
}
The actual function:

function DOMSubstr( DOMNode $domNode, $start = 0, $length = null )
{
    // strict null check, so that an explicit $length of 0 is not treated as "no length given"
    if( $start == 0 && ( $length === null || $length >= strlen( $domNode->nodeValue ) ) )
    {
        return;
    }

    $nodesToRemove = array();
    $rii = new RecursiveIteratorIterator( new RecursiveDOMNodeListIterator( $domNode->childNodes ), RecursiveIteratorIterator::SELF_FIRST );
    foreach( $rii as $node )
    {
        if( $start <= 0 && $length !== null && $length <= 0 )
        {
            /* can't remove immediately
             * because this will mess with
             * iterating over RecursiveIteratorIterator
             * so remember for removal, later on
             */
            $nodesToRemove[] = $node;
            continue;
        }

        if( $node->nodeType == XML_TEXT_NODE )
        {
            if( $start > 0 )
            {
                $count = min( $node->length, $start );
                $node->deleteData( 0, $count );
                $start -= $count;
            }

            if( $start <= 0 )
            {
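                if( $length === null )
                {
                    break;
                }
                else if( $length <= 0 )
                {
                    // nothing may be kept: empty what is left of this text node
                    $node->deleteData( 0, $node->length );
                    continue;
                }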
                else if( $length >= $node->length )
                {
                    $length -= $node->length;
                    continue;
                }
                else
                {
                    $node->deleteData( $length, $node->length - $length );
                    $length = 0;
                }
            }
        }
    }

    foreach( $nodesToRemove as $node )
    {
        $node->parentNode->removeChild( $node );
    }
}
Usage:

$html = <<<HTML
<p>Just a short text sample with <a href="#">a link</a> and some trailing elements such as <strong>strong text</strong>, <em>emphasized text</em>, <del>deleted text</del> and <ins>inserted text</ins></p>
HTML;

$dom = new DomDocument();
$dom->loadHTML( $html );
/*
 * this is particularly sloppy:
 * I pass $dom->firstChild->nextSibling->firstChild (i.e. <body>)
 * because the function uses strlen( $domNode->nodeValue )
 * which will be 0 for DOMDocument itself
 * and I didn't want to utilize DOMXPath in the function
 * but perhaps I should have
 */
DOMSubstr( $dom->firstChild->nextSibling->firstChild, 8, 25 );

/*
 * passing a specific node to DOMDocument::saveHTML()
 * only works with PHP >= 5.3.6
 */
echo $dom->saveHTML( $dom->firstChild->nextSibling->firstChild->firstChild );
<p> of "de Finibus</p> <p>Bonorum et Mal</p>
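One of the limitations mentioned above is that the function modifies the node it is given instead of returning a copy. A possible way around that is sketched below. It is not the answer's code: the name htmlSubstr, the wrapping div and the use of DOMXPath are all assumptions made for this illustration. It parses the string itself, trims only the text nodes (every tag is kept, even when it ends up empty), and assumes non-negative $start and $length and input that libxml can read without a charset hint (plain ASCII, for instance).

```php
<?php
// Hypothetical wrapper: takes an HTML fragment as a string and returns a new string.
function htmlSubstr(string $html, int $start = 0, ?int $length = null): string
{
    $dom = new DOMDocument();
    // Wrap the fragment in a container so it can be located after parsing.
    $dom->loadHTML('<div>' . $html . '</div>');
    $wrap = $dom->getElementsByTagName('div')->item(0);

    $xpath = new DOMXPath($dom);
    foreach ($xpath->query('.//text()', $wrap) as $text) {
        // Drop the part of this text node that lies before $start.
        $skip = min($start, $text->length);
        if ($skip > 0) {
            $text->deleteData(0, $skip);
            $start -= $skip;
        }

        if ($length !== null) {
            // Keep at most $length characters and drop the rest of the node.
            $keep = min($length, $text->length);
            if ($keep < $text->length) {
                $text->deleteData($keep, $text->length - $keep);
            }
            $length -= $keep;
        }
    }

    // Serialize the children of the container, not the container itself.
    $out = '';
    foreach ($wrap->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

echo htmlSubstr('<p>Contrary to <a href="#">popular</a> belief</p>', 5, 10);
```

Because the XPath query returns a snapshot of the text nodes, editing their contents inside the loop does not disturb the iteration, which is why no separate removal pass is needed here.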