Strip HTML tags from a string

How can I remove HTML tags from a string so that I can output clean text?

let str = string.stringByReplacingOccurrencesOfString("<[^>]+>", withString: "", options: .RegularExpressionSearch, range: nil)
print(str)

Hmm, I tried your function and it worked on a small example:

var string = "<!DOCTYPE html> <html> <body> <h1>My First Heading</h1> <p>My first paragraph.</p> </body> </html>"
let str = string.stringByReplacingOccurrencesOfString("<[^>]+>", withString: "", options: .RegularExpressionSearch, range: nil)
print(str)

//output "  My First Heading My first paragraph. "
Can you give an example where it fails?
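The output of the small example above still carries the whitespace that sat between the tags. A minimal sketch of tidying that up, assuming a hypothetical helper named strippingTagsAndExtraWhitespace (the name and the whitespace handling are illustrative, not part of the original answer):

```swift
import Foundation

extension String {
    // Hypothetical helper (not from the original answer): strip tags with the
    // same regex, then collapse whitespace runs and trim both ends.
    func strippingTagsAndExtraWhitespace() -> String {
        let noTags = replacingOccurrences(of: "<[^>]+>", with: "", options: .regularExpression, range: nil)
        let collapsed = noTags.replacingOccurrences(of: "\\s+", with: " ", options: .regularExpression, range: nil)
        return collapsed.trimmingCharacters(in: .whitespacesAndNewlines)
    }
}

let html = "<!DOCTYPE html> <html> <body> <h1>My First Heading</h1> <p>My first paragraph.</p> </body> </html>"
print(html.strippingTagsAndExtraWhitespace())
```

This only cleans up spacing; it does not make the regex any more robust against malformed or unusual HTML.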

Swift 4 and 5 version:

var string = "<!DOCTYPE html> <html> <body> <h1>My First Heading</h1> <p>My first paragraph.</p> </body> </html>"
let str = string.replacingOccurrences(of: "<[^>]+>", with: "", options: .regularExpression, range: nil)
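Because HTML is not a regular language (it is a context-free language), regular expressions cannot parse it reliably.

I would consider using NSAttributedString instead.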

let htmlString = "LCD Soundsystem was the musical project of producer <a href='http://www.last.fm/music/James+Murphy' class='bbcode_artist'>James Murphy</a>, co-founder of <a href='http://www.last.fm/tag/dance-punk' class='bbcode_tag' rel='tag'>dance-punk</a> label <a href='http://www.last.fm/label/DFA' class='bbcode_label'>DFA</a> Records. Formed in 2001 in New York City, New York, United States, the music of LCD Soundsystem can also be described as a mix of <a href='http://www.last.fm/tag/alternative%20dance' class='bbcode_tag' rel='tag'>alternative dance</a> and <a href='http://www.last.fm/tag/post%20punk' class='bbcode_tag' rel='tag'>post punk</a>, along with elements of <a href='http://www.last.fm/tag/disco' class='bbcode_tag' rel='tag'>disco</a> and other styles. <br />"    
let htmlStringData = htmlString.dataUsingEncoding(NSUTF8StringEncoding)!
let options: [String: AnyObject] = [NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType, NSCharacterEncodingDocumentAttribute: NSUTF8StringEncoding]
let attributedHTMLString = try! NSAttributedString(data: htmlStringData, options: options, documentAttributes: nil)
let string = attributedHTMLString.string

I use the following extension to remove specific HTML elements:

extension String {
    func deleteHTMLTag(tag:String) -> String {
        return self.stringByReplacingOccurrencesOfString("(?i)</?\(tag)\\b[^<]*>", withString: "", options: .RegularExpressionSearch, range: nil)
    }

    func deleteHTMLTags(tags:[String]) -> String {
        var mutableString = self
        for tag in tags {
            mutableString = mutableString.deleteHTMLTag(tag)
        }
        return mutableString
    }
}
Swift 4:

extension String {
    func deleteHTMLTag(tag:String) -> String {
        return self.replacingOccurrences(of: "(?i)</?\(tag)\\b[^<]*>", with: "", options: .regularExpression, range: nil)
    }

    func deleteHTMLTags(tags:[String]) -> String {
        var mutableString = self
        for tag in tags {
            mutableString = mutableString.deleteHTMLTag(tag: tag)
        }
        return mutableString
    }
}
Mohamed's solution, but as a String extension in Swift 4:

extension String {

    func stripOutHtml() -> String? {
        do {
            // Encode as UTF-8 so the data matches the .characterEncoding option below.
            guard let data = self.data(using: .utf8) else {
                return nil
            }
            let attributed = try NSAttributedString(data: data, options: [.documentType: NSAttributedString.DocumentType.html, .characterEncoding: String.Encoding.utf8.rawValue], documentAttributes: nil)
            return attributed.string
        } catch {
            return nil
        }
    }
}
Updated for Swift 4:

guard let htmlStringData = htmlString.data(using: .unicode) else { fatalError() }
let options: [NSAttributedString.DocumentReadingOptionKey: Any] = [
    .documentType: NSAttributedString.DocumentType.html,
    .characterEncoding: String.Encoding.unicode.rawValue
]
let attributedHTMLString = try! NSAttributedString(data: htmlStringData, options: options, documentAttributes: nil)
let string = attributedHTMLString.string

extension String {
    var htmlStripped: String {
        return self.replacingOccurrences(of: "<[^>]+>", with: "", options: .regularExpression, range: nil)
    }
}

Happy coding!

I prefer using a regular expression over the NSAttributedString HTML conversion. Be aware that the conversion is quite time-consuming and also has to run on the main thread.

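This is what did the trick for me: first I remove all inline CSS styles, then I remove all HTML tags. It is probably not as reliable as the NSAttributedString option, but it is much faster for my case.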

extension String {
    func withoutHtmlTags() -> String {
        let str = self.replacingOccurrences(of: "<style>[^>]+</style>", with: "", options: .regularExpression, range: nil)
        return str.replacingOccurrences(of: "<[^>]+>", with: "", options: .regularExpression, range: nil)
    }
}
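The style pattern above only matches a bare <style> tag whose CSS contains no ">" character. A hedged sketch of a more tolerant variant, assuming a hypothetical method name withoutStyleBlocksAndTags; it is still a regex heuristic, not an HTML parser:

```swift
import Foundation

extension String {
    // Hypothetical variant (not from the original answer): also matches
    // <style> tags that carry attributes and CSS that contains ">" (for
    // example child selectors). (?i) = case-insensitive, (?s) = dot matches newlines.
    func withoutStyleBlocksAndTags() -> String {
        let noStyle = replacingOccurrences(of: "(?is)<style\\b[^>]*>.*?</style>", with: "", options: .regularExpression, range: nil)
        return noStyle.replacingOccurrences(of: "<[^>]+>", with: "", options: .regularExpression, range: nil)
    }
}

let sample = "<style type=\"text/css\">div > p { color: red; }</style><p>Hello</p>"
print(sample.withoutStyleBlocksAndTags())
```

With the original pattern, the CSS text in this sample would be left behind in the result because the opening tag has an attribute.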

Swift 5

extension String {
    public func trimHTMLTags() -> String? {
        guard let htmlStringData = self.data(using: String.Encoding.utf8) else {
            return nil
        }
    
        let options: [NSAttributedString.DocumentReadingOptionKey : Any] = [
            .documentType: NSAttributedString.DocumentType.html,
            .characterEncoding: String.Encoding.utf8.rawValue
        ]
    
        let attributedString = try? NSAttributedString(data: htmlStringData, options: options, documentAttributes: nil)
        return attributedString?.string
    }
}
Usage:

let str = "my html <a href='https://www.google.com'>link text</a>"
print(str.trimHTMLTags() ?? "--") // "my html link text"

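For example, try the following HTML snippet:

Swift 3: string.replacingOccurrences(of: "<[^>]+>", with: "", options: String.CompareOptions.regularExpression, range: nil)

@Husam Thanks for the Swift 3 version, but in the text view I get things like p> and so on. Do you know why?

In Swift 4: string.replacingOccurrences(of: "<[^>]+>", with: "", options: .regularExpression, range: nil)

Led, this question is valuable, but in practice it is likely to be closed because you have not asked a clear question: this is a non-reproducible scenario. I suggest you rephrase your question. I don't want this question to be deleted.

lol stackoverflow… how was this closed as "off-topic"? It is the #1 Google result for "Swift remove html tags".

@canhazbits I know, right! Click "reopen" to nominate it for reopening.

Swift 3: string.replacingOccurrences(of: "<[^>]+>", with: "", options: .regularExpression, range: nil)

Mr Lister, is there a way to remove all the HTML tags and keep this?

This seems to be the cleanest approach, and it works very well! It is better to let the well-tested Foundation framework handle this for you than to write a flimsy parser yourself.

Clean!! let attributed = try NSAttributedString(data: htmlString.data(using: .unicode)!, options: [NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType], documentAttributes: nil) print(attributed.string)

Most people prefer to pick answers that are small and easy to understand.

Thanks for the solution! Is it possible to preserve spaces and line breaks when removing the HTML tags? Currently, all line breaks are dropped from the new string.

Just a warning if you use this: the HTML style conversion (attributed) is slow! A CoreText engineer at WWDC told me that this is no longer maintained and that he had completely forgotten about it.

Just a warning about the previous warning: let's look at some data before we throw out a method for being too "slow". You use a lot of C libraries (often without realizing it) that do not need much maintenance. That is not necessarily a bad thing.

Or you can use it like this: func deleteHTMLTag() -> String { return self.replacingOccurrences(of: "(?i)…

This regex does not remove the HTML code for me. Example string: "Cats like do". I did not investigate further why it does not work, but text.replacingOccurrences(of: "<[^>]+>", ...) works for my simple cases.

A "," is missing after the .documentType: parameter.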
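On the comment above about keeping line breaks: one possible approach with the regex route is to turn line-breaking tags into newlines before stripping everything else. A rough sketch, assuming a hypothetical helper named strippingTagsKeepingLineBreaks and an illustrative (incomplete) choice of block tags:

```swift
import Foundation

extension String {
    // Hypothetical helper (not from the thread): map <br> and the closing
    // </p>, </div>, </h1>...</h6> tags to "\n", then strip the remaining tags.
    func strippingTagsKeepingLineBreaks() -> String {
        let withBreaks = replacingOccurrences(of: "(?i)<br\\s*/?>|</(p|div|h[1-6])>", with: "\n", options: .regularExpression, range: nil)
        return withBreaks.replacingOccurrences(of: "<[^>]+>", with: "", options: .regularExpression, range: nil)
    }
}

let html = "<h1>Title</h1><p>First line<br />Second line</p>"
print(html.strippingTagsKeepingLineBreaks())
```

Which tags count as line breaks is a design choice; the list here is only a starting point and would need extending for lists, tables, and similar elements.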