Python: creating a censor function from a list of bad words

Tags: python, python-2.7

I'm trying to create a function that censors words in a string. It sort of works, but with a few quirks.

Here is my code:

def censor(sentence):
    badwords = 'apple orange banana'.split()
    sentence = sentence.split()

    for i in badwords:
        for words in sentence:
            if i in words:
                pos = sentence.index(words)
                sentence.remove(words)
                sentence.insert(pos, '*' * len(i))

    print " ".join(sentence)

sentence = "you are an appletini and apple. new sentence: an orange is a banana. orange test."

censor(sentence)
And the output:

you are an ***** and ***** new sentence: an ****** is a ****** ****** test.
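Some of the punctuation disappears, and the word "appletini" is replaced incorrectly.

How can I fix this?

Also, is there a simpler way to do this kind of thing?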

Try:

for i in bad_word_list:
    sentence = sentence.replace(i, '*' * len(i))
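
A minimal sketch of that loop wrapped in a function, assuming the bad-word list from the question (the name censor_replace and the default argument are made up for illustration). Because str.replace works on the whole string, the punctuation survives, but any substring match is still censored, so "appletini" becomes "*****tini":

```python
def censor_replace(sentence, bad_word_list=("apple", "orange", "banana")):
    # Replace every occurrence of each bad word, wherever it appears
    # in the string, with the same number of asterisks.
    for bad_word in bad_word_list:
        sentence = sentence.replace(bad_word, '*' * len(bad_word))
    return sentence


result = censor_replace("you are an appletini and apple. an orange is a banana.")
# result == "you are an *****tini and *****. an ****** is a ******."
```

The sketch avoids print so that it runs unchanged under both Python 2.7 and Python 3.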

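The specific problems are:

  • You don't account for punctuation at all; and
  • When inserting the '*'s, you use the length of the "bad word" rather than the length of the word.

I would switch the loop order so that you only process the sentence once, and use enumerate rather than remove and insert: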

    def censor(sentence):
        badwords = ("test", "word") # consider making this an argument too
        sentence = sentence.split()
    
        for index, word in enumerate(sentence):
            if any(badword in word for badword in badwords):
                sentence[index] = "".join(['*' if c.isalpha() else c for c in word])
    
        return " ".join(sentence) # return rather than print
    
The str.isalpha test replaces only upper- and lower-case letters with asterisks. Demo:

    >>> censor("Censor these testing words, will you? Here's a test-case!")
    "Censor these ******* *****, will you? Here's a ****-****!"
                # ^ note length                         ^ note punctuation
    

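Note to potential editors.

Wrapping the bad words in regular-expression word boundaries would solve the punctuation problem.

@kojiro As you already pointed out very succinctly in the comments on the question, writing a censor like this will always be problematic, so there seems little point in complicating things further with regular expressions!

Thanks for the help! I'm new to Python, so I haven't used "any" and enumerate yet, but I'll keep working with them.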
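
For reference, the word-boundary idea raised in the comments could look roughly like the sketch below. It is not code from the thread: the name censor_regex and the default word list are assumptions. The sketch censors whole words only, so "appletini" is left untouched, unlike the answer above, which stars out every word containing a bad word. Matching is case-sensitive as written.

```python
import re


def censor_regex(sentence, badwords=("apple", "orange", "banana")):
    # \b matches only at a word boundary, so "apple" matches in "apple."
    # but not inside "appletini". re.escape protects against bad words
    # that contain regex metacharacters.
    alternatives = "|".join(re.escape(word) for word in badwords)
    pattern = re.compile(r"\b(?:" + alternatives + r")\b")
    # Swap each match for asterisks of the same length; the surrounding
    # punctuation is not part of the match and is kept as-is.
    return pattern.sub(lambda match: "*" * len(match.group(0)), sentence)


censored = censor_regex(
    "you are an appletini and apple. new sentence: an orange is a banana. orange test."
)
# censored == "you are an appletini and *****. new sentence: an ****** is a ******. ****** test."
```

Passing flags=re.IGNORECASE to re.compile would also catch capitalised forms, if that is wanted.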