Python: stop word removal not working correctly

Tags: python, regex, string, split, stop-words

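Any idea why my stop word removal is not working correctly? It replaces things it should not: sometimes it strips, say, the "a" out of "an", and it fails to treat "it's" as a single word.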

stop_words=open("stopwords.txt")
stop_words=stop_words.read().split("\n")
print stop_words
for line in splitted_tweets:
    #print line
    #print "***************************************"
    if (line.__contains__("text='")):
        start_index=line.index("text='")+6
        end_index=line.index("',", start_index)
        tweet=line[start_index:end_index]
        print tweet
        print "**********"
        tweet_words = re.sub("[^\w]", " " , tweet).split()
        print tweet_words
        for word in stop_words:
                if word in tweet_words:
                        print word
                        tweet=tweet.replace(word, "")

        print "?????????????????????????"
        print tweet
Here is some sample output:

['RT', 'sayingsforgirls', 'Do', 'not', 'touch', 'MY', 'iPhone', 'It', 's', 'not', 'an', 'usPhone', 'it', 's', 'not', 'a', 'wePhone', 'it', 's', 'not', 'an', 'ourPhone', 'it', 's', 'an', 'iPhone']
a
an
it
not
?????????????????????????
RT @syingsforgirls: Do  touch MY iPhone. It's  n usPhone, 's   wePhone, 's  n ourPhone, 's n iPhone.
Do not touch MY iPhone. It's not an usPhone, it's not a wePhone, it's not an ourPhone, it's an iPhone.
**********
['Do', 'not', 'touch', 'MY', 'iPhone', 'It', 's', 'not', 'an', 'usPhone', 'it', 's', 'not', 'a', 'wePhone', 'it', 's', 'not', 'an', 'ourPhone', 'it', 's', 'an', 'iPhone']
a
an
it
not
?????????????????????????
Do  touch MY iPhone. It's  n usPhone, 's   wePhone, 's  n ourPhone, 's n iPhone.
RT @BrianaaSymonee: she says imma dog, but it takes one to know one...
**********
['RT', 'BrianaaSymonee', 'she', 'says', 'imma', 'dog', 'but', 'it', 'takes', 'one', 'to', 'know', 'one']
but
it
she
to
?????????????????????????
RT @BrianaaSymonee:  says imma dog,   takes one  know one...
she says imma dog, but it takes one to know one...
**********
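
A note on why the output above looks the way it does: `tweet.replace(word, "")` removes every occurrence of the substring, so "a" is also stripped out of "an" and "sayingsforgirls", and "it" out of "it's". The sketch below reproduces that behaviour and contrasts it with a regex that only matches whole words. It is written for Python 3 (the question uses Python 2 `print` statements); `naive_remove` and `remove_whole_words` are hypothetical helper names, and treating an apostrophe as part of a word is an assumption about what the asker wants.

```python
import re

def naive_remove(tweet, stop_words):
    # Mirrors the loop in the question: str.replace deletes the substring
    # wherever it occurs, including inside longer words.
    for word in stop_words:
        tweet = tweet.replace(word, "")
    return tweet

def remove_whole_words(tweet, stop_words):
    # Match a stop word only when it is not attached to a word character
    # or an apostrophe on either side (assumption: "it's" is one word).
    alternatives = "|".join(re.escape(w) for w in stop_words if w)
    pattern = re.compile(r"(?<![\w'])(?:" + alternatives + r")(?![\w'])")
    return pattern.sub("", tweet)

stop_words = ["a", "an", "it", "not"]
tweet = "it's not an usPhone, it's an iPhone"

print(naive_remove(tweet, stop_words))        # damages "an" and "it's"
print(remove_whole_words(tweet, stop_words))  # leaves "it's" intact
```

The second version still leaves extra spaces where words were removed; collapsing them with `" ".join(result.split())` is one option.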

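Slightly O/T, but line.__contains__("text='") would normally be written "text='" in line.
Thanks, but that line is working!
...I didn't say it wasn't, but following the conventions of the language you are using makes your code easier to read, understand and debug.
Please take a look at how to write a minimal example, and trim this down to one.
What result do you expect?
I'm with @jornsharpe. If you want a proper answer, you need to explain exactly what the problem is.
@Kasramvd If you look at the problem I mentioned, e.g. "it's", it gets split into "it" and "s", which is incorrect. I want it treated as a single word.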
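
Following up on the last comment, one possible approach is to tokenise the tweet so that apostrophes stay inside words, then filter the tokens rather than editing the tweet string. This is only a sketch in Python 3, not the asker's code: `remove_stop_words` is a hypothetical helper, the case-insensitive comparison is an assumption, and punctuation is discarded as a side effect of tokenising. It also uses the `in` operator suggested in the first comment.

```python
import re

def remove_stop_words(tweet, stop_words):
    # A set gives fast membership tests; empty strings left over from
    # split("\n") on the stop word file are skipped.
    stop_set = {w.lower() for w in stop_words if w}
    # [\w']+ keeps "it's" as one token instead of "it" and "s".
    tokens = re.findall(r"[\w']+", tweet)
    # Assumption: stop words are compared case-insensitively.
    return [t for t in tokens if t.lower() not in stop_set]

stop_words = ["a", "an", "it", "not", ""]
line = "text='she says imma dog, but it takes one to know one...',"

# Idiomatic form of line.__contains__("text='")
if "text='" in line:
    tweet = "Do not touch MY iPhone. It's not an usPhone, it's an iPhone."
    kept = remove_stop_words(tweet, stop_words)
    print(" ".join(kept))
```

With this tokeniser, "it's" is only removed if "it's" itself appears in the stop word list.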