Stemming a CSV file in Python

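OK, I wrote this code in Python, which imports two CSV files. The first CSV file is named "claims" (one column, many rows) and the other is named "sexualharsament" (one column, many rows). The program checks every "claims" row to see whether it contains any of the words in "sexualharsament" and, if it does, writes that row to a new CSV file named "output". It also removes certain stop words that I chose.

Comments:

Iterate over each word in the row and call the stem method on it.

How exactly would that work? @PadraicCunningham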

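    from nltk import PorterStemmer
    # stem() is the method in current NLTK releases; older code used stem_word()
    PorterStemmer().stem('discriminated')
    # returns 'discrimin' (the Porter stem, not the dictionary form)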
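
The suggestion to iterate over each word in the row and call the stem method on it could look roughly like the sketch below. The helper name `stem_row` is made up for illustration, and `strip_ed` is only a crude stand-in stemmer (an assumption, so that the snippet runs without NLTK installed); it is not the Porter algorithm.

```python
def stem_row(row, stem):
    """Return the row with every whitespace-separated word replaced by stem(word)."""
    return " ".join(stem(word) for word in row.split())


def strip_ed(word):
    # Hypothetical stand-in stemmer: drops a trailing "ed" and nothing else
    return word[:-2] if word.endswith("ed") else word


print(stem_row("she was harassed and discriminated", strip_ed))
```

With NLTK available, passing `PorterStemmer().stem` as the second argument should apply the real Porter stemmer to each word instead.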

    import csv

    # file3 (stopwords.csv) is opened but never read; the stop words are hard-coded below
    with open("claims.csv") as file1, open("masterlist.csv") as file2, \
            open("stopwords.csv") as file3, \
            open("output.csv", "w", newline="") as file4:  # Python 3: text mode instead of "wb+"
        writer = csv.writer(file4)
        # One keyword per line; blank lines are skipped so they cannot match every row
        key_words = [word.strip() for word in file2 if word.strip()]
        # Words are padded with spaces so that only whole words are replaced
        stop_words = [' also ', ' although ', ' always ', ' and ', ' any ', ' are ', ' as ', ' at ',
                      ' around ', ' be ', ' by ', ' for ', ' from ', ' has ', ' on ', ' that ', ' were ', ' will ',
                      ' with ', ' can ', ' cannot ', ' if ', ' it ', ' the ', ' there ', ' which ', ' in ', ' is ',
                      ' its ', ' me ', ' of ', ' was ', ' then ', ' a ', ' an ', ' to ', ' when ',
                      ' however ', '"', ',', '.', '-', '?', '!', '(', ')']
        for row in file1:
            row = row.strip().lower()
            # Replace every stop word and punctuation mark with a single space
            for stopword in stop_words:
                row = row.replace(stopword, " ")
            # Write the first keyword found in the row, then move on to the next row
            for key in key_words:
                if key in row:
                    writer.writerow([key, row])
                    break
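
One way to fold stemming into the keyword check is to compare stems rather than raw substrings, so that a keyword such as "harass" also matches "harassed". The following is only a sketch under several assumptions: `find_matches` and `crude_stem` are hypothetical names, every keyword is a single word, the data is in memory rather than in the CSV files, and `crude_stem` is a placeholder for a real stemmer such as `nltk.stem.PorterStemmer().stem`.

```python
import csv
import io
import re


def find_matches(rows, key_words, stop_words, stem=lambda word: word):
    """Yield (key, cleaned_row) for each row that contains a keyword after stemming."""
    # Stem the keywords once, remembering the original spelling of each
    stemmed_keys = {stem(key.lower()): key for key in key_words}
    for row in rows:
        # Split into lowercase words; the pattern drops punctuation
        words = [w for w in re.findall(r"[a-z']+", row.lower()) if w not in stop_words]
        for word in words:
            key = stemmed_keys.get(stem(word))
            if key is not None:
                yield key, " ".join(words)
                break  # at most one output line per row


def crude_stem(word):
    # Stand-in stemmer (assumption): drops a trailing "ed" and nothing else
    return word[:-2] if word.endswith("ed") else word


claims = ["She was harassed at work.", "The invoice is late."]
output = io.StringIO()  # stands in for output.csv
writer = csv.writer(output)
for key, row in find_matches(claims, ["harass"], {"was", "at", "the", "is"}, crude_stem):
    writer.writerow([key, row])
print(output.getvalue())
```

Holding the stop words in a set and filtering whole tokens also catches stop words at the very start or end of a row, which the space-padded `replace` calls can miss.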