C++: How can I hash information from a text file?

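What I'm trying to do is read a line from a text file, break it up into its component words, and then check each word against a list of "bad words" that I don't want to hash. Every "good word" that is not on the bad-word list should be hashed, and the whole line should be stored at its index (if that makes sense). For example, "Ring of Fire" would be split into "Ring", "of", and "Fire". I would hash "Ring" and store "Ring of Fire" under it, I would see "of", notice that it is a bad word, and skip it, and finally I would hash "Fire" and store "Ring of Fire" under it as well.

As it stands, my code splits a line into words, compares each one against the bad words, and displays all of the good words. It then closes the file, reopens it, and displays all of the lines. What I'm having trouble conceptualizing is how to combine the two, so that I hash all of the good words and have the whole line available at the same time and can easily store them. How should I go about this?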

#include <cstring>
#include <cctype>
#include <iostream>
#include <fstream>
using namespace std;

int main()
{
    const char * bad_words[] = {"of", "the", "a", "for", "to", "in", "it", "on", "and"};
    ifstream file;
    file.open("songs.txt");
    //if(!file.is_open()) return;
    char word[50];

    while(file >> word)
    {
        // if word == bad word, dont hash
        // else hash and store it in my hash table
        bool badword = false;
        for(int i = 0; i < 9; ++i)
        {
            if(strcmp(word, bad_words[i]) == 0)
            {
                badword = true;
            }
        }

        if(badword) continue;
        else
        {
            // get all words in a line that are not in bad_words
            char * good_word = new char[strlen(word)+1];
            strcpy(good_word, word);
            cout << good_word << endl;  // testing to see if works

            // hash each good_word, store good_line in both of them

            //int index = Hash(good_word);
            //Add(good_line) @ table[index];
        }
    }

    file.close();
    file.open("songs.txt");
    while(!file.eof())  // go through file, grab each whole line. store it under the hash of good_word (above)
    {
        char line[50];
        file.getline(line, 50, '\n');
        char * good_line = new char[strlen(line)+1];
        strcpy(good_line, line);
        cout << good_line << endl;  // testing to see if works
    }

    return 0;
}

You seem to be looking for std::unordered_multimap.

I'd probably also sort the set of "bad" words and use std::binary_search to see whether it contains a particular word:
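#include <algorithm>
#include <fstream>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

int main() {
    std::vector<std::string> bad { "a", "and", "for" /* ... keep sorted */ };

    std::unordered_multimap<std::string, std::string> index;

    std::ifstream infile("songs.txt");
    std::string line;
    while (std::getline(infile, line)) {
        std::istringstream buf(line);
        std::string word;
        while (buf >> word)
            if (!std::binary_search(bad.begin(), bad.end(), word))
                index.insert(std::make_pair(word, line));
    }
}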
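
To show how the stored lines could be read back out, here is a minimal sketch built on the same idea. The names `Index`, `build_index`, and `lines_for` are hypothetical helpers introduced only for illustration, and the function takes any `std::istream` so it can be tried without a file. It assumes the bad-word vector is already sorted and that matching is case-sensitive.

```cpp
#include <algorithm>
#include <istream>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

using Index = std::unordered_multimap<std::string, std::string>;

// Build a word -> line index from any input stream, skipping bad words.
// `bad` must be sorted, because std::binary_search requires a sorted range.
Index build_index(std::istream& in, const std::vector<std::string>& bad) {
    Index index;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream buf(line);
        std::string word;
        while (buf >> word)
            if (!std::binary_search(bad.begin(), bad.end(), word))
                index.insert(std::make_pair(word, line));
    }
    return index;
}

// Collect every line stored under `word` (hypothetical helper).
std::vector<std::string> lines_for(const Index& index, const std::string& word) {
    std::vector<std::string> result;
    auto range = index.equal_range(word);  // all entries whose key equals `word`
    for (auto it = range.first; it != range.second; ++it)
        result.push_back(it->second);
    return result;
}
```

Passing an open `std::ifstream` for songs.txt as the first argument would index the file from the question. The order in which `lines_for` returns lines for the same word is not something the multimap promises, so callers should not rely on it.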

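If you really must implement your own hash table, you can look up a description of the hash table data structure.
In its simplest form, a hash table is an array of linked lists. The array is indexed with hashCode % arraySize, and the linked lists handle hash collisions.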
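
The array-of-linked-lists layout described above could be sketched roughly as follows. `LineTable` and its members are hypothetical names, the multiply-by-31 string hash is only an illustrative choice, and the sketch assumes a bucket count greater than zero and never resizes.

```cpp
#include <cstddef>
#include <forward_list>
#include <string>
#include <utility>
#include <vector>

// Minimal chained hash table: an array of linked lists (buckets).
class LineTable {
public:
    // Assumes bucket_count > 0; the table never grows in this sketch.
    explicit LineTable(std::size_t bucket_count) : buckets_(bucket_count) {}

    // Store `line` under `word`. The same word may be added many times.
    void add(const std::string& word, const std::string& line) {
        buckets_[bucket_of(word)].emplace_front(word, line);
    }

    // Return every line stored under `word`.
    std::vector<std::string> find(const std::string& word) const {
        std::vector<std::string> result;
        for (const auto& entry : buckets_[bucket_of(word)])
            if (entry.first == word)  // compare the keys, not just the bucket
                result.push_back(entry.second);
        return result;
    }

private:
    // Illustrative string hash: multiply by 31 and add each character.
    static std::size_t hash_code(const std::string& s) {
        std::size_t h = 0;
        for (unsigned char c : s) h = h * 31 + c;
        return h;
    }

    // hashCode % arraySize picks the bucket.
    std::size_t bucket_of(const std::string& word) const {
        return hash_code(word) % buckets_.size();
    }

    std::vector<std::forward_list<std::pair<std::string, std::string>>> buckets_;
};
```

Each bucket entry keeps the word next to the line, so two different words that land in the same bucket can still be told apart during lookup. Constructing the table with a single bucket is an easy way to check that: every word collides, yet `find` still returns only the lines for the word asked about.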
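Life will be better if you switch from char[] to std::string.
When you step through with a debugger, which line causes the problem?
If you work with files in C++, try the Qt library with QFrm. It will make your life easier.
What do you plan to do when two different strings produce the same hash code? That could be a problem if you store only the hash code.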
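
As a rough illustration of the std::string suggestion in the comments, the bad-word check from the question could look like this. `is_bad_word` is a hypothetical helper, and using std::unordered_set for the list is an assumption of this sketch rather than something the thread prescribes.

```cpp
#include <string>
#include <unordered_set>

// Returns true if `word` is one of the bad words listed in the question.
// The comparison is case-sensitive, as strcmp was in the original code.
bool is_bad_word(const std::string& word) {
    static const std::unordered_set<std::string> bad_words = {
        "of", "the", "a", "for", "to", "in", "it", "on", "and"};
    return bad_words.count(word) != 0;
}
```

With std::string there is no fixed 50-character buffer to overflow and no new[] and strcpy copy to manage for each word.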