C++: string gets garbage at the end after converting to c_str()

c++ file file-io


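This is homework, just so everyone who is wondering knows.

I'm writing a vocabulary translation program (English -> German and vice versa), and I'm supposed to save everything the user does to a file. Simple enough.

Here is the code: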

std::string file_name(user_name + ".reg");
std::ifstream file(file_name.c_str(), std::ios::binary | std::ios::ate);
// At this point, we have already verified the file exists. This shouldn't ever throw!
// Possible scenario:  user deletes file between calls.
assert( file.is_open() );

// Get the length of the file and reset the seek.
size_t length = file.tellg();
file.seekg(0, std::ios::beg);

// Create and write to the buffer.
char *buffer = new char[length];
file.read(buffer, length);
file.close();

// Find the last comma, after which comes the current dictionary.
std::string strBuffer = buffer;
size_t position = strBuffer.find_last_of(',') + 1;
curr_dict_ = strBuffer.substr(position);

// Start the trainer; import the dictionary.
trainer_.reset( new Trainer(curr_dict_.c_str()) );

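Apparently the problem is with curr_dict_, which is supposed to store the dictionary name. For example, my teacher has a dictionary file named

10WS_PG2_P4_de_en_gefuehle.txt

The Trainer imports the entire contents of the dictionary file like this: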

std::string s_word_de;
std::string s_word_en;
std::string s_discard;
std::string s_count;
int i_word;

std::ifstream in(dictionaryDescriptor);

if( in.is_open() )
{
    getline(in, s_discard); // Discard first line.
    while( in >> i_word &&
        getline(in, s_word_de, '<') &&
        getline(in, s_discard, '>') &&
        getline(in, s_word_en, '(') &&
        getline(in, s_count, ')') )
    {   
        dict_.push_back(NumPair(s_word_de.c_str(), s_word_en.c_str(), Utility::lexical_cast<int, std::string>(s_count)));
    }
}
else
    std::cout << dictionaryDescriptor;

What am I doing wrong?

The read function (as in the line

file.read(buffer, length);

) does not nul-terminate the character buffer. You need to do that manually: allocate one extra character, and after reading put the nul at position gcount(), that is, right after the last character actually read.
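A minimal sketch of that fix, assuming the same open/tellg/seekg sequence as in the question. The function name read_file_nul_terminated is hypothetical and does not appear in the original code, and a std::vector<char> stands in for the raw new char[] so the buffer is released automatically.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): reads a whole file through
// a manually nul-terminated char buffer and returns it as a std::string.
std::string read_file_nul_terminated(const std::string& file_name)
{
    std::ifstream file(file_name.c_str(), std::ios::binary | std::ios::ate);
    if (!file.is_open())
        return std::string();

    // Opened with ios::ate, so tellg() gives the file length.
    const std::streamoff end = file.tellg();
    if (end <= 0)
        return std::string();
    const std::size_t length = static_cast<std::size_t>(end);
    file.seekg(0, std::ios::beg);

    // One extra element leaves room for the terminator.
    std::vector<char> buffer(length + 1);
    file.read(&buffer[0], static_cast<std::streamsize>(length));

    // gcount() is the number of characters the last read() actually extracted.
    const std::size_t got = static_cast<std::size_t>(file.gcount());
    assert(got <= length);
    buffer[got] = '\0';

    // Safe now: the const char* constructor stops at the terminator we wrote.
    return std::string(&buffer[0]);
}
```

With the terminator in place, constructing a std::string from the buffer no longer runs past the data, so find_last_of(',') and substr operate only on the file's contents. Note that this conversion still stops at the first nul byte if the file itself contains one.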

I would do it this way:

std::string strBuffer(length, '\0');
myread(file, &strBuffer[0], length); // guaranteed to read length bytes from file into buffer

This avoids the need for an intermediate buffer entirely.
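The answer does not show myread itself, so the following is only one possible shape for it, under stated assumptions: read_fully and read_file_into_string are hypothetical names, the loop keeps reading until the requested byte count has arrived or the stream stops delivering data, and the string is shrunk afterwards in case fewer bytes arrived than tellg() promised.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <istream>
#include <string>

// Hypothetical stand-in for the answer's myread: loops until `length` bytes
// have been stored in `dest` or the stream fails. Returns the bytes stored.
std::size_t read_fully(std::istream& in, char* dest, std::size_t length)
{
    std::size_t total = 0;
    while (total < length && in) {
        in.read(dest + total, static_cast<std::streamsize>(length - total));
        const std::streamsize got = in.gcount();
        if (got <= 0)
            break;
        total += static_cast<std::size_t>(got);
    }
    return total;
}

// Reads the whole file straight into a std::string, with no char buffer.
std::string read_file_into_string(const std::string& file_name)
{
    std::ifstream file(file_name.c_str(), std::ios::binary | std::ios::ate);
    if (!file.is_open())
        return std::string();

    const std::streamoff end = file.tellg();
    if (end <= 0)
        return std::string();
    file.seekg(0, std::ios::beg);

    // The string tracks its own length, so no terminator bookkeeping is needed.
    std::string contents(static_cast<std::size_t>(end), '\0');
    const std::size_t got = read_fully(file, &contents[0], contents.size());
    assert(got <= contents.size());
    contents.resize(got); // drop any tail that was never filled
    return contents;
}
```

Because std::string stores its length separately from its data, this version also preserves embedded nul bytes, which the char-buffer approach would truncate.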


This statement leaks: char *buffer = new char[length]; Prefer std::vector<char> buffer(length); or, better still, don't read into a char buffer at all. (As a bonus, that also prevents this particular bug...)

@Martin: Thanks, fixed. @Rudolph: Looks sweet; I'll try to integrate it.

So I haven't actually used gcount yet, but from what cplusplus tells me, it simply returns the position of the last character read. Great! How do I use it to get a null terminator into the char*?

You need to put it in manually (e.g. buffer[file.gcount()] = 0;). This is also why you need to allocate one extra character (whether you use the char buffer or the vector approach).

-1: Relies on the internals of the std::string implementation. Any std::string implemented with non-contiguous storage will fail. See …

@Zan Lynx: I beg to differ. 1) C++03 does require &str[0] to return a pointer to contiguous storage. 2) The string's length is not affected by the read, because the string maintains its length separately from its data (i.e. it does not rely on the string being '\0'-terminated). The reason data(), c_str() and operator[] behave this way is to allow (but not require) implementations to provide a reference-counted version of string. So please remove the erroneous -1.

Fine, operator[] will return contiguous storage. However, the length is still wrong. It will be set to the length of the file, but there is no guarantee that the read actually read that much data.

I assumed the user knows that a read may return before it completes, and that a loop is therefore needed to read the full file. That is a separate issue. The example code is (as stated above) code without error checking, so it provides exactly the same functionality as the OP's original code. But to be pedantic, I have added the loop required to guarantee that the file is read into the buffer.

You should also consider the case where the file's contents are rewritten between getting the file length and reading the data.
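One way to act on the "don't read into a char buffer at all" advice, and to avoid depending on a length measured before the read, is to let stream-buffer iterators build the string. This is only a sketch, not something proposed in the thread: slurp and after_last_comma are hypothetical names, and the iterator approach can be slower than a single bulk read on large files.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: reads everything the stream delivers into a string.
// No length is queried up front, so there is no stale size to trust.
std::string slurp(const std::string& file_name)
{
    std::ifstream file(file_name.c_str(), std::ios::binary);
    if (!file.is_open())
        return std::string();
    return std::string(std::istreambuf_iterator<char>(file),
                       std::istreambuf_iterator<char>());
}

// Hypothetical helper: returns the text after the last comma, or the whole
// text when there is no comma at all.
std::string after_last_comma(const std::string& text)
{
    const std::string::size_type comma = text.find_last_of(',');
    return comma == std::string::npos ? text : text.substr(comma + 1);
}
```

In the question's terms, curr_dict_ could then be set with something like after_last_comma(slurp(file_name)), assuming the .reg file keeps the current dictionary name after its last comma.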