C++: Why does reading a file with istreambuf_iterator get faster with repeated execution?

c++ visual-c++

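I was looking for a way to read an entire file into a string. I found a few techniques online and decided to benchmark two of them, but the results were strange.

I am using Visual Studio Community 2019 (version 16.0.3) on a Windows 10 laptop. The file "my_text.txt" is 2235259 characters long and 2.183 MB in size.

Here is the complete code: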

#include <chrono>
#include <fstream>
#include <iostream>
#include <iterator>   // std::istreambuf_iterator
#include <stdexcept>  // std::runtime_error
#include <string>

// first technique
void read_string_1(std::ifstream& fstr, std::string& result)
{
    fstr.seekg(0, std::ios::end);
    size_t length = fstr.tellg();
    fstr.seekg(0);
    result = std::string(length, '\0');  // allocate exactly `length` bytes
    fstr.read(&result[0], length);
}

// second technique
void read_string_2(std::ifstream& fstr, std::string& result)
{
    result = std::string( (std::istreambuf_iterator<char>(fstr)), (std::istreambuf_iterator<char>()) );
}

int main()
{
    std::ifstream ifile{ "my_text.txt", std::ios_base::binary };
    if (!ifile)
        throw std::runtime_error("Error!");

    std::string content;

    for (int i = 0; i < 10; ++i)
    {
        std::chrono::high_resolution_clock::time_point p1 = std::chrono::high_resolution_clock::now();
        read_string_1(ifile, content);
        std::chrono::high_resolution_clock::time_point p2 = std::chrono::high_resolution_clock::now();
        auto duration1 = std::chrono::duration_cast<std::chrono::microseconds>(p2 - p1).count();
        std::cout << "M1:" << duration1 << std::endl;
    }

    for (int i = 0; i < 10; ++i)
    {
        std::chrono::high_resolution_clock::time_point p3 = std::chrono::high_resolution_clock::now();
        read_string_2(ifile, content);
        std::chrono::high_resolution_clock::time_point p4 = std::chrono::high_resolution_clock::now();
        auto duration2 = std::chrono::duration_cast<std::chrono::microseconds>(p4 - p3).count();
        std::cout << "M2:" << duration2 << std::endl;
    }

    return 0;
}

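Case 1: read_string_1() is called first, then read_string_2(), as in the code above.

Case 2: read_string_2() is called first, then read_string_1().

The timings for both cases, in microseconds, are listed at the end of this post.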

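Of course, the results differ from run to run, but they follow a general pattern. As you can see, read_string_1() is very consistent, but the execution time of read_string_2() is puzzling. Why does repeated execution get faster in both cases? Why, in case 2, does the first run take so long? What is happening in the background? Am I doing something wrong? And finally, which function is faster, read_string_1() or read_string_2()?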

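Execution gets faster because of caching.

With seeking, it takes time to move through the file, so although some things are cached, the difference is not large. With a direct read, the file contents themselves can be cached, so reading them again is just a pointer into cached memory.

The time the first attempt takes depends on what is already in the cache and on the operation itself.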
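One detail is worth checking alongside the caching explanation. read_string_2() in the question reads from the stream's current position and never seeks back to the start, so a second call on the same std::ifstream finds the stream already at end of file and returns an empty string. Timings that drop to a dozen or so microseconds may reflect that rather than a faster read. The sketch below shows the effect in isolation; `read_from_current_position` and `read_from_start` are illustrative names, not part of the original post.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Same body as read_string_2() in the question: reads from the current
// stream position to end of file and does not seek back afterwards.
std::string read_from_current_position(std::ifstream& fstr)
{
    return std::string(std::istreambuf_iterator<char>(fstr),
                       std::istreambuf_iterator<char>());
}

// Hypothetical variant: clear any error flags and rewind first, so that
// every call reads the whole file again.
std::string read_from_start(std::ifstream& fstr)
{
    fstr.clear();
    fstr.seekg(0, std::ios::beg);
    return read_from_current_position(fstr);
}
```

Calling `read_from_current_position` twice on the same open stream should return the file contents the first time and an empty string the second time, whereas `read_from_start` should return the full contents on every call. Printing `content.size()` after each timed call in the benchmark is a cheap way to confirm which situation applies.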

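Generally, the fewer reads you perform, the faster the file is loaded into memory. Since you want the whole file, why not perform a single read? If Windows supports it as well, you could use file mapping.

Cache hot vs. cache cold.

So, to conclude: 1) if the cache is empty, read_string_1() is much faster than read_string_2(); 2) to measure execution time, it is better to call the function once and run the program itself many times. Is that right?

1) Yes. 2) Not really. You need a cold start, not a repeat of the same process. You could try rebooting, or randomizing the measurements.

OK, I see. Thanks.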
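The single-read suggestion from the comments can be sketched as follows, next to an iterator-based reader and a small timing helper. This is a minimal sketch under stated assumptions: `read_bulk`, `read_iter`, and `time_us` are hypothetical names, each reader opens the file itself so that every call starts at position 0, and the measured time therefore includes opening the file. It does not control whether the operating system's file cache is hot or cold.

```cpp
#include <chrono>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Single bulk read: find the file size, allocate once, read once.
std::string read_bulk(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    if (!in) return {};
    in.seekg(0, std::ios::end);
    const std::streamoff size = in.tellg();
    in.seekg(0, std::ios::beg);
    if (size <= 0) return {};
    std::string result(static_cast<std::size_t>(size), '\0');
    in.read(&result[0], static_cast<std::streamsize>(size));
    // Keep only the bytes that were actually read.
    result.resize(static_cast<std::size_t>(in.gcount()));
    return result;
}

// Iterator-based read: pulls characters from the stream buffer one at a time.
std::string read_iter(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Times one call of reader(path) in microseconds and stores the result in
// `out`, so the caller can check that the expected number of bytes was read.
template <typename Reader>
long long time_us(Reader reader, const std::string& path, std::string& out)
{
    const auto t0 = std::chrono::steady_clock::now();
    out = reader(path);
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count();
}
```

A call such as `time_us(read_bulk, "my_text.txt", content)` followed by a check of `content.size()` makes it harder to mistake an empty read for a fast one.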

Output for case 1 (read_string_1() first, then read_string_2()), in microseconds:

M1:7389
M1:8821
M1:6303
M1:6725
M1:5951
M1:8097
M1:5651
M1:6156
M1:6110
M1:5848
M2:827
M2:15
M2:15
M2:15
M2:14
M2:13
M2:14
M2:13
M2:14
M2:14

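Output for case 2 (read_string_2() first, then read_string_1()), in microseconds. Judging by the question text, the labels appear to follow loop order here, so M1 would be read_string_2() and M2 would be read_string_1():

M1:940311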
M1:352
M1:16
M1:13
M1:15
M1:15
M1:13
M1:13
M1:14
M1:14
M2:4668
M2:4761
M2:4881
M2:7446
M2:5050
M2:5572
M2:5255
M2:5108
M2:5234
M2:5072