How to convert QueryThreadCycleTime() to seconds?


c++ windows winapi


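The Windows function QueryThreadCycleTime() returns the number of CPU clock cycles used by a given thread. The Windows documentation boldly states:

Do not attempt to convert the CPU clock cycles returned by QueryThreadCycleTime to elapsed time.

I would like to do exactly this for most Intel and AMD x86_64 CPUs. It does not need to be very accurate, since you cannot expect perfection from a cycle counter anyway. I just need some kludgy way to get the time factor seconds / QueryThreadCycleTime for a CPU.

First, I assume that QueryThreadCycleTime uses RDTSC internally. I assume that some CPUs use a constant-rate TSC, so changing the actual clock rate (for example through variable-frequency CPU power management) does not affect the time/TSC factor. On other CPUs the rate may change, so I would have to query the factor periodically.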
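Whether a given x86 CPU advertises a constant-rate (invariant) TSC can be read from CPUID leaf 0x80000007, where bit 8 of EDX is the invariant-TSC flag. The following is a minimal sketch of that check. It only says something about the TSC itself; that QueryThreadCycleTime is TSC-based remains the assumption stated above. The names `edx_reports_invariant_tsc`, `cpu_has_invariant_tsc` and `SKETCH_HAVE_CPUID` are made up for illustration.

```cpp
#include <cstdint>

// Pick the CPUID intrinsic for the current compiler and architecture.
#if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_IX86))
#include <intrin.h>
#define SKETCH_HAVE_CPUID 1
#elif (defined(__GNUC__) || defined(__clang__)) && \
      (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#define SKETCH_HAVE_CPUID 1
#else
#define SKETCH_HAVE_CPUID 0
#endif

// Pure helper: bit 8 of EDX from CPUID leaf 0x80000007 is the invariant-TSC flag.
constexpr bool edx_reports_invariant_tsc(std::uint32_t edx) {
    return (edx & (1u << 8)) != 0;
}

// Returns true if the CPU advertises an invariant TSC. Returns false if it
// does not, or if CPUID is not available on this target.
bool cpu_has_invariant_tsc() {
#if SKETCH_HAVE_CPUID
  #if defined(_MSC_VER)
    int regs[4] = {0, 0, 0, 0};
    __cpuid(regs, static_cast<int>(0x80000000u));  // EAX = highest extended leaf
    if (static_cast<std::uint32_t>(regs[0]) < 0x80000007u) return false;
    __cpuid(regs, static_cast<int>(0x80000007u));
    return edx_reports_invariant_tsc(static_cast<std::uint32_t>(regs[3]));
  #else
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    // __get_cpuid returns 0 when the requested leaf is not supported.
    if (!__get_cpuid(0x80000007u, &eax, &ebx, &ecx, &edx)) return false;
    return edx_reports_invariant_tsc(edx);
  #endif
#else
    return false;
#endif
}
```

If `cpu_has_invariant_tsc()` returns true, a factor measured once is more likely to stay valid across frequency changes; if it returns false, periodic re-measurement as described above would be the cautious choice.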

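Why do I need this? Before anyone objects, I should note that I am not interested in alternative solutions, because I have two hard profiling requirements that no other method meets.

It should measure thread time only, so sleep(1) should not report 1 second, but a busy loop lasting 1 second should. In other words, the profiler should not say a task ran for 10 ms when its thread was active for only 1 ms. This is why I cannot use […].

It needs precision finer than 1/64 second, which is the precision provided by […]. The tasks I am profiling may run for only a few microseconds.

Minimal reproducible example

As requested by @Ted Lyngmo, the goal is to implement computeFactor().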

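#include <stdint.h>
#include <stdio.h>
#include <windows.h>

// To be implemented: seconds per cycle as counted by QueryThreadCycleTime
double computeFactor();

int main() {
    uint64_t start, end;
    QueryThreadCycleTime(GetCurrentThread(), &start);
    // Insert the task here, e.g. an actual workload or sleep(1)
    QueryThreadCycleTime(GetCurrentThread(), &end);
    printf("%lf\n", (end - start) * computeFactor());
    return 0;
}

Do not attempt to convert the CPU clock cycles returned by QueryThreadCycleTime to elapsed time.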

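I would like to do exactly this

Your wish is obviously denied.

One workaround is to create a thread with a steady clock that samples QueryThreadCycleTime and/or GetThreadTimes at a specified frequency. Here is an example that uses a sampling thread to sample both once per second: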

#include <algorithm>
#include <atomic>
#include <chrono>
#include <cstdint>
#include <iostream>
#include <iomanip>
#include <thread>
#include <vector>

#include <Windows.h>

using namespace std::literals::chrono_literals;

struct FTs_t {
    FILETIME CreationTime, ExitTime, KernelTime, UserTime;
    ULONG64 CycleTime;
};

using Sample = std::vector<FTs_t>;

std::ostream& operator<<(std::ostream& os, const FILETIME& ft) {
    // FILETIME is a 64-bit count of 100 ns units split into two 32-bit halves
    std::uint64_t bft = (std::uint64_t(ft.dwHighDateTime) << 32) + ft.dwLowDateTime;
    return os << bft;
}

std::ostream& operator<<(std::ostream& os, const Sample& smp) {
    size_t tno = 0;
    for (const auto& fts : smp) {
        os << " tno:" << std::setw(3) << tno << std::setw(10) << fts.KernelTime
           << std::setw(10) << fts.UserTime << std::setw(16) << fts.CycleTime << "\n";
        ++tno;
    }
    return os;
}

// the sampling thread
void ft_sampler(std::atomic<bool>& quit, std::vector<std::thread>& threads, std::vector<Sample>& samples) {
    auto tp = std::chrono::steady_clock::now(); // for steady sampling

    FTs_t fts;
    while (quit == false) {
        Sample s;
        s.reserve(threads.size());
        for (auto& th : threads) {
            if (QueryThreadCycleTime(th.native_handle(), &fts.CycleTime) &&
                GetThreadTimes(th.native_handle(), &fts.CreationTime,
                               &fts.ExitTime, &fts.KernelTime, &fts.UserTime)) {
                s.push_back(fts);
            }
        }
        samples.emplace_back(std::move(s));

        tp += 1s; // add a second since we last sampled and sleep until that time_point
        std::this_thread::sleep_until(tp);
    }
}

// a worker thread
void worker(std::atomic<bool>& quit, size_t payload) {
    volatile std::uintmax_t x = 0;
    while (quit == false) {
        for (size_t i = 0; i < payload; ++i) ++x;
        std::this_thread::sleep_for(1us);
    }
}

int main() {
    // brace-init: copy-initializing std::atomic from a bool needs C++17
    std::atomic<bool> quit_sampling{false}, quit_working{false};
    std::vector<std::thread> threads;
    std::vector<Sample> samples;
    size_t max_threads = std::thread::hardware_concurrency() > 1 ? std::thread::hardware_concurrency() - 1 : 1;

    // start some worker threads
    for (size_t tno = 0; tno < max_threads; ++tno) {
        threads.emplace_back(std::thread(&worker, std::ref(quit_working), (tno + 100) * 100000));
    }

    // start the sampling thread
    auto smplr = std::thread(&ft_sampler, std::ref(quit_sampling), std::ref(threads), std::ref(samples));

    // let the threads work for some time
    std::this_thread::sleep_for(10s);

    quit_sampling = true;
    smplr.join();

    quit_working = true;
    for (auto& th : threads) th.join();

    std::cout << "Took " << samples.size() << " samples\n";

    size_t s = 0;
    for (const auto& smp : samples) {
        std::cout << "Sample " << s << ":\n" << smp << "\n";
        ++s;
    }
}
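The sampler above only prints raw values. To get from two consecutive samples of the same thread to a seconds-per-cycle estimate, one option is to divide the growth in KernelTime + UserTime (FILETIME counts 100 ns units) by the growth in CycleTime. The sketch below is portable and illustrative only: `ThreadSample`, `filetime_ticks` and `seconds_per_cycle` are hypothetical names, and the struct is a plain-integer mirror of one `FTs_t` entry rather than part of the code above. Because GetThreadTimes is coarse, the estimate is only meaningful over intervals in which the thread was busy for much longer than that granularity.

```cpp
#include <cstdint>

// Hypothetical plain-integer mirror of one FTs_t entry from the sampler.
struct ThreadSample {
    std::uint64_t kernel_100ns;  // KernelTime in 100 ns units
    std::uint64_t user_100ns;    // UserTime in 100 ns units
    std::uint64_t cycles;        // CycleTime from QueryThreadCycleTime
};

// Combine the two 32-bit halves of a FILETIME into one 64-bit tick count.
constexpr std::uint64_t filetime_ticks(std::uint32_t high, std::uint32_t low) {
    return (static_cast<std::uint64_t>(high) << 32) | low;
}

// Seconds of thread CPU time per counted cycle between an earlier sample `a`
// and a later sample `b`. Returns 0.0 when no estimate is possible, e.g. when
// the thread was idle or the time did not advance within its granularity.
double seconds_per_cycle(const ThreadSample& a, const ThreadSample& b) {
    const std::uint64_t d_cycles = b.cycles - a.cycles;
    const std::uint64_t d_ticks =
        (b.kernel_100ns + b.user_100ns) - (a.kernel_100ns + a.user_100ns);
    if (d_cycles == 0 || d_ticks == 0) return 0.0;
    return (static_cast<double>(d_ticks) * 100e-9) / static_cast<double>(d_cycles);
}
```

With real data, each field would be filled via `filetime_ticks(ft.dwHighDateTime, ft.dwLowDateTime)` for the same thread index in two successive samples.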

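Not interested in alternatives to what solution? What is your current non-working solution, in code?

I don't think it is relevant to the question, but it is the current measurement method and how the timing is used, plus the end-user result.

The reason I asked for your current non-working solution was to get you to post it, so that it triggers people to answer. I personally think it is relevant, and without it I would not try to answer.

That's fair. You can try the result in production with the Windows version of VCV Rack 1.1.4 by enabling Engine > CPU meter.

When an API is documented as unsuitable for a particular purpose, and the publisher of that API knows the subject very well (MS isn't joeblow@mymomsbasement.com), you should accept that the information is accurate. Expecting that they know less than you do and that you can make it work anyway is somewhat unreasonable, and expecting us to do the work for you is even more so. What you are asking for will not work; you should change your stance and look for alternatives instead of wasting your time on it.

That is probably what I will do. In other words, compute an approximation of the limit of steady_clock / QueryThreadCycleTime as time -> infinity. What I don't understand is how to guarantee that the thread computing this approximation is not context-switched by the OS, which would disturb the steady clock measurement. I could compute it several times and take the minimum.

If you want guarantees, you need a real-time OS. As an observer, without an RTOS, you can use a steady clock to compensate. That part won't be too bad.

Hmm, now that I think about it, I can use Windows' GetThreadTimes for the measurement. It runs in the same time reference frame as QueryThreadCycleTime, unlike the steady clock, so the ratio between those two measurements is the factor I am after! I can start a thread that busy-waits for about 1 second, measure GetThreadTimes and QueryThreadCycleTime, divide them, and that is the factor!

That sounds like a better option, but I would suggest using steady_clock or high_resolution_clock and std::this_thread::sleep_until to do the actual sampling.

I'm not sure how that would work, because if a thread just yields as its workload, both GetThreadTimes and QueryThreadCycleTime return ~0. The whole point of these functions is not to count while the thread is inactive.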
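The busy-wait calibration idea from the comments could be sketched roughly as follows. This is an untested illustration, not a verified implementation: `calibrate_once` and `median_factor` are hypothetical names, and the Windows part is compiled only under `_WIN32`. The comments suggest taking the minimum of several runs; the median is used here instead, on the assumption that the coarse GetThreadTimes granularity can push a single run off in either direction.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

#ifdef _WIN32
#include <windows.h>

// Reads kernel + user CPU time of the calling thread in 100 ns units.
static bool thread_cpu_ticks(std::uint64_t& out) {
    FILETIME creation, exit_time, kernel, user;
    if (!GetThreadTimes(GetCurrentThread(), &creation, &exit_time, &kernel, &user))
        return false;
    auto ticks = [](const FILETIME& ft) {
        return (std::uint64_t(ft.dwHighDateTime) << 32) | ft.dwLowDateTime;
    };
    out = ticks(kernel) + ticks(user);
    return true;
}

// One calibration run: busy-spin until about `spin_ticks` (100 ns units) of
// thread CPU time has accumulated, then return seconds per counted cycle.
// Returns 0.0 if any API call fails.
double calibrate_once(std::uint64_t spin_ticks) {
    ULONG64 c0 = 0, c1 = 0;
    std::uint64_t t0 = 0, t1 = 0;
    if (!thread_cpu_ticks(t0) || !QueryThreadCycleTime(GetCurrentThread(), &c0))
        return 0.0;
    volatile std::uint64_t sink = 0;
    t1 = t0;
    while (t1 - t0 < spin_ticks) {
        for (int i = 0; i < 100000; ++i) sink = sink + 1;  // keep the thread busy
        if (!thread_cpu_ticks(t1)) return 0.0;
    }
    if (!QueryThreadCycleTime(GetCurrentThread(), &c1) || c1 == c0) return 0.0;
    return (double(t1 - t0) * 100e-9) / double(c1 - c0);
}
#endif

// Middle element of the successful (positive) runs, i.e. the upper median for
// even counts. Returns 0.0 if no run succeeded.
double median_factor(std::vector<double> runs) {
    runs.erase(std::remove_if(runs.begin(), runs.end(),
                              [](double f) { return !(f > 0.0); }),
               runs.end());
    if (runs.empty()) return 0.0;
    std::sort(runs.begin(), runs.end());
    return runs[runs.size() / 2];
}
```

On Windows one might collect, say, five values of `calibrate_once(10000000)` (about one second of CPU time each) and pass them to `median_factor`. Because the spin loop keeps the thread busy, both counters advance, which sidesteps the objection that a yielding thread would report roughly zero for both.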