C++: Optimizing this "coincidence search" algorithm for speed


c++ algorithm performance optimization


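I have written an algorithm that simulates the data produced by an experiment and then performs a "coincidence search" on that data (more on that below). The data in question is a vector of vectors whose elements are drawn from a Gaussian distribution (random numbers, more or less). Each "column" represents a "data stream", and each row represents an instant in time. The "position" of every element in the "array" must be preserved.

Algorithm:

The algorithm is designed to perform the following task:

Iterate over all n columns (data streams) simultaneously, and count the number of times that at least c unique columns contain an element whose absolute value is greater than some threshold, such that those elements lie within a specified time interval (i.e. a certain number of rows) of each other.

When this happens, we add one to a counter and then jump forward in time (by rows) by a specified amount. We then start over, until we have traversed the entire "array". Finally, we return the value of the counter (the "number of coincidences").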

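My solution:

I'll give the code first, then go through it piece by piece and explain how it works (hopefully clarifying some details along the way):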

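What I want is the column index, so I use std::distance to get it and store it in a std::set named cache. I chose std::set here because I am interested in counting the number of unique columns that have a value exceeding value_threshold within some time (i.e. row) interval. With a std::set, I can simply dump in the column index of every such value and duplicates are "automatically removed". Later, I can just check the size of cache: if it is greater than or equal to the specified number (num_columns), I have found a "coincidence".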
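The caching step described above can be sketched roughly as follows. This is a minimal illustration, not the original code: the helper name CollectExceedingColumns and its signature are assumptions, and only std::find_if, std::distance and std::set come from the description.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <iterator>
#include <set>
#include <vector>

// Hypothetical helper: adds to `cache` the index of every column in `row`
// whose absolute value is greater than `value_threshold`.
// std::set silently ignores indices that are already present.
void CollectExceedingColumns(const std::vector<double>& row,
                             double value_threshold,
                             std::set<std::size_t>& cache) {
  const auto exceeds = [value_threshold](double x) {
    return std::fabs(x) > value_threshold;
  };
  auto it = std::find_if(row.begin(), row.end(), exceeds);
  while (it != row.end()) {
    // The distance from the start of the row is the column index.
    cache.insert(static_cast<std::size_t>(std::distance(row.begin(), it)));
    it = std::find_if(std::next(it), row.end(), exceeds);
  }
}
```

Calling it on successive rows with the same cache accumulates the unique columns seen so far, so cache.size() is the number that gets compared against num_columns.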
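After collecting the column index of every value that exceeds value_threshold, I check the size of cache to see whether I have found enough unique columns. If so, I add one to coincidence_counter, clear cache, and then jump forward in "time" (i.e. rows) by a certain amount (here, 4004000 - time_counter). Note that I subtract time_counter, which tracks the "time" (number of rows) elapsed since the first value found to exceed value_threshold. I want to jump forward in time from that starting point.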
        if(size(cache) >= num_columns){

            ++coincidence_counter;
            cache.clear();

            if(distance(row_ctr, end(waveform)) > (4004000 - time_counter)){
                advance(row_ctr, ((4004000 - time_counter)));
            } else {
                return coincidence_counter;
            }

        }

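Finally, I check time_counter. Remember that the num_columns unique columns must lie within some time (i.e. row) threshold of each other. I start counting time from the first value found to exceed value_threshold. If I have exceeded the time threshold, I empty cache (with clear()) and start over, using the second value found to exceed value_threshold (if there is one) as the new first value, in the hope of finding a coincidence with that value as the starting point.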
        if(time_counter == time_threshold){
            row_itr -= (time_counter + 1);
            cache.clear();
        }

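Rather than tracking the time (i.e. row index) of every value found, I simply start over at the value one after the first value found (i.e. time_counter + 1).

I also add one to time_counter on every loop iteration in which the size of cache is nonzero (I want to count time, i.e. rows, starting from the first value found to exceed value_threshold).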
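Putting the steps described so far together, the whole search might look roughly like the sketch below. It is a reconstruction under assumptions, not the original code: it uses row indices instead of iterators, tracks the first exceeding row directly instead of a time_counter, and the function name and the jump and time_threshold parameters are chosen for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical sketch of the described search (not the original code).
// time_threshold: maximum number of rows a coincidence may span.
// jump: rows to skip after a coincidence, measured from the first
//       exceeding value of that coincidence.
std::size_t CountCoincidences(const std::vector<std::vector<double>>& waveform,
                              double value_threshold, std::size_t num_columns,
                              std::size_t time_threshold, std::size_t jump) {
  if (jump == 0) jump = 1;      // guard: always make forward progress
  std::size_t coincidence_counter = 0;
  std::set<std::size_t> cache;  // unique exceeding columns in this window
  bool window_open = false;
  std::size_t first_row = 0;    // row of the first exceeding value
  std::size_t row = 0;
  while (row < waveform.size()) {
    for (std::size_t col = 0; col < waveform[row].size(); ++col) {
      if (std::fabs(waveform[row][col]) > value_threshold) cache.insert(col);
    }
    if (cache.empty()) {  // nothing found yet, keep scanning
      ++row;
      continue;
    }
    if (!window_open) {   // this row holds the first exceeding value
      window_open = true;
      first_row = row;
    }
    if (cache.size() >= num_columns) {  // coincidence found
      ++coincidence_counter;
      cache.clear();
      window_open = false;
      if (jump >= waveform.size() - first_row) return coincidence_counter;
      row = first_row + jump;
      continue;
    }
    if (row - first_row + 1 >= time_threshold) {  // window expired
      cache.clear();
      window_open = false;
      row = first_row + 1;  // restart one row after the first value
      continue;
    }
    ++row;
  }
  return coincidence_counter;
}
```

Note that every expired window sends the scan back to first_row + 1, so rows inside a window can be read many times. With a large time threshold this rewinding is a plausible source of the long run time.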

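Attempted optimizations:

I'm not sure whether these help, hurt, or make no difference, but here is what I have tried (with little success):

I have replaced all int and unsigned int with size_t. My understanding is that this may be slightly faster, and these values should never be less than 0 anyway.

I have also used execution::par_unseq with std::find_if. I'm not sure how much this helps. The "array" typically has about 16-20 columns but a very large number of rows (on the order of 50000000 or more). Since std::find_if is "scanning" individual rows that contain at most a few dozen elements, parallelizing it probably does not help much.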

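Goal:

Unfortunately, the algorithm takes a very long time to run. My top priority is speed. If possible, I would like to cut the execution time in half.

Some things to keep in mind:

The "array" is typically ~20 columns by ~50000000 rows (sometimes longer). It contains very few 0s and cannot be rearranged (the order of the "rows" and of the elements within each row matters). Unsurprisingly, it takes up a lot of memory, so my machine is quite resource constrained.

I am also running this as interpreted C++ in cling. I have never used compiled C++ much in my work. I have tried compiling, but it did not help much. I have also tried playing with compiler optimization flags.

What can be done to cut down the execution time (at the expense of almost anything else)?

Please let me know if I can provide any other information to help answer the question.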
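This code looks like it may be limited by memory bandwidth, but I would try dropping the fancy algorithm machinery in favor of a windowed count. Untested C++: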
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

using std::fabs;
using std::size_t;
using std::vector;

size_t NumCoincidences(const vector<vector<double>> &array,
                       double value_threshold, size_t num_columns) {
  static constexpr size_t kWindowSize = 4004000;
  if (array.empty()) return 0;  // array[0] below needs at least one row
  const auto exceeds_threshold = [&](double x) {
    return fabs(x) >= value_threshold;
  };
  // First row that may belong to the current window.
  size_t start = 0;
  // Per column: how many values in the current window exceed the threshold.
  std::vector<size_t> num_exceeds_in_window(array[0].size());
  size_t num_coincidences = 0;
  for (size_t i = 0; i < array.size(); i++) {
    // Add the newest row to the window.
    const auto &row = array[i];
    for (size_t j = 0; j < row.size(); j++) {
      num_exceeds_in_window[j] += exceeds_threshold(row[j]) ? 1 : 0;
    }
    // Once the window is full, drop the row that just slid out of it.
    if (i >= start + kWindowSize) {
      const auto &old_row = array[i - kWindowSize];
      for (size_t j = 0; j < old_row.size(); j++) {
        num_exceeds_in_window[j] -= exceeds_threshold(old_row[j]) ? 1 : 0;
      }
    }
    // Count the columns with at least one exceeding value in the window.
    size_t total_exceeds_in_window = 0;
    for (size_t n : num_exceeds_in_window) {
      total_exceeds_in_window += n > 0 ? 1 : 0;
    }
    // Enough unique columns: record a coincidence and start a fresh window.
    if (total_exceeds_in_window >= num_columns) {
      start = i + 1;
      std::fill(num_exceeds_in_window.begin(), num_exceeds_in_window.end(), 0);
      num_coincidences++;
    }
  }
  return num_coincidences;
}
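To experiment with the windowed count on small inputs, a variant that takes the window length as a parameter can be handy. The sketch below is an illustration only: the name NumCoincidencesWindowed and the window_size parameter are assumptions (the code above hard-codes the window as kWindowSize), and it assumes every row has the same number of columns as the first.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical variant of the windowed count with a configurable window.
std::size_t NumCoincidencesWindowed(
    const std::vector<std::vector<double>>& array, double value_threshold,
    std::size_t num_columns, std::size_t window_size) {
  if (array.empty()) return 0;
  const auto exceeds = [value_threshold](double x) {
    return std::fabs(x) >= value_threshold;
  };
  std::size_t start = 0;  // first row that may belong to the window
  std::vector<std::size_t> in_window(array[0].size(), 0);
  std::size_t coincidences = 0;
  for (std::size_t i = 0; i < array.size(); ++i) {
    // Add the newest row.
    const auto& row = array[i];
    for (std::size_t j = 0; j < row.size(); ++j) {
      in_window[j] += exceeds(row[j]) ? 1 : 0;
    }
    // Remove the row that slid out of the window, if any.
    if (i >= start + window_size) {
      const auto& old_row = array[i - window_size];
      for (std::size_t j = 0; j < old_row.size(); ++j) {
        in_window[j] -= exceeds(old_row[j]) ? 1 : 0;
      }
    }
    // Count columns with at least one exceeding value in the window.
    std::size_t active = 0;
    for (std::size_t n : in_window) active += n > 0 ? 1 : 0;
    if (active >= num_columns) {
      start = i + 1;
      std::fill(in_window.begin(), in_window.end(), 0);
      ++coincidences;
    }
  }
  return coincidences;
}
```

Each row is read at most twice (once entering and once leaving the window), so the cost is linear in the number of rows regardless of the window length.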
