C++ std:：tuple比std:：array快？_C++_Arrays_Performance_Vector_Tuples

C++ std:：tuple比std:：array快？

c++ arrays performance vector

C++ std:：tuple比std:：array快？,c++,arrays,performance,vector,tuples,C++,Arrays,Performance,Vector,Tuples,我有一个固定列数为4的二维数组，但行数非常大。我发现使用vector而不是vector，或者比vector快2倍，快5倍。尽管使用元组是一件很头疼的事。我定义了一个访问函数来绕过它： typedef tuple<int,int,int,int> Tuple; // access the i-th element inline int &tget (Tuple &t, int i) { switch (i) { case 0: return st

我有一个固定列数为4的二维数组，但行数非常大。我发现使用

vector

而不是

vector

，或者比

vector

快2倍，快5倍。尽管使用

元组

是一件很头疼的事。我定义了一个访问函数来绕过它：

typedef tuple<int,int,int,int> Tuple;
// access the i-th element
inline int &tget (Tuple &t, int i) {
    switch (i) {
        case 0: return std::get<0>(t); 
        case 1: return std::get<1>(t); 
        case 2: return std::get<2>(t); 
        case 3: return std::get<3>(t); 
    }   
}
vector<Tuple> array; 
// this gives the element in i-th row and j-th column
tget(array[i],j);

这里还有一个我使用的测试输入。第一行表示有T=18个独立的测试用例。每个测试用例的第一行是N（最多20个），然后是N个整数。给你。该算法具有指数增长，因此较小的数字并不意味着它只是几个操作。在测量时间时，我还考虑了I/O：

18
8
132391 123854 21536 19482 133025 10945 121800 10311
8                                                                                                                                                             
12 4 12 4 4 12 12 24
8
131723 16253 23309 132227 125171 12253 16757 136227
8
8 4 4 8 4 8 12 24
8
12 12 12 8 4 8 12 28
8
115021 114654 112093 17443 20371 17810 17274 115190
8
8 8 4 4 12 4 12 20
8
12 4 4 4 4 12 12 28
8
8 12 8 12 8 4 12 28
8
4 12 8 12 8 8 4 24
20
34836 56869 24112 27796 198091 196472 50364 27364 29572 53701 52981 25779 202644 204884 53840 30056 48705 53092 43751 18419
20
207447 26867 41968 35977 36384 57042 40981 30191 47705 26907 248361 26003 53715 48237 27691 192772 60507 58194 20754 245913
20
34836 56869 24112 27796 198091 196472 50364 27364 29572 53701 52981 25779 202644 204884 53840 30056 48705 53092 43751 18419
20
207447 26867 41968 35977 36384 57042 40981 30191 47705 26907 248361 26003 53715 48237 27691 192772 60507 58194 20754 245913
20
4 2 2 4 4 2 4 2 4 2 2 4 4 3 3 4 2 3 4 1
20
2 3 2 2 3 2 3 4 3 2 4 4 4 3 2 3 4 4 3 3
20
4 4 4 4 3 3 3 2 2 4 2 4 2 2 4 2 2 2 2 1
20
4 4 2 3 4 3 4 3 2 4 4 2 2 4 3 4 3 3 2 4

另一个问题与您的问题类似：

它有代码和基准

基本上，元组创建了一个结构，编译器能够比在元素数组上更好地优化操作

我的猜测是瓶颈是

运算符。您可以展示性能提升的基准测试，包括编译器和编译器标志吗？我对您的发现感到非常惊讶。您是否启用了优化？如果您没有进行基准测试，那么您如何确定它的速度是基准测试的两倍？你一定测量过，像“简要解释”这样的东西是不够的，你肯定需要提供。基本上，我们不相信你。相同大小的元组
和数组
应该是相同的速度，因此我们认为还有其他不同之处。对于元组，更多的操作/比较同时发生。正如我在回答中所说：结构上的操作比数组上的操作优化得更好。@RobertAndrzejuk你说得对，all操作符
18
8
132391 123854 21536 19482 133025 10945 121800 10311
8                                                                                                                                                             
12 4 12 4 4 12 12 24
8
131723 16253 23309 132227 125171 12253 16757 136227
8
8 4 4 8 4 8 12 24
8
12 12 12 8 4 8 12 28
8
115021 114654 112093 17443 20371 17810 17274 115190
8
8 8 4 4 12 4 12 20
8
12 4 4 4 4 12 12 28
8
8 12 8 12 8 4 12 28
8
4 12 8 12 8 8 4 24
20
34836 56869 24112 27796 198091 196472 50364 27364 29572 53701 52981 25779 202644 204884 53840 30056 48705 53092 43751 18419
20
207447 26867 41968 35977 36384 57042 40981 30191 47705 26907 248361 26003 53715 48237 27691 192772 60507 58194 20754 245913
20
34836 56869 24112 27796 198091 196472 50364 27364 29572 53701 52981 25779 202644 204884 53840 30056 48705 53092 43751 18419
20
207447 26867 41968 35977 36384 57042 40981 30191 47705 26907 248361 26003 53715 48237 27691 192772 60507 58194 20754 245913
20
4 2 2 4 4 2 4 2 4 2 2 4 4 3 3 4 2 3 4 1
20
2 3 2 2 3 2 3 4 3 2 4 4 4 3 2 3 4 4 3 3
20
4 4 4 4 3 3 3 2 2 4 2 4 2 2 4 2 2 2 2 1
20
4 4 2 3 4 3 4 3 2 4 4 2 2 4 3 4 3 3 2 4

struct my_struct {
    int a, b, c, d;
};

inline bool operator<(my_struct const& lhs, my_struct const& rhs) {
    return std::tie(lhs.a, lhs.b, lhs.c, lhs.d) <
        std::tie(rhs.a, rhs.b, rhs.c, rhs.d);
}

fun(std::array<int, 4ul> const&, std::array<int, 4ul> const&):
  mov ecx, DWORD PTR [rdi]
  cmp DWORD PTR [rsi], ecx
  lea rdx, [rsi+16]
  lea rax, [rdi+16]
  jg .L32
  jl .L31
.L25:
  add rdi, 4
  add rsi, 4
  cmp rax, rdi
  je .L36
  mov ecx, DWORD PTR [rsi]
  cmp DWORD PTR [rdi], ecx
  jl .L32
  jle .L25
.L31:
  xor eax, eax
  ret
.L32:
  mov eax, 1
  ret
.L36:
  cmp rdx, rsi
  setne al
  ret

fun(std::tuple<int, int, int, int> const&, std::tuple<int, int, int, int> const&):
  mov edx, DWORD PTR [rsi+12]
  cmp DWORD PTR [rdi+12], edx
  mov eax, 1
  jl .L11
  mov eax, 0
  jg .L11
  mov ecx, DWORD PTR [rsi+8]
  cmp DWORD PTR [rdi+8], ecx
  mov eax, 1
  jl .L11
  mov eax, 0
  jg .L11
  mov ecx, DWORD PTR [rsi+4]
  cmp DWORD PTR [rdi+4], ecx
  mov eax, 1
  jl .L11
  mov eax, 0
  jg .L11
  mov eax, DWORD PTR [rsi]
  cmp DWORD PTR [rdi], eax
  setl al
.L11:
  rep ret