Suppose you want to calculate the sum of the squares of the differences between consecutive items:
$\sum_{i=1}^{N-1} (x_i - x_{i+1})^2$
The simplest code (input is std::vector<double> xs, output is sum2):
    double sum2 = 0.;
    double prev = xs[0];
    for (vector<double>::const_iterator i = xs.begin() + 1; i != xs.end(); ++i) {
        sum2 += (prev - (*i)) * (prev - (*i)); // only 1 '-' with compiler optimization
        prev = (*i);
    }
I hope the compiler performs the optimization mentioned in the comment above (evaluating the subtraction only once). If N is the length of xs, then you have N-1 multiplications and 2N-3 sums (where a sum means + or -).
Now suppose you already know this quantity:
$x_1^2 + x_N^2 + 2 \sum_{i=2}^{N-1} x_i^2$
and it is called sum. Expanding the binomial square gives:
$\sum_{i=1}^{N-1} (x_i - x_{i+1})^2 = sum - 2 \sum_{i=1}^{N-1} x_i x_{i+1}$
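Spelled out term by term, the expansion behind this identity looks as follows. Every interior element appears once as x_i and once as x_{i+1}, which is where the factor 2 comes from, while the two end elements appear only once each:

```latex
\begin{align*}
\sum_{i=1}^{N-1} (x_i - x_{i+1})^2
  &= \sum_{i=1}^{N-1} x_i^2 + \sum_{i=1}^{N-1} x_{i+1}^2
     - 2 \sum_{i=1}^{N-1} x_i x_{i+1} \\
  &= \Bigl( x_1^2 + \sum_{i=2}^{N-1} x_i^2 \Bigr)
     + \Bigl( \sum_{i=2}^{N-1} x_i^2 + x_N^2 \Bigr)
     - 2 \sum_{i=1}^{N-1} x_i x_{i+1} \\
  &= \underbrace{x_1^2 + x_N^2 + 2 \sum_{i=2}^{N-1} x_i^2}_{\text{sum}}
     - 2 \sum_{i=1}^{N-1} x_i x_{i+1}
\end{align*}
```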
Then the code becomes:
    double sum2 = 0.;
    double prev = xs[0];
    for (vector<double>::const_iterator i = xs.begin() + 1; i != xs.end(); ++i) {
        sum2 += (*i) * prev;
        prev = (*i);
    }
    sum2 = -sum2 * 2. + sum;
Here I have N multiplications and N-1 additions. In my case N is about 100.
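As a sanity check that the two formulations agree, here is a minimal self-contained sketch. The function names sum_sq_direct, precomputed_sum and sum_sq_expanded are illustrative and not part of the original code, and the precomputed quantity is recomputed here only so that the example stands on its own.

```cpp
#include <cstddef>
#include <vector>

// Direct form: sum over i of (x_i - x_{i+1})^2.
double sum_sq_direct(const std::vector<double>& xs)
{
    double sum2 = 0.0;
    for (std::size_t i = 0; i + 1 < xs.size(); ++i) {
        const double d = xs[i] - xs[i + 1];
        sum2 += d * d;
    }
    return sum2;
}

// The quantity called "sum" in the text:
// x_1^2 + x_N^2 + 2 * (sum of squares of the interior elements).
// Assumes xs has at least two elements; returns 0 otherwise.
double precomputed_sum(const std::vector<double>& xs)
{
    const std::size_t n = xs.size();
    if (n < 2) return 0.0;
    double s = xs[0] * xs[0] + xs[n - 1] * xs[n - 1];
    for (std::size_t i = 1; i + 1 < n; ++i) {
        s += 2.0 * xs[i] * xs[i];
    }
    return s;
}

// Expanded form: sum - 2 * (sum over i of x_i * x_{i+1}).
double sum_sq_expanded(const std::vector<double>& xs, double sum)
{
    double cross = 0.0;
    for (std::size_t i = 0; i + 1 < xs.size(); ++i) {
        cross += xs[i] * xs[i + 1];
    }
    return sum - 2.0 * cross;
}
```

For xs = {1, 2, 4, 7} the direct form gives 1 + 4 + 9 = 14, and the expanded form gives 90 - 2 * 38 = 14. Note that the expanded form subtracts two potentially large numbers, so it can lose precision when neighbouring elements are nearly equal.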
OK, compiling with g++ -O2 I did not get any speedup (I tried calling the inlined function 2M times). Why?
In terms of execution time, multiplications are more expensive than additions. In addition, depending on the processor, additions and multiplications may be executed in parallel, i.e. the processor will start the next multiplication while it is still doing an addition.
Therefore, reducing the number of additions will not help performance much.
What you can do is make it easier for the compiler to vectorize your code, or vectorize it yourself. To make it easier for the compiler to vectorize, I would use a regular array of doubles and use subscripts rather than pointers.
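A loop in that array-and-subscript style might look like the sketch below. The function name sum_sq_diff and its signature are illustrative assumptions, not the answerer's code.

```cpp
#include <cstddef>

// Sum of squared differences of consecutive elements, written with plain
// subscripts on a raw array so the loop has a simple, countable shape.
double sum_sq_diff(const double* x, std::size_t n)
{
    double sum2 = 0.0;
    for (std::size_t i = 0; i + 1 < n; ++i) {
        const double d = x[i] - x[i + 1]; // one subtraction per element pair
        sum2 += d * d;
    }
    return sum2;
}
```

Whether the compiler actually vectorizes this depends on the compiler version and flags. Because the loop is a floating-point reduction, GCC generally needs permission to reorder the additions (for example via -ffast-math) before it will vectorize it, and reordering can change the result slightly.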
Edit: N = 100 may also be too small a number to see the difference in execution time. Try a bigger N.
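To try different values of N, a small portable timing harness along these lines could be used. It is only a sketch: it relies on std::chrono::steady_clock rather than a cycle counter, and the names sum_sq_diff_vec and ns_per_call are hypothetical.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Direct sum of squared consecutive differences over a vector.
double sum_sq_diff_vec(const std::vector<double>& xs)
{
    double sum2 = 0.0;
    for (std::size_t i = 0; i + 1 < xs.size(); ++i) {
        const double d = xs[i] - xs[i + 1];
        sum2 += d * d;
    }
    return sum2;
}

// Average wall-clock nanoseconds per call on a vector of length n (n >= 1).
// Results are accumulated into `sink` so the calls cannot be optimized away.
double ns_per_call(std::size_t n, int repeats, double& sink)
{
    std::vector<double> xs(n);
    for (std::size_t i = 0; i < n; ++i) {
        xs[i] = 0.5 * static_cast<double>(i % 7);
    }

    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < repeats; ++r) {
        xs[0] = static_cast<double>(r % 3); // vary the input between calls
        sink += sum_sq_diff_vec(xs);
    }
    const auto stop = std::chrono::steady_clock::now();

    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / repeats;
}
```

Calling ns_per_call with, say, n = 100 and then n = 100000 and comparing the time per element gives a rough idea of whether per-call overhead dominates at small N.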
The code is dirty, but it shows the performance improvement. Output:
1e+06 59031558
1e+06 18710703
The speedup you get is about 3x.
    #include <vector>
    #include <iostream>

    using namespace std;

    // Read the CPU time-stamp counter (x86, GCC inline assembly).
    unsigned long long int rdtsc(void)
    {
        unsigned long long int x;
        unsigned a, d;
        __asm__ volatile("rdtsc" : "=a" (a), "=d" (d));
        return ((unsigned long long)a) | (((unsigned long long)d) << 32);
    }

    // Iterator-based version from the question.
    double f(std::vector<double>& xs)
    {
        double sum2 = 0.;
        double prev = xs[0];
        vector<double>::const_iterator iend = xs.end();
        for (vector<double>::const_iterator i = xs.begin() + 1; i != iend; ++i) {
            sum2 += (prev - (*i)) * (prev - (*i)); // only 1 '-' with compiler optimization
            prev = (*i);
        }
        return sum2;
    }

    double f2(double *x, int n)
    {
        double sum2 = 0;
        for (int i = 0; i