This week I refactor powering algorithms and add sliding window exponentiation implementation. It gave speed boost in couple of percent.
The next thing I do is add special squaring functions for small p values: powers of 2 (for 4 and 8); 3; 5; 7 (for details see the "Implementation of a New Primality Test" supplement). Selection of function going on unity_zp_sqr_inplace() and we need to have allocated memory for fmpz_t array (t).
It gives good speed boost (~10-15%), and for "small" numbers (near 100 digits) we catch up the PARI/GP implementation and I'm confident that we will soon become faster. For bigger values PARI/GP is still faster. Below the graphs with timings up to 100 and 300 digits:
I also looked at $(n - 1)$ test to replace computation in $\mathbb{Z}[\zeta_r]/(n)$ by $\mathbb{Z}/n\mathbb{Z}$, but I don't see that it will give a good speed gain.
Now I want to try reduce s value for fixed R (as in (4.) of "Primality Testing and Jacobi Sums"). After we compute s we can try to remove some factors without which we still have $s^2 > n$.
In total, over the past six weeks has been implemented APR-CL algorithm with good performance. Almost all of code is tested and documented. So now I need to finish special cases of multiplication and squaring and then try to optimize in general. Also we need to look closely at different initial values of R because the values from PARI/GP impair the performance (look at 70 digit numbers in first plot, for example).
The next thing I do is add special squaring functions for small p values: powers of 2 (for 4 and 8); 3; 5; 7 (for details see the "Implementation of a New Primality Test" supplement). Selection of function going on unity_zp_sqr_inplace() and we need to have allocated memory for fmpz_t array (t).
It gives good speed boost (~10-15%), and for "small" numbers (near 100 digits) we catch up the PARI/GP implementation and I'm confident that we will soon become faster. For bigger values PARI/GP is still faster. Below the graphs with timings up to 100 and 300 digits:
50-100 digits |
50-300 digits |
Now I want to try reduce s value for fixed R (as in (4.) of "Primality Testing and Jacobi Sums"). After we compute s we can try to remove some factors without which we still have $s^2 > n$.
In total, over the past six weeks has been implemented APR-CL algorithm with good performance. Almost all of code is tested and documented. So now I need to finish special cases of multiplication and squaring and then try to optimize in general. Also we need to look closely at different initial values of R because the values from PARI/GP impair the performance (look at 70 digit numbers in first plot, for example).
No comments:
Post a Comment