The image shows call graph output of valgrind --callgrind tool for 100 digits prime:
So, we can see that to improve the speed of algorithm we must reduce the number of multiplications (which are well optimized in flint).
Currently I have a very naive implementation of selecting s and R values (just trying increase R = 2, 4, 6... until the proper s is found). It can takes a lot of time and choose non-optimal parameters. For example, if we set R = 98280 it proves 200 digits number primality for ~8 seconds instead of 30. By R = 166320 we can prove 300 digits number in ~30 seconds. So I need to rewrite aprcl_config_init() function.
The next thing to do is replace the condition s^2 > n to s^3 > n. Using "Divisors in Residue Classes" we can determine n divisors. It was implemented in flint before, so I need only add this step in Final division step.
Also I want to replace my basic powering algorithm in \mathbb{Z}[\zeta] / (n) by 2^k-ary method.
And finally we can precompute and store Jacobi sum in configuration structure (for 300 digits number computation of Jacobi sums takes ~40% of time) for testing numbers with a similar length.
This week I want to add this modifications and look in more detail the other opensource implementations.
No comments:
Post a Comment