This is the README file of gmp-ecm.

Table of contents of this file:
1. Files included in this distribution.
2. Main changes with respect to gmp-ecm 4c.
3. How to efficiently use P-1, P+1 and ECM?
4. Extra factors and Brent-Suyama's extension.
5. Memory usage.
6. Options -save and -resume.
7. How to get the best of gmp-ecm?
8. Command line parameters and options.
9. Known problems.
10. Record factors.

##############################################################################

1. Files included in this distribution.

COPYING - GNU GENERAL PUBLIC LICENSE
ChangeLog - changes with respect to previous version
INSTALL - instructions to install gmp-ecm
Makefile - to build the binary file 'ecm'
README - this file
auxi.c - auxiliary functions
bestd.c - routines to determine the best step 2 parameters
c155 - test file containing a 155-digit composite
ecm.c - factorization routine using the Elliptic Curve Method
ecm.h - header file for gmp-ecm
ecm2.c - ECM step 2 routines
getprime.c - dynamic Eratosthenes sieve
listz.c - arithmetic on lists of integers mod n
lucas.c - functions to evaluate Lucas sequences
main.c - main file for gmp-ecm, P-1 and P+1
memory.c - auxiliary routines to check memory allocation
mpmod.c - modular arithmetic
mul_lo.c - low-half short product
ntl.c - interface with NTL
pm1.c - Pollard P-1 factorization routine
polyeval.c - algorithm POLYEVAL
polyz.c - routines for polynomial arithmetic
pp1.c - P+1 factorization routines
resume.c - routines to resume a computation
stage2.c - common step 2 implementation for ecm, P-1 and P+1
test.ecm - test file for ECM
test.pm1 - test file for P-1
test.pp1 - test file for P+1
toomcook.c - Toom-Cook 3 and Toom-Cook 4 multiplication
tune.c - auxiliary routines to compute optimal thresholds

##############################################################################

2. Main changes with respect to gmp-ecm 4c.

- the code is split into several files to make the code easier to maintain
	and compile.
- the program now also implements P-1 and P+1, with a common step 2
	(use option -pm1 to call P-1, -pp1 for P+1).
- step 1 is slightly faster, thanks to a new modular multiplication layer.
- step 2 uses Toom-Cook 3 and Toom-Cook 4, instead of only Karatsuba
	in gmp-ecm 4c.
- an option -save enables one to save the state at the end of step 1,
  to continue it afterwards, or on a different computer (with -resume).
- it is now possible to specify a range B2min-B2max for step 2. This
  enables one to perform a huge step 2 (for a given step 1) using several 
  computers: the first one checks B1-B2, the second one B2-B3, ...

##############################################################################

3. How to efficiently use P-1, P+1 and ECM?

The P-1 method works well when the input number has a prime factor P such
that P-1 is "smooth", i.e. all its prime factors are less than or equal to
the step 1 bound B1, except one which may be less than or equal to the step 2
bound B2. For P=67872792749091946529, we have P-1 = 2^5 * 11 * 17 * 19 *
43 * 149 * 8467 * 11004397, so this factor will be found as long as B1 >= 8467
and B2 >= 11004397:

$ echo 67872792749091946529 | ./ecm -pm1 -x0 2809890345 8467 11004397
GMP-ECM 5.0.3 [powered by GMP 4.1.2] [P-1]
Input number is 67872792749091946529 (20 digits)
Using B1=8467, B2=11004397, polynomial Dickson(6), x0=2809890345
Step 1 took 0ms
Step 2 took 170ms
********** Factor found in step 2: 67872792749091946529
Found input number N

Unlike ECM, there is no need to run P-1 several times with the same B1 and
B2, since a factor found with one seed will also be found with another one.
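
The smoothness condition above can be checked by hand when P is known. The
following Python sketch is only an illustration and is not part of gmp-ecm;
the helper names prime_factors and is_smooth_enough are hypothetical. It
factors P-1 by trial division and applies the simplified rule from the text
(prime powers are not treated specially).

```python
def prime_factors(n):
    """Return the prime factors of n (with multiplicity), in increasing
    order, using plain trial division."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def is_smooth_enough(m, b1, b2):
    """Simplified rule from the text: every prime factor of m is <= b1,
    except possibly the largest one, which must be <= b2."""
    factors = prime_factors(m)
    if not factors:
        return True
    return all(q <= b1 for q in factors[:-1]) and factors[-1] <= b2


# Example from the text: P-1 with B1=8467 and B2=11004397.
P = 67872792749091946529
print(prime_factors(P - 1))
print(is_smooth_enough(P - 1, 8467, 11004397))
```

Trial division is only practical here because every factor but the last is
small; it is not how gmp-ecm itself works.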

The P+1 method works well when the input number has a prime factor P such
that P+1 is "smooth". For P=4190453151940208656715582382315221647, we have
P+1 = 2^4 * 283 * 2423 * 21881 * 39839 * 1414261 * 2337233 * 132554351, so
this factor will be found as long as B1 >= 2337233 and B2 >= 132554351:

$ echo 4190453151940208656715582382315221647 | ./ecm -pp1 -x0 2284918860 2337233 132554351
GMP-ECM 5.0.3 [powered by GMP 4.1.2] [P+1]
Input number is 4190453151940208656715582382315221647 (37 digits)
Using B1=2337233, B2=132554351, polynomial x^1, x0=2284918860
Step 1 took 1890ms
Step 2 took 1350ms
********** Factor found in step 2: 4190453151940208656715582382315221647
Found input number N

However, not all seeds will succeed: only half of the seeds 's' work for P+1
(namely those where the Jacobi symbol of s^2-4 and P is -1). Unfortunately,
since P is usually not known in advance, there is no way to ensure that this
holds. However, if the seed is chosen randomly, there is a probability of
about 1/2 that it will give a Jacobi symbol of -1 (i.e. the factor P will
be found if P+1 is smooth enough). A rule of thumb is to run P+1 3 times
with different random seeds.
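
When P is known (for instance when checking a factor afterwards), the Jacobi
symbol condition can be tested directly. The Python sketch below is an
illustration under that assumption, not code from gmp-ecm; the function names
jacobi and seed_works_for_pp1 are hypothetical. It uses the standard binary
algorithm based on quadratic reciprocity, and also evaluates the chance that
at least one of 3 independent random seeds works when each works with
probability 1/2.

```python
def jacobi(a, n):
    """Jacobi symbol (a/n) for a positive odd integer n."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    a %= n
    result = 1
    while a != 0:
        # Remove factors of 2 from a, using the value of (2/n).
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        # Quadratic reciprocity: swap a and n, flipping the sign
        # when both are congruent to 3 modulo 4.
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def seed_works_for_pp1(s, p):
    """True if the Jacobi symbol of s^2-4 and p is -1."""
    return jacobi(s * s - 4, p) == -1


# Probability that at least one of 3 random seeds works.
print(1 - 0.5 ** 3)  # prints 0.875
```

For the example above one may evaluate
seed_works_for_pp1(2284918860, 4190453151940208656715582382315221647).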

The ECM method is a probabilistic method, and can be viewed in some sense
as a generalization of the P-1 and P+1 methods, where we only require that
P+t is smooth, with t random of order P^(1/2). The optimal B1 and B2 bounds
have to be chosen according to the (usually unknown) size of P. The following
table gives a set of near-to-optimal B1 and B2 pairs, with the corresponding
expected number of curves to find a factor of given size (this table does not
take into account the "extra factors" found by Brent-Suyama's extension, see
below).

       digits D  optimal B1      B2        expected curves N(B1,B2,D)
          15        2e3         1.2e5             30
          20       11e3         1.4e6             90
          25        5e4         1.2e7            240
          30       25e4         1.1e8            500
          35        1e6         8.4e8           1100
          40        3e6         4.0e9           2900
          45       11e6        2.6e10           5500
          50       43e6        1.8e11           9000
          55       11e7        6.8e11          22000
          60       26e7        2.3e12          52000
          65       85e7        1.3e13          83000
          70       29e8        7.2e13         120000

          Table 1: optimal B1 and expected number of curves to find a
	  factor of D digits.

Important note: the expected number of curves is significantly smaller
than the "classical" one we get with B2=100*B1. This is due to the
fact that this new version of gmp-ecm uses a default B2 which is much
larger than 100*B1 (for large B1), thanks to the improvements in step 2.

In summary, we advise the following method:

0 - choose a target factor size of D digits
1 - choose "optimal" B1 and B2 values to find factors of D digits
2 - run P-1 once with those B1 and B2
3 - run P+1 3 times with those B1 and B2
4 - run N(B1,B2,D) times ECM with those B1 and B2, where N(B1,B2,D) is the
	expected number of ECM curves with step 1 bound B1, step 2 bound B2,
	to find a factor of D digits (see Table 1 above)
5 - if no factor is found, either increase D and go to 0, or use another
	factorization method (MPQS, GNFS)

Note: if a factor is found in steps 2, 3 or 4, simply continue the current
	step with the remaining cofactor (if composite). There is no need
	to start again from 0, since the factorization effort on the cofactor
	is not lost.
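
The method above can be turned into a list of command lines. The following
Python sketch is only an illustration: the name schedule and the way the
commands are assembled are assumptions, while the options -pm1, -pp1 and -c
and the values of Table 1 are those described in this file. It does not
handle replacing the input by the cofactor when a factor is found.

```python
# Table 1 from the text: digits D -> (B1, B2, expected number of ECM curves).
TABLE1 = {
    15: ("2e3", "1.2e5", 30),
    20: ("11e3", "1.4e6", 90),
    25: ("5e4", "1.2e7", 240),
    30: ("25e4", "1.1e8", 500),
    35: ("1e6", "8.4e8", 1100),
    40: ("3e6", "4.0e9", 2900),
    45: ("11e6", "2.6e10", 5500),
    50: ("43e6", "1.8e11", 9000),
    55: ("11e7", "6.8e11", 22000),
    60: ("26e7", "2.3e12", 52000),
    65: ("85e7", "1.3e13", 83000),
    70: ("29e8", "7.2e13", 120000),
}


def schedule(digits, input_file):
    """Return the command lines for steps 2, 3 and 4 of the method above."""
    b1, b2, curves = TABLE1[digits]
    return [
        f"ecm -pm1 {b1} {b2} < {input_file}",            # P-1, once
        f"ecm -pp1 -c 3 {b1} {b2} < {input_file}",       # P+1, 3 times
        f"ecm -c {curves} {b1} {b2} < {input_file}",     # ECM, N curves
    ]


for command in schedule(20, "c155"):
    print(command)
```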

##############################################################################

4. Extra factors and Brent-Suyama's extension.

GMP-ECM may sometimes find "extra" factors, for which one factor of P-1, P+1
or P+t exceeds the step 2 bound B2, thanks to Brent-Suyama's extension. Let us
explain how it works for P-1, since it is simpler. The classical step 2
(without Brent-Suyama's extension) considers s^(j*d) mod N and s^i mod N,
where N is the number to factor, and s the initial seed. Here, d is fixed,
and the integers i and j vary in two sets so that j*d+/-i covers all primes
in [B1, B2]. Now consider a polynomial f(x), and compute s^f(j*d) and s^f(i)
instead of s^(j*d) and s^i [thus the classical step 2 corresponds to
f(x)=x^1]. Then P will be found whenever all but one of the factors of P-1
are <= B1, and the remaining factor divides some f(j*d) +/- f(i):

$ echo 1207946164033269799036708918081 | ./ecm -pm1 -k 4 -power 12 286493 25e6
GMP-ECM 5.0.3 [powered by GMP 4.1.2] [P-1]
Input number is 1207946164033269799036708918081 (31 digits)
Using B1=286493, B2=25000000, polynomial x^12, x0=2997243583
Step 1 took 180ms
Step 2 took 400ms
********** Factor found in step 2: 1207946164033269799036708918081
Found input number N

Here the largest factor of P-1 is 83957197, which is 3.35 times larger than B2.
Warning: these "extra" factorizations may not be reproducible in future
versions of gmp-ecm, since they depend on some internal parameters that
may change.

The default polynomial used for a given B2 should be near optimal, 
i.e. give only a marginal overhead in step 2, while enabling extra factors.
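
To see why a polynomial such as x^12 catches more primes, note that for even
n the difference a^n - b^n is divisible by both a-b and a+b, and may have
further prime factors. The Python sketch below is a toy illustration with
small numbers; the function names are hypothetical, only the difference
f(a) - f(b) with f(x) = x^n is considered, and the values of a and b are
not the ones gmp-ecm uses internally.

```python
def classical_hit(q, a, b):
    """True if the prime q divides a - b or a + b (classical step 2)."""
    return (a - b) % q == 0 or (a + b) % q == 0


def brent_suyama_hit(q, a, b, n):
    """True if the prime q divides a^n - b^n, i.e. f(a) - f(b)
    with f(x) = x^n."""
    return (pow(a, n, q) - pow(b, n, q)) % q == 0


# 13 divides neither 30 - 7 = 23 nor 30 + 7 = 37, but it divides
# 30^12 - 7^12 since both powers are 1 modulo 13 (Fermat's little theorem).
q, a, b = 13, 30, 7
print(classical_hit(q, a, b), brent_suyama_hit(q, a, b, 12))  # False True
```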

##############################################################################

5. Memory usage.

Step 1 does not require much memory (about the same size as
the input number). Step 2 may use a lot of memory, especially for
large B2, since its efficient algorithms use some large tables. To reduce
the memory usage of step 2, you may increase the 'k' parameter, which controls
the number of "blocks" performed in step 2. Multiplying the default value of
k (which is 5) by 4 will decrease the memory usage by a factor of 2. For
example with B2=1e10 and a 155-digit number, step 2 requires about 28MB with
the default k, but only 14MB with k=32. Increasing k does, however, also
increase the time required for step 2 (see paragraph "7. How to get the best
of gmp-ecm?").

##############################################################################

6. Options -save and -resume.

These options are useful to save the current state of the computation after
step 1, or to exchange data with other software. They make it possible to
perform step 1 with gmp-ecm and step 2 with other software (or vice versa).

Here is an example of how to reuse a P-1 computation:

$ cat c71
13155161912808540373988986448257115022677318870175067553764004308210487
$ ./ecm -save toto -pm1 -mpzmod -x0 2 5000000 < c71
GMP-ECM 5.0.3 [powered by GMP 4.1.2] [P-1]
Input number is 13155161912808540373988986448257115022677318870175067553764004308210487 (71 digits)
Using B1=5000000, B2=272993793, polynomial x^24, x0=2
Step 1 took 3230ms
Step 2 took 3220ms

The file "toto" now contains some information about the method used, the
step 1 bound, the number to factor, the value X at the end of step 1 (in
hexadecimal), and a checksum to verify that no data was corrupted:

$ cat toto
METHOD=P-1; B1=5000000; N=13155161912808540373988986448257115022677318870175067553764004308210487; X=0x12530157ae22ae14d54d6a5bc404ae9458e54032c1bb2ab269837d1519f; CHECKSUM=2287710189; PROGRAM=GMP-ECM 5.0.3; X0=0x2; WHO=zimmerma@mermoz.loria.fr; TIME=Tue Nov  4 14:42:56 2003;

Then one can resume the computation with larger B1 and/or B2 as follows:

$ ./ecm -resume toto 1e7
GMP-ECM 5.0.3 [powered by GMP 4.1.2] [ECM]
Resuming P-1 residue saved by zimmerma@mermoz.loria.fr with GMP-ECM 5.0.3 on Tue Nov  4 14:42:56 2003 
Input number is 13155161912808540373988986448257115022677318870175067553764004308210487 (71 digits)
Using B1=5000000-10000000, B2=732940912, polynomial x^24
Step 1 took 3170ms
Step 2 took 6110ms
********** Factor found in step 2: 1448595612076564044790098185437
Found probable prime factor of 31 digits: 1448595612076564044790098185437
Probable prime cofactor 9081321110693270343633073697474256143651 has 40 digits

The second run only considered the primes in [5e6-10e6] in step 1,
which saved half the time of step 1.

The format used is the following:
  - each line corresponds to a composite
  - a line contains assignments <id>=<value> separated by semi-colons ';'
  - possible values for <id> are 
    - METHOD (value = ECM or P-1 or P+1)
    - SIGMA (value = ECM sigma parameter) [optional]
    - B1 (first step bound)
    - N (composite to factor)
    - X (value at the end of step 1)
    - A (A-parameter of the elliptic curve)
    - CHECKSUM (internal value to check correctness of the format)
    - PROGRAM (program used to perform step 1, useful for factor credits) 
    - X0 (initial point for ECM, or initial residue for P-1/P+1) [optional]
    - WHO (who performed step 1) 
    - TIME (date and time of first step)
  SIGMA and X0 are optional, and are mainly used when a factor is found,
  to be able to reproduce the factorization.
  For ECM, one of the SIGMA or A values must be present, so that the
  computation can be continued on the correct curve.
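
The format above is simple enough to be read by other programs. The
following Python sketch shows one possible reader; it is an illustration,
not the parser used by gmp-ecm, and the names parse_save_line and
check_fields are hypothetical. The checksum is not verified, since its
computation is not described here, and treating METHOD, B1, N and X as
required is an assumption.

```python
def parse_save_line(line):
    """Split one line of a save file into a dict of <id>=<value> pairs."""
    fields = {}
    for item in line.split(";"):
        item = item.strip()
        if not item:
            continue  # text after the final ';' is empty
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"not an assignment: {item!r}")
        fields[key.strip()] = value.strip()
    return fields


def check_fields(fields):
    """Check the constraints stated in the text (assumed minimal set)."""
    if fields.get("METHOD") not in ("ECM", "P-1", "P+1"):
        raise ValueError("METHOD must be ECM, P-1 or P+1")
    for key in ("B1", "N", "X"):
        if key not in fields:
            raise ValueError(f"missing field {key}")
    if fields["METHOD"] == "ECM" and not ("SIGMA" in fields or "A" in fields):
        raise ValueError("ECM needs one of SIGMA or A")


# The line of the file "toto" shown above.
line = (
    "METHOD=P-1; B1=5000000; "
    "N=13155161912808540373988986448257115022677318870175067553764004308210487; "
    "X=0x12530157ae22ae14d54d6a5bc404ae9458e54032c1bb2ab269837d1519f; "
    "CHECKSUM=2287710189; PROGRAM=GMP-ECM 5.0.3; X0=0x2; "
    "WHO=zimmerma@mermoz.loria.fr; TIME=Tue Nov  4 14:42:56 2003;"
)
fields = parse_save_line(line)
check_fields(fields)
print(fields["METHOD"], int(fields["B1"]), int(fields["X0"], 16))
```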

Note: it is allowed to have both -save f1 and -resume f2 for the same run,
however the files f1 and f2 should be different.

Remark: you should not perform in parallel several -resume runs on the same
input with the same B1/B2 values, since those runs will do the same
computations. Options -save/-resume are useful in the following cases:

(a) somebody did a previous step 1 computation with other software
    which is faster than gmp-ecm, and wants to perform step 2 with
    gmp-ecm which is faster for that task.
(b) somebody did a previous step 1 for P-1 or P+1 up to a given bound
    B1, and you want to extend that computation with B1' > B1, without
    restarting from scratch. Note: this does not apply to ECM, where 
    the smoothness property depends on the (random) curve chosen, not
    on the input number.
(c) you did a huge step 1 P-1 or P+1 computation on a given machine, and you
    want to perform a huge step 2 in parallel on several
    machines. For example machine 1 tests the range B2_1-B2_2, machine
    2 tests B2_2-B2_3, ... This also decreases the memory usage for
    each machine, which is a function of the range width B2min-B2max.
    For the same reason as (b), this does not apply to ECM. 
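
Case (c) can be prepared with a small script. The Python sketch below is an
illustration only: the names split_range and resume_commands are
hypothetical, the ranges are split into equal widths (which need not mean
equal running times), and the command lines merely combine the -resume
option with the B1 and B2min-B2max parameters described in this file.

```python
def split_range(b2min, b2max, machines):
    """Split [b2min, b2max] into contiguous sub-ranges, one per machine."""
    width = b2max - b2min
    bounds = [b2min + width * i // machines for i in range(machines + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def resume_commands(save_file, b1, b2min, b2max, machines):
    """One command line per machine, each covering its own step 2 range."""
    return [
        f"ecm -resume {save_file} {b1} {lo}-{hi}"
        for lo, hi in split_range(b2min, b2max, machines)
    ]


for command in resume_commands("toto", 5000000, 5000000, 1000000000, 4):
    print(command)
```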

##############################################################################

7. How to get the best of gmp-ecm?

Choice of modular multiplication. The ecm program may choose between 4 kinds
of modular arithmetic:

(1) Montgomery's REDC algorithm at the word level (option -modmuln).
    It is quite fast for small numbers, but has quadratic asymptotic
    complexity.
(2) classical GMP arithmetic (option -mpzmod).
    Has some overhead with respect to (1) for small sizes, but wins
    over (1) for larger sizes since it has quasi-linear asymptotic
    complexity.
(3) Montgomery's REDC algorithm at high level (option -redc).
    This essentially replaces each division by two multiplications.
    Slower than (1) and (2) for small inputs, but better for large
    or very large inputs.
(4) base-2 arithmetic for numbers dividing 2^n+1 or 2^n-1.
    Each division takes only linear time, but the multiplications are
    more expensive since they are done on larger numbers.

What a "small" or "large" number means depends on the configuration.
The "tune" program helps to determine the thresholds between different
methods. Simply type "make tune", then "./tune".

The ecm program automatically selects what it considers the best
arithmetic for the given input number. If that choice is not optimal, you may
force the use of a certain arithmetic by trying options -modmuln, -mpzmod,
-redc. (The best choice should depend on B1 and B2 only very little, so long
as B1 is not too small, say >= 1000.)
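
To know whether an input is a candidate for the base-2 arithmetic (4), one
can test whether it divides some 2^n+1 or 2^n-1. The Python sketch below is
a naive illustration, not the detection code of gmp-ecm; the name base2_form
and the search limit max_exp are assumptions.

```python
def base2_form(number, max_exp=2000):
    """Return (n, sign) for the smallest n <= max_exp such that `number`
    divides 2^n + sign, with sign equal to +1 or -1, or None if there is
    no such n. `number` is expected to be odd and greater than 1."""
    for n in range(1, max_exp + 1):
        r = pow(2, n, number)
        if r == 1:
            return (n, -1)   # number divides 2^n - 1
        if r == number - 1:
            return (n, +1)   # number divides 2^n + 1
    return None


print(base2_form(23))  # 23 divides 2^11 - 1 = 2047, prints (11, -1)
print(base2_form(11))  # 11 divides 2^5 + 1 = 33, prints (5, 1)
```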

Number of step 2 blocks. The step 2 range [B1,B2] is divided into k
"big blocks". The default value of k is chosen to be near optimal.
However, for a given (B1,B2) pair, another value of k may be better.
To test this, give the option -k <value> to ecm,
where <value> is 1, 2, 3, ... Below we give the experimental best
value for a 155-digit number, for some values of B1 and the
corresponding default value of B2.

      B1      optimal k
     1e6         4
     3e6         5
    11e6         5

Changing the value of the number of blocks will not modify the chance
of finding a factor (except for extra factors, but some will be lost,
and some will be gained, so the balance should be even). However, it will
change the time spent in Step 2 (which increases when k is smaller or
larger than the optimal value) and the memory used by Step 2 (see the
paragraph "Memory usage").

Optimal gmp thresholds. The default configuration of gmp-ecm requires
only the default gmp installation, in particular it only needs the
gmp.h header file from GMP. This default configuration uses some
default thresholds (in particular for subquadratic multiplication 
and division) that may not be optimal on a given machine. To get
the optimal thresholds on your machine, you need the header
files from the GMP build directory, and you must add -DWANT_GMP_IMPL to
the C flags in Makefile.

##############################################################################

8. Command line parameters and options.

The usage of the "ecm" command is the following:

$ ecm [options] B1 [[B2min-]B2] < file

Parameters:
- B1 is the step 1 bound. It is a mandatory parameter. It can be
  given either in integer format (for example 3000000) or in
  floating-point format (3000000.0 or 3e6). The largest possible
  B1 value is 9007199254740996 for P-1, 4294967295 for ECM and P+1.
  All primes 2 <= p <= B1 are processed in step 1.
- B2 is the step 2 bound. It is optional: if not given, it is computed
  from B1 so that step 2 takes about half the time of step 1. Since the
  relative cost of both steps differs with the different methods, the
  default B2 value also differs; for the same B1, the default B2 for
  P+1 will be larger than that for P-1, and that for ECM will be still
  larger. Like B1, it can be given either in integer or in floating-point 
  format. The largest possible value of B2 depends on the number of blocks
  in step 2 (see option -k); it is about 9846466279650*k.
  All primes B1 <= p <= B2 are processed in step 2. If B2 < B1, no step 2
  is performed.
- alternatively one may use the B2min-B2max form, which means that all
  primes B2min <= p <= B2max should be processed. Thus specifying only B2 
  corresponds to B1-B2.
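
The parameter syntax above can be modelled as follows. This Python sketch
is only a model of the documented syntax, not the argument parser of ecm;
the names parse_bound and parse_bounds are hypothetical, and bounds with a
negative exponent (which would contain a '-') are not supported.

```python
# Largest possible B1 for each method, as stated above.
B1_MAX = {"P-1": 9007199254740996, "ECM": 4294967295, "P+1": 4294967295}


def parse_bound(text):
    """Parse a bound in integer ('3000000') or floating-point
    ('3000000.0' or '3e6') format."""
    try:
        return int(text)
    except ValueError:
        return int(float(text))


def parse_bounds(b1_text, b2_text=None, method="ECM"):
    """Return (B1, B2min, B2max); B2min and B2max are None when B2 is
    omitted, since the default B2 is computed by the program."""
    b1 = parse_bound(b1_text)
    if b1 > B1_MAX[method]:
        raise ValueError(f"B1 too large for {method}")
    if b2_text is None:
        return b1, None, None
    if "-" in b2_text:
        low, high = b2_text.split("-", 1)
        return b1, parse_bound(low), parse_bound(high)
    # Specifying only B2 corresponds to the range B1-B2.
    return b1, b1, parse_bound(b2_text)


print(parse_bounds("286493", "25e6", method="P-1"))
print(parse_bounds("1e6", "2e6-3e6"))
```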

Options: options must appear before the parameters, and can be in any order.

Options to control the factorization method [ECM is the default]:
-pm1  Perform P-1 instead of ECM.
-pp1  Perform P+1 instead of ECM.

Options to control the group and initial point used:
-x0 x  [P-1,P+1,ECM]. Use x as initial point. x can be an arbitrary-precision
   integer or rational. For example, -x0 1/3 is valid. If not given, x0 is
   either generated from the sigma value for ECM, or at random.
-sigma s  [ECM]. Use s (arbitrary-precision integer) as curve generator.
   If not given, sigma is generated at random (32-bit integer). Any
   value has the same probability to hit a factor, the only important 
   thing is to ensure different values for different runs on the same number.
-A a  [ECM]. Use 'a' (arbitrary-precision integer) as curve parameter.
   If not given, it is generated from sigma. Like sigma, the value of A itself
   is not important; what is important is to ensure different values
   for different runs.

Options to control step 2 parameters:
-k n  [P-1,P+1,ECM]. Perform n blocks in step 2 (default is 5). See
   paragraph "Memory usage" above.
-power n  Use x^n for Brent-Suyama's extension (see paragraph "4. Extra
   factors and Brent-Suyama's extension"). Default is chosen depending on
   method and B2. For P-1, n should be even. Does not work with P+1.
-dickson n  Use degree-n Dickson's polynomial for Brent-Suyama's
   extension, instead of x^n. As for x^n, 'n' should be even for P-1. 
   Dickson polynomials give a better chance of finding factors, but for
   P-1 and n>6, x^n is faster. Does not work with P+1.

Options to control output:
-q  Quiet mode. Found factors are printed on standard error, and
   cofactors (if not probable prime) are printed on standard output.
   This option is useful to deal with a file of cofactors: type
   ecm -q B1 < file > file2, so that non-factored numbers and composite
   cofactors are written in file2, while factors found are printed on
   standard error. Then one can do ecm -q B1 < file2 > file3, ...
-v  Verbose mode. Additional information is printed (modular
   multiplication used, initial A and x0 values, value of x at the end
   of step 1, parameters for step 2, and timing for sub-steps from step 2).

Options to control modular arithmetic (see "How to get the best of gmp-ecm?"):
-mpzmod   Use GMP's mpz_mod for modular reduction.
-modmuln  Use Montgomery's MODMULN for modular reduction.
-redc     Use Montgomery's REDC for modular reduction.
-nobase2  Disable special base-2 code. Base-2 division is used
          when the input number is a factor of 2^n+1 or 2^n-1,
          and the ratio of the number of bits is not too small.

Options for saving and resuming to/from files (see paragraph "6. Options
   -save and -resume"):
-save file    Save residues at end of step 1 to file.
-resume file  Resume residues from file, reads from stdin if file is "-".

Miscellaneous options:
-primetest  Perform a primality test on the input number, and print a
   message if it is probably prime. This is not done by default since
   the primality test may be quite expensive (especially for large
   input).
-c n  Perform n runs on each input number (default is one). The runs
   stop before reaching n if a factor is found.
   This option is mainly useful for P+1 (for example with -c 3) or for
   ECM, where n is the expected number of curves corresponding to a
   given B1 in Table 1 (see "How to efficiently use P-1, P+1 and
   ECM?"). This option is incompatible with -resume, -sigma, -x0.
   Giving -c 0 will emulate an infinite loop until a factor is found.

##############################################################################
	
9. Known problems.

(a) Under some operating systems (in particular Windows), ecm may use the
same random seed for several curves started during the same second.
This is due to the fact that the getpid() function returns a constant
value under those systems. This will cause an efficiency loss, since
the same computation will be done twice. A workaround is to wait at
least one second between different curves on the same number, or to
specify by hand a different seed using the -sigma or -x0 options.
In all cases, when running several curves on the same number, it is
safer to check that the sigma/x0 values are different for each run.
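
One way to apply this workaround is to draw the sigma values beforehand and
pass them explicitly. The Python sketch below is an illustration; the name
distinct_sigma_commands is hypothetical, and the lower limit 6 of the range
is an arbitrary assumption made here to avoid tiny values, not a documented
constraint of gmp-ecm.

```python
import random


def distinct_sigma_commands(n_curves, b1, input_file, rng=None):
    """Build n_curves ECM command lines with pairwise distinct 32-bit
    sigma values, using the documented -sigma option."""
    rng = rng or random.Random()
    # sample() draws without replacement, so all values are different.
    sigmas = rng.sample(range(6, 2**32), n_curves)
    return [f"ecm -sigma {s} {b1} < {input_file}" for s in sigmas]


for command in distinct_sigma_commands(3, "1e6", "c155"):
    print(command)
```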

(b) Under some operating systems (for example FreeBSD), you may have
to comment out the #include <alloca.h> line in ecm-gmp.h. If that still
does not work, then you need the GMP build file gmp-impl.h, and must add
-DWANT_GMP_IMPL to CFLAGS.
	
##############################################################################

10. Record factors.

If you find a very large factor, the program will print a message like:

Report your potential champion to <email address>
(see <url for record factors>)

This means that your factor might be a champion, i.e. one of the top-ten
largest factors ever found by the corresponding method (P-1, P+1 or ECM).
See the following URLs:

ECM: ftp://ftp.comlab.ox.ac.uk/pub/Documents/techpapers/Richard.Brent/champs.txt
P-1: http://www.loria.fr/~zimmerma/records/Pminus1.html
P+1: http://www.loria.fr/~zimmerma/records/Pplus1.html







