Abstract – An entire security industry is based on the belief that factoring large numbers is difficult. The unbreakability of RSA, one of the most popular public-key cryptosystems, relies on the notion that it is computationally difficult to factor a large number into its prime factors. This relation between factoring and cryptography is one of the main reasons why people are interested in evaluating the practical difficulty of the integer factorization problem. The Number Field Sieve (NFS) is the fastest known algorithm for factoring numbers larger than 110 digits, and while the method still has many unexplored features that require further research, its development in the past few years has made it possible to factor integers whose factorization was considered infeasible with today's technology. Currently the limits of its factoring capabilities are around 633 decimal digits. In this project, I have developed in software one of the most time-consuming stages of NFS, the Matrix step. The project comprises the implementation and testing of the Wiedemann algorithm and the Block Wiedemann algorithm in Magma and in C++ using LiDIA.

... and computer science. Algebraic number theory, finite fields, linear algebra, and real and complex analysis all play vital roles in NFS. The four main steps of the NFS algorithm are Polynomial Selection, Sieving, the Matrix step and the Square Root step. The Sieving and Matrix steps are the two most complicated, time-consuming and expensive steps.

The objective of this project is to implement the matrix (linear algebra) step and to perform functionality testing and timing comparisons of the implementations on different platforms. The first step involves prototyping the Wiedemann and Block Wiedemann algorithms in Magma. The second step involves developing code for the Wiedemann and Block Wiedemann algorithms in C++ using LiDIA, a library for computational number theory which provides a collection of optimized implementations of time-intensive algorithms.
... order to increase the speed. In the Block Wiedemann algorithm, multiple vectors are used for the matrix multiplications. The number of vectors is usually 32 or 64, in order to take advantage of the 32-bit or 64-bit architectures of modern computers. Once again, we try to solve the equation $B\omega = 0$; however, the preconditioning of the matrix B is now performed by multiplying it with block vectors. Since each entry of a block vector is made up of several scalars, it contains more information, and this enables us to compute the linearly recurrent sequence in fewer computations. Thus, the Wiedemann algorithm performs much faster when executed in a distributed or parallel setting.

A. Solving Sparse Linear Systems

Given a matrix B in $\mathrm{Mat}(N \times N, \mathbb{F}_2)$ (or over any other field, for that matter), there are polynomials p such that p(B) = 0, where the polynomial p is evaluated using the ring operations of $\mathrm{Mat}(N \times N, \mathbb{F}_2)$. The polynomial of smallest degree with this property is the minimal polynomial. Suppose that m(x) is the minimal polynomial. If $\lambda$ is an eigenvalue of B, then $m(\lambda) = 0$. Thus, if B is singular (i.e., Bv = 0 has nontrivial solutions), then such a v is an eigenvector with eigenvalue 0, and thus m(0) = 0. This means that the constant term of m(x) vanishes. Now we have

$m(x) = c_n x^n + c_{n-1} x^{n-1} + c_{n-2} x^{n-2} + \dots + c_1 x$

with no constant term, i.e. $c_0 = 0$. Thus,

$m(x) = x \, m'(x)$

with $m'(x)$ of degree one less than the minimal polynomial m(x). But we also know that m(B) = 0, and hence $B \, m'(B) = 0$. Therefore, for any vector v we have

$0 = B \, m'(B) \, v = B\omega$,

with $\omega = m'(B) v$. Thus $\omega$ is a solution to $B\omega = 0$, and if we are lucky and $\omega$ is not 0, we are done.

B. Wiedemann Algorithm

The input to the Wiedemann program is a square matrix $B \in \mathbb{F}_2^{N \times N}$, where N is the dimension of the matrix. The output is a vector $\omega$ of length N satisfying the equation

$B\omega = 0, \quad \omega \in \mathbb{F}_2^N$

The matrix B is run through a function that operates in a while loop and exits only when the vector $\omega$ is found.

The first step in an iteration of the while loop is the generation of two random binary vectors $u, v \in \mathbb{F}_2^N$. Then we find the linearly recurrent sequence $u^T B^i v$ for all $0 \le i \le 2N-1$; each element of the sequence is a scalar in $\mathbb{F}_2$.

The next step involves invoking the Berlekamp-Massey algorithm to find the minimal polynomial. The minimal polynomial is a polynomial m of smallest degree such that m(B) = 0, where the latter is evaluated in the matrix ring $\mathrm{Mat}(N \times N, \mathbb{F}_2)$. It is stored as a vector of coefficients, each a scalar in $\mathbb{F}_2$; since its degree is at most N, the vector has at most N+1 entries. The next step involves a single left shift of this coefficient vector, which corresponds to dividing m(x) by x to obtain $m'(x)$.

Finally, we invoke the polynomial evaluation function, which evaluates the shifted polynomial and generates a vector $\omega$ of length N with each element a scalar in $\mathbb{F}_2$. We then check whether $\omega$ satisfies the equation $B\omega = 0$. If the vector $B\omega$ is zero while the vector $\omega$ is non-zero, the program returns $\omega$ and exits.

C. Block Wiedemann Algorithm

The Block Wiedemann algorithm consists of the parallelization of the outer iterative loop in the Wiedemann algorithm. It seems natural to consider blocks of vectors instead of single vectors in order to take advantage of the 32-bit or 64-bit architecture of modern computers. Arithmetic is then performed on small square matrices instead of on individual field entries. Since the components of the blocks can be processed in parallel, the arithmetic necessary in an iteration can be carried out on a parallel system as fast as the unblocked iteration on a sequential computer.

The implementation of the Block Wiedemann algorithm is largely similar to the Wiedemann implementation. The difference is that the two random vectors generated have elements that are integers in the range $[0, 2^m - 1]$. The rest of the program then performs operations on vectors of vectors.

The input to the Block Wiedemann program is a square matrix $B \in \mathbb{F}_2^{N \times N}$, where N is the dimension of the square matrix. The output is a vector $\omega$ satisfying the equation

$B\omega = 0, \quad \omega \in \mathbb{F}_2^N$

The matrix B is run through a function that operates in a while loop and exits only when the block vector $\omega$ is found.

The first step in an iteration of the while loop is the generation of two random block vectors x and z, where $x^T \in \mathbb{F}_2^{m \times N}$ and $z \in \mathbb{F}_2^{N \times n}$. Then we find the linearly recurrent sequence

$a(i) = x^T B^i y$, for $0 \le i \le L$, where $y = Bz$,

i.e. the block sequence of length $L = N/m + N/n + \epsilon$, $\epsilon = 2n/m$, through a series of matrix multiplications. Each element of the sequence is a matrix of size [m x n].
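Returning to the scalar procedure of Section B, one iteration of the while loop can be sketched as follows. This is a minimal illustration in standard C++, not the LiDIA code of this project: the names Vec, Mat, mulMatVec, dot, berlekampMassey and wiedemannAttempt are hypothetical, and the dense 0/1 storage is an assumption chosen for readability rather than speed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<int>;  // entries are 0 or 1 (elements of F_2)
using Mat = std::vector<Vec>;  // dense N x N matrix, stored by rows

// Matrix-vector product B*v over F_2.
Vec mulMatVec(const Mat& B, const Vec& v) {
    Vec r(B.size(), 0);
    for (std::size_t i = 0; i < B.size(); ++i)
        for (std::size_t j = 0; j < v.size(); ++j)
            r[i] ^= B[i][j] & v[j];
    return r;
}

// Inner product u^T v over F_2.
int dot(const Vec& u, const Vec& v) {
    int d = 0;
    for (std::size_t i = 0; i < u.size(); ++i) d ^= u[i] & v[i];
    return d;
}

// Berlekamp-Massey over F_2. Returns the connection polynomial C of
// length L+1 with C[0] = 1, so that for every k >= L:
//   s[k] + C[1] s[k-1] + ... + C[L] s[k-L] = 0.
Vec berlekampMassey(const Vec& s) {
    const std::size_t n = s.size();
    Vec C(n + 1, 0), P(n + 1, 0);
    C[0] = P[0] = 1;
    std::size_t L = 0, m = 1;
    for (std::size_t k = 0; k < n; ++k) {
        int d = s[k];  // discrepancy of the current recurrence at position k
        for (std::size_t i = 1; i <= L; ++i) d ^= C[i] & s[k - i];
        if (d == 0) { ++m; continue; }
        Vec T = C;
        for (std::size_t i = 0; i + m <= n; ++i) C[i + m] ^= P[i];
        if (2 * L <= k) { L = k + 1 - L; P = T; m = 1; }
        else { ++m; }
    }
    C.resize(L + 1);
    return C;
}

// One Wiedemann attempt with given u and v. Returns true and fills omega
// if a non-zero omega with B*omega = 0 was found, false otherwise.
bool wiedemannAttempt(const Mat& B, const Vec& u, const Vec& v, Vec& omega) {
    const std::size_t N = B.size();
    std::vector<Vec> krylov;  // krylov[i] = B^i v
    Vec s;                    // s[i] = u^T B^i v, 0 <= i <= 2N-1
    Vec w = v;
    for (std::size_t i = 0; i < 2 * N; ++i) {
        krylov.push_back(w);
        s.push_back(dot(u, w));
        w = mulMatVec(B, w);
    }
    const Vec C = berlekampMassey(s);
    const std::size_t L = C.size() - 1;
    // m(x) = sum_i C[i] x^(L-i); its constant term is C[L].
    if (C[L] != 0) return false;  // m(0) != 0: retry with new u, v
    // m'(x) = m(x)/x has coefficient C[L-1-k] at x^k; omega = m'(B) v.
    omega.assign(N, 0);
    for (std::size_t k = 0; k < L; ++k)
        if (C[L - 1 - k])
            for (std::size_t j = 0; j < N; ++j) omega[j] ^= krylov[k][j];
    const Vec zero(N, 0);
    return omega != zero && mulMatVec(B, omega) == zero;
}
```

For the 3 x 3 matrix with rows (0,0,0), (0,0,1), (0,1,1) and u = v = (1,1,0), the sequence is 0, 0, 1, 1, 0, 1, the recovered polynomial is $x^3 + x^2 + x$, and the attempt returns $\omega = (1,0,0)$. A driver would call wiedemannAttempt with fresh random u and v until it returns true. Storing all Krylov vectors is a simplification; a memory-conscious version would recompute them during the evaluation step.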
Factoring of Large Numbers using Number Field Sieve – The Matrix Step 3
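The block sequence $a(i) = x^T B^i y$ with $y = Bz$ can be sketched with bit-packed block vectors, which is what makes block widths of 32 or 64 attractive. The sketch below assumes m = n = 64 and stores B as lists of column indices; the names Word, BlockVec, SparseMat, Block, mulSparse, mulTransposed and blockSequence are hypothetical and are not taken from the Magma or LiDIA implementations.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Word = std::uint64_t;           // 64 packed entries of F_2
using BlockVec = std::vector<Word>;   // N x 64 block vector, one word per row
using SparseMat = std::vector<std::vector<std::size_t>>;  // row i lists the columns holding a 1
using Block = std::array<Word, 64>;   // 64 x 64 matrix over F_2, one word per row

// r = B * z over F_2: each row of r is the XOR of the rows of z selected
// by the 1s in the corresponding row of B (64 columns handled per XOR).
BlockVec mulSparse(const SparseMat& B, const BlockVec& z) {
    BlockVec r(B.size(), 0);
    for (std::size_t i = 0; i < B.size(); ++i)
        for (std::size_t col : B[i]) r[i] ^= z[col];
    return r;
}

// a = x^T * w over F_2. Bit p of x[r] is the entry x(r, p), so row p of
// the result accumulates every w[r] for which that bit is set.
Block mulTransposed(const BlockVec& x, const BlockVec& w) {
    Block a{};  // all-zero 64 x 64 block
    for (std::size_t r = 0; r < x.size(); ++r)
        for (std::size_t p = 0; p < 64; ++p)
            if ((x[r] >> p) & 1u) a[p] ^= w[r];
    return a;
}

// Returns a(0), ..., a(count-1) with a(i) = x^T B^i y and y = B z.
std::vector<Block> blockSequence(const SparseMat& B, const BlockVec& x,
                                 const BlockVec& z, std::size_t count) {
    std::vector<Block> a;
    BlockVec w = mulSparse(B, z);  // y = B z
    for (std::size_t i = 0; i < count; ++i) {
        a.push_back(mulTransposed(x, w));
        w = mulSparse(B, w);       // advance to B^(i+1) y
    }
    return a;
}
```

Each XOR in mulSparse advances all 64 columns of the block vector at once, which is the source of the speed-up over running 64 scalar sequences one after another.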
We then define $A(\lambda)$, the polynomial with matrix coefficients of size [m x n]:

$A(\lambda) = \sum_i a(i) \lambda^i$

The third step is the most complex stage of the Block Wiedemann algorithm. In this step we aim at finding the linear generators of the matrix polynomial $A(\lambda)$. The linear generators are n-dimensional column vectors of polynomials. The steps necessary to compute the linear generators of $A(\lambda)$ are described in detail in the next section. It is found that the linear generators are column vectors of size [n x 1].

III. LINEAR GENERATORS FOR THE MATRIX SEQUENCE

The use of the Berlekamp-Massey algorithm and the minPoly function in the Wiedemann implementation is extremely straightforward, as these operations are performed on scalar elements.

In the Block Wiedemann case, however, this is the most complex and time-consuming part. Thome in his paper has described the original method proposed by Coppersmith to find the linear generators for the matrix sequence. This section borrows from his paper to give an overview of the algorithm and its complexity.

A. Framework

In the block Wiedemann algorithm, the sequence a(i) is ...

... matrix $g(\lambda)$ of size [m x (m+n)]. $f(\lambda)$ and $g(\lambda)$ satisfy the equation

$A(\lambda) f(\lambda) = g(\lambda) \bmod \lambda^L$

$\delta_j$ denotes the maximal degree of the column j. It acts as an upper bound on the degree of the coefficients in the jth column of the polynomial $f(\lambda)$. In each step of the algorithm, we aim at satisfying the equation

$A(\lambda) f(\lambda) = g(\lambda) + \lambda^t e(\lambda)$

where $e(\lambda)$ is a matrix of size [m x (m+n)] and denotes the error that has to be eliminated.

Each column of $f(\lambda)$, $g(\lambda)$ and $e(\lambda)$ will satisfy the equation below. Let us call it condition C1.

$A(\lambda) f_j(\lambda) = g_j(\lambda) + \lambda^t e_j(\lambda)$,
$\deg f_j \le \delta_j, \quad \deg g_j \le \delta_j, \quad \deg f_j \le L + \delta_j - t$

Another condition that must be satisfied is C2:

$\mathrm{Rank}([\lambda^0]e) = m$

$[\lambda^0]e$ denotes the coefficient of $\lambda^0$ in the polynomial e, also known as the error or the constant term.

B. Initialization

Set $t_0 = m/n$. Initially all maximal degrees $\delta_j$ for $1 \le j \le (m+n)$ are set to $t_0 - 1$. The first m columns of f are randomly filled up to degree $t_0 - 1$ and satisfy the condition C1. The remaining [n x n] sub-matrix of f is set to $\lambda^{t_0} I_n$. Gaussian elimination is used in order to reduce
e and P matrices. This results in the elimination of some of the coefficients of $[\lambda^0]e$, thereby reducing the error. This is then repeated for the next row, i+1, and so on.

Cancellation of remaining columns: At the end of this process, if any of the columns still contain non-zero bits, then that column of matrix P is multiplied by $\lambda$, which is equivalent to shifting the non-zero bit to a position in the next row with the same column index.

This completes round t. At the end of round t we have the following values:

$f^{(t+1)} = f^{(t)} P^{(t)}$
$g^{(t+1)} = g^{(t)} P^{(t)}$
$e^{(t+1)} = e^{(t)} P^{(t)} (1/\lambda)$

E. Termination

If a particular column j of $[\lambda^0]e$ is 0 at the beginning of round t+1, then $f_j(\lambda)$ is carried over to the next round t+1 without any change. If this repeats on several successive rounds for the column j, then $f_j(\lambda)$ is considered likely to be a linear generator for $A(\lambda)$.

The linear generator computed in the previous step yields a solution to the equation $B\omega = 0$, $\omega \in \mathbb{F}_2^N$. The solution is represented as

$\omega = \sum_{k=0}^{\delta_j} B^{\delta_j - k} z \, ([\lambda^k] f_j)$

where $f_j$ is a linear generator for the matrix sequence a(k). Here the subscript j is used to denote a single column in the matrix f and $\delta_j$ denotes the maximal degree of this column.

Coppersmith's generalized Berlekamp-Massey algorithm is used, which is not provided in the Magma package.

The Berlekamp-Massey function returns a minimal connection polynomial, and also the length L of the polynomial, which gives the number of elements necessary to regenerate the sequence.

Some of the operations used in this implementation:

M := MatrixAlgebra(GF(2), n);
// creates the algebra M of [n x n] matrices with elements in F_2.

v := [Random(GF(2)) : i in [1..n]];
// creates a randomly populated sequence v of length n with elements in F_2.

a := Matrix(GF(2), m, n, [0 : i in [1..(m*n)]]);
// creates a matrix a of dimensions [m x n] with elements in F_2 initially set to 0.

MatrixSeq := [a : i in [1..L]];
// creates a matrix sequence of length L with each element of the sequence being a matrix of the same type as a, [m x n].

The algorithm implementations in Magma helped to develop a framework for the implementations in LiDIA. Rigorous testing was not performed in Magma; however, it was useful to compare its results with those of the LiDIA implementation to ensure accurate functionality on both platforms.
... the function v.set_size(). The value of num may not exceed pos.

math_vector operations used in this implementation:

ct math_vector< T > (lidia_size_t c, lidia_size_t s)
constructs a vector with capacity c and size s initialized with values generated by the default constructor for type T.

ct math_vector< T > (const math_vector< T > & w)
constructs a vector with capacity w.size initialized with the elements of w. This operation, math_vector < math_vector < T > >, creates a vector of vectors. This structure is required for implementation of the Block Wiedemann algorithm, where operations are performed on a vector of 32 or 64 vectors (depending on the bit size of the computer architecture).

void multiply (T & r, const math_vector< T > & v, const math_vector< T > & w)
stores the inner product of v and w in r.

math_matrix < T > is a form of base_matrix < T > that allows basic mathematical operations on and within the matrix in addition to the basic access and initialization functions offered by base_matrix < T >. T is allowed to be either a built-in type or a class.

math_matrix operations used in this implementation:

ct math_matrix< T > (lidia_size_t a, lidia_size_t b)
constructs an a × b matrix initialized with values generated by the default constructor for type T.

void multiply (math_vector< T > & v, const math_matrix< T > & A, const math_vector< T > & w)
assigns the result of the multiplication A·w to vector v (matrix vector multiplication).

void multiply (math_vector< T > & v, const math_vector< T > & w, const math_matrix< T > & A)
assigns the result of the multiplication w·A to vector v (vector matrix multiplication).

V. EXPERIMENTS WITH THE WIEDEMANN ALGORITHMS

A. Creation of Test Matrices

In his paper, 'Finding column dependencies in sparse matrices over GF(2) by block Wiedemann', O. Penninga talks about having worked with test matrices that do not closely resemble the factorization matrices. The output of the sieving step produces matrices in which the first few rows are much denser. This means that the first few rows contain many more 1s than the other rows. The first 100 rows of a factorization matrix with 100,000 rows and columns may contain 25% of all the 1s.

In this project, I have attempted to generate test matrices that meet this criterion. The matrices were generated in such a way that the first few rows (about 0.1% of the total number of rows) had a lower sparsity factor than the remaining rows of the matrix.

With this approach, all-zero columns were mostly avoided. If such a column was generated, a 0 in this column index for a randomly chosen row was replaced with a 1. This row is usually chosen from one of the first few rows in the matrix, once again to maintain a lower sparsity factor in the top rows. Another and more realistic approach would be to simply discard that column and create another one.

The generation of these matrices follows a very simple approach. For the first few rows of the matrix, 1s are placed at multiple positions (about 25% of the row) that are randomly selected. For the remaining rows a single 1 is placed at a position that is randomly selected. All remaining columns of the row are filled with 0s.

In this project, I have worked with smaller sizes of the test matrices in order to demonstrate the functionality and efficiency of the software implementations in Magma and LiDIA. Future work would involve testing in LiDIA with standard sizes of the matrices. Besides the random matrices, hard-coded test matrices were also used in order to provide a comparison of the results of the prototype in Magma and the implementation in LiDIA.

The rest of the program is a straightforward implementation of the algorithm as described in the previous sections.

B. Tests on Wiedemann Implementation

Testing is performed to determine the number of iterations required to find the solution. Based on work in the past, the algorithm is known to succeed in at most n + 1 iterations with certain choices of the vectors u and v.

Solutions to $B\omega = 0$ were found for 100 random matrices B of approximately the same average sparsity with 12 rows and 12 columns. The tests were performed for matrices that had no zero columns as well as those that allowed all-zero columns.

The frequency F denotes the number of matrices for which solutions were found in a specified range of the number of iterations.

The cumulative frequency C.F. gives the number of matrices for which solutions were found in at most the number of iterations indicated by the upper bound of that range.
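The test-matrix construction of Section V.A can be sketched as follows. This is an illustration under stated assumptions, not the generator used for the experiments: the function name makeTestMatrix and its parameters are hypothetical, the "about 25% of the row" rule is modelled by setting each entry of a dense row to 1 with probability density, and an all-zero column is repaired by placing a 1 in a randomly chosen dense row.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

using Mat = std::vector<std::vector<int>>;  // dense 0/1 matrix, stored by rows

// Builds an n x n test matrix over F_2 whose first denseRows rows are dense
// and whose remaining rows each contain a single 1.
Mat makeTestMatrix(std::size_t n, std::size_t denseRows, double density,
                   unsigned seed) {
    assert(n >= 1 && denseRows >= 1 && denseRows <= n);
    std::mt19937 rng(seed);
    std::bernoulli_distribution one(density);
    std::uniform_int_distribution<std::size_t> col(0, n - 1);
    std::uniform_int_distribution<std::size_t> top(0, denseRows - 1);

    Mat B(n, std::vector<int>(n, 0));
    for (std::size_t i = 0; i < n; ++i) {
        if (i < denseRows) {
            // Dense row: each position is 1 with probability `density`.
            for (std::size_t j = 0; j < n; ++j) B[i][j] = one(rng) ? 1 : 0;
        } else {
            // Sparse row: a single 1 at a random position.
            B[i][col(rng)] = 1;
        }
    }
    // Repair all-zero columns by setting a 1 in one of the dense rows,
    // which keeps the extra weight in the top of the matrix.
    for (std::size_t j = 0; j < n; ++j) {
        bool empty = true;
        for (std::size_t i = 0; i < n && empty; ++i)
            if (B[i][j]) empty = false;
        if (empty) B[top(rng)][j] = 1;
    }
    return B;
}
```

A call such as makeTestMatrix(12, 1, 0.25, seed) would correspond to the 12 x 12 case with one dense row; the seed parameter is only there to make runs reproducible.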
The following tables summarize the results for matrices with average column weights 1, 2 and 3.

... such as math_matrix and math_vector multiplications, the time taken for the computations was greatly reduced.
... linear generators for matrix sequences. However, there are several other algorithms to implement the sequential stage of Coppersmith's Block Wiedemann algorithm. Kaltofen provides an analysis of these algorithms in his paper. There is the Block Toeplitz algorithm, Beckermann and Labahn's algorithm that uses fast Fourier transforms, and many others that provide an increased speed-up. It would be interesting to develop these algorithms in software and make a comparative analysis.

After the successful completion of the implementation and analysis of the Wiedemann and Block Wiedemann algorithms and understanding the difficulty of factoring large numbers in software, an obvious next step would be the implementation and analysis of other algorithms that can be used to solve the Matrix step. One such example is the Lanczos algorithm along with its block version.

VII. CONCLUSION

The approach in this paper has been significantly influenced by Coppersmith's generalized version of the Berlekamp-Massey algorithm. By using a block of vectors in place of a single vector it is possible to parallelize the outer loop of the iterative methods for solving sparse linear systems. This can significantly speed up the matrix step.

I would like to thank Dr. Patrick Baier, who helped me understand the algorithms, and Dr. Kris Gaj for his support and inspiration. I am also grateful to Dr. Emmanuel Thome for his valuable suggestions.

REFERENCES

[1] Magma Manual, enigma.gmu.edu
[2] LiDIA Manual, http://www.cdc.informatik.tu-darmstadt.de/TI/LiDIA/
[3] Emmanuel Thome, "Fast Computation of Linear Generators for Matrix Sequences and Application to the Block Wiedemann Algorithm".
[4] O. Penninga, "Finding column dependencies in sparse matrices over $\mathbb{F}_2$ by block Wiedemann", http://ftp.cwi.nl/CWIreports/MAS/MAS-R9819.pdf
[5] WLSS Manual
[6] G. Villard, "Further analysis of Coppersmith's Block Wiedemann algorithm for the solution of sparse linear systems".
[7] Erich Kaltofen, "Analysis of Coppersmith's Block Wiedemann algorithm for the parallel solution of sparse linear systems".
[8] Chandana Anand, Arman Gungor, Kimberly A. Thomas, "Factoring of Large Numbers using Number Field Sieve – The Matrix Step".
[9] Matthew E. Briggs, "An introduction to General Number Field Sieve", http://scholar.lib.vt.edu/theses/available/etd-32298-93111/unrestricted/etd.pdf
[10] Number Field Sieve – Mathworld, http://mathworld.wolfram.com/NumberFieldSieve.html
[11] General Number Field Sieve – Wikipedia, http://en.wikipedia.org/wiki/General_number_field_sieve
[12] Brillhart J., Lehmer D. H., Selfridge J., Wagstaff S. S. Jr., and Tuckerman, "Factorizations of b^n +/- 1, b = 2, 3, 5, 6, 7, 10, 11, 12 Up to High Powers", rev. ed., American Math Society, Providence, RI, 1988, http://www.ams.org/online_bks/conm22
[13] Lenstra, Arjen K.; Lenstra, H.W., Jr. (Eds.), "The development of the number field sieve (Lecture Notes in Math)", Springer-Verlag, 1993.
[14] A.K. Lenstra, H.W. Lenstra, M.S. Manasse, and J.M. Pollard, "The Number Field Sieve", http://www.std.org/~msm/common/nfspaper.ps
[15] A.K. Lenstra, H.W. Lenstra, M.S. Manasse, and J.M. Pollard, "The Factorization of the Ninth Fermat Number", http://www.std.org/~msm/common/f9paper.ps
[16] B. Murphy and R. P. Brent, "On quadratic polynomials for the number field sieve", Australian Computer Science Communications 20, pp. 199-213, 1998.