I've been on holidays for a couple of days, hence the break in transmission.

(For academics: the definition of a holiday is a day where you actually

stop working and do nothing work related for that whole day. I realise the

definition is not widely known, nor is it well-defined.)

Note, due to accidentally including some files in today's release that I

intended to hold over till next time, you have to do ./configure before

make check. This detects your CPU and links in any assembly code

that is relevant for it. More on this when I do the actual blog post about it.

It's time to start implementing quadratic algorithms (i.e. those that

take time O(n^2) to run, such as classical multiplication and division).

Before we do this, we are going to reorganise slightly. The file which

is currently called nn.c we will rename nn_linear.c to indicate that it

contains our linear functions. We'll also rename t-nn.c to t-nn_linear.c.

The macros in that file will be moved into test.h.

We also modify the makefile to build all .c files in the current directory

and to run all tests when we do make check. A new directory "test" will

hold our test files.

Finally, we add an nn_quadratic.c and test/t-nn_quadratic.c to hold the

new quadratic routines and test code that we are about to write.

The first routine we want is nn_mul_classical. This is simply a mul1

followed by addmul1's. Of course this will be horrendously slow, but once

again we defer speeding it up until we start adding assembly routines,

which will start happening shortly.

In a departure from our linear functions, we do not allow zero lengths.

This dramatically simplifies the code and means we do not have to check

for special cases.

There does not appear to be any reason to allow a carry-in to our

multiplication routine. An argument can be made for it on the basis of

consistency with mul1. However, the main use for carry-in's and carry-out's

thus far has been for chaining.

For example, we chained mul1's s*c and t*c for s and t of length m and

c a single word in order to compute (s*B^m + t)*c.

But the analogue in the case of full multiplication would seem to be

chaining to compute (s*B^m + t)*c where s, t and c all have length m. But

it probably doesn't make sense to chain full multiplications in this way

as it would involve splitting the full product t*c say, into two separate

parts of length m, which amongst other things, would be inefficient.

It might actually make more sense to pass a whole array of carries to our

multiplication routine, one for every word of the multiplier. However it is

not clear what use this would be. So, for now at least, we pass no carry-ins

to our multiplication routine.

We settle on allowing a single word of carry out. This may be zero if the

leading words of the multiplicands are small enough. Rather than write this

extra word out, we simply return it as a carry so that the caller can decide

whether to write it.

The classical multiplication routine is quite straightforward, but that plus the

rearrangement we've done is more than enough for one day.

The github branch for this release is here: v0.12

Previous article: v0.11 - generic test code

Next article: configure and assembly

## No comments:

## Post a Comment