Now that we have nn_mullow_classical, we can add nn_div_hensel. As explained,

this will take an integer a of length n and divide by an integer d of length

m modulo B^n, returning a quotient q and an "overflow".

The overflow will be two words which agree with the overflow from mullow(q*d).

As per the euclidean division, the dividend a will be destroyed.

The algorithm is somewhat simpler than the euclidean algorithm. If d1 is the

least significant word of d then we use an inverse mod B of d1 (dinv say) and

multiply it by the current word of the dividend being considered (working from

right to left) to get a quotient word q1.

We then subtract q1*d (appropriately shifted) from the dividend. There is no

adjustment to do as the inverse mod B is unique (so long as d is odd).

Any borrows and overflows from the subtractions are accumulated in the two

overflow and returned.

In our test code, we check a few things. Firstly, for an exact division, we

want that the quotient is really the exact quotient of a by d. As the

quotient returned is m words, which may be larger than the actual quotient,

we check that any additional words of q are zero. We do this by normalising q.

The second test we do is for an inexact division. We check that the the

overflow words turn out to be the same as the overflow from mullow(q*d).

Note that if we wish to make Hensel division efficient for doing an exact

division, say of a 2m - 1 by m division, we merely pass m words of the

dividend in instead of all 2m - 1 words, so that the quotient is also m words.

Once again we don't allow an overflow-in to Hensel division. This wouldn't

give us any kind of chaining property anyway.

Instead, we'd have to do a mulhigh(q, d) and subtract that from the high

part of the chain before continuing, and the mulhigh will accept our overflow

words from the low Hensel division.

In fact, we add a chaining test that does precisely this. We do a Hensel

division on the low n words of a chain, subtract a mulhigh from the high

m words of the chain, then compute the high m words of the quotient using

a second Hensel division.

To make our test code work, we add an ODD flag to randoms_of_len so that only

odd divisors are used with Hensel division.

It isn't clear if it makes sense to allow a carry-in at the top of div_hensel

or not. On the one hand, it might seem logical to allow a carry-in on account

of the way mul_classical works. On the other hand, divrem_hensel1 took a

carry-in, but at the bottom. This was for chaining rather than a read-in of

an extra word. We choose the latter convention, as it seems to make more

sense here and stops the code from becoming horribly complex.

The code for today's post is here: v0.17

Previous article: v0.16 - mullow_classical

Next article: v0.18 - printx_word, nn_printx

## No comments:

## Post a Comment