Today's post is a discussion only, without accompanying code. This is because

the topic is divrem using the "school boy" algorithm, and this is not as

straightforward as one might imagine. The discussion below is informal, and

may contain errors. Please let me know if so and I can make some adjustments

before releasing the code for the next article, where we implement this.

As a consolation, I have released a failed attempt at using C89 macros to make

even more simplifications to our test code. In particular, I tried to make the

chain tests more "automatic". Unfortunately, this didn't work out. It ended up

making the test code too complicated to use and it was going to be way too much

work to convert all of it over to the new test system.

This test code "simplification" was completely abandoned and will not be merged.

But if you are curious you can see it here: gentest

In particular, take a look at generic.c/h and the test code in t-nn.c.

Anyhow, back to today's discussion:

Firstly, what exactly do we mean by the school boy division algorithm?

Certainly I leaned how to do 39 divide 7 at school, and even 339 divide 7.

But how about 10615 divide 1769?

The first step is to divide 1769 into 1061. It doesn't go, so we grab another

digit. So it becomes 1769 into 10615. I have no idea how many times it goes!

Either I missed something at school when I studied "long division", or I only

think I learned an algorithm suitable for problems like this.

In fact, the only way I know to do this division is trial and error, i.e.

to go through all the likely quotient candidates from one to nine.

Using a calculator I find it goes 6 times.

I think I intuitively know how to proceed from here. We place the digit 6

into our quotient, multiply 1769 by 6 and subtract to form a remainder.

But now we have a problem. What if we are dividing 17699949 into 106150000.

Does it still go 6 times? In fact it does not. But how would I know this

without being able to divide 17699949 into 106150000 in the first place!

So I have two problems. Firstly, if I simply truncate the numerator and

denominator to a certain number of digits, even the first digit of my

quotient may be wrong, and secondly, if I use more digits in the first place

my intiuitive "algorithm" doesn't help. It basically says, guess and refine.

That's all very well, but when my digits are 64 bit words, I may not be so

lucky with the guessing thing.

Now an interesting thing to note in the example above is that no matter how

many digits we add to the dividend, 1769 will always go into the first 5

digits 10615 of the dividend, 6 times (with some remainder), no more, no

less.

To see this, we note that 6*1769 is less than 10615 and 7*1765 is greater.

So the only way we could get 1765 to go in 7 times would be to increase

those first five digits of the dividend. Any following digits are irrelevant

to that first digit, 6, of the quotient.

This is a general feature of integer division, and also applies for the

word based (base B) integer division problem.

However, adding digits to the divisor is not the same, as the example above

shows. In fact 17699949 only goes 5 times into 106150000.

Now the question is, how far off can it be if I truncate the dividend and

divisor? How bad can it get?

We can answer this as follows. Note that 17699949 < 17700000. So 10615xxxx

divided by 1769yyyy is greater than or equal to 10615 divided by 1770, which

happens to be 5. So the answer has to be either 5 or 6 times.

What I have done here is simply add 1 to the first four digits of the

divisor, to get a lower bound on the first digit of the quotient. In this

way, I can get a small range of possible values for the first digit of the

quotient. Then I can simply search through the possibilities to find the

correct quotient digit. This matches my intuition of the long division

algorithm at least.

It seems that *perhaps* a quotient digit obtained in this way will be at

most out by 1. In other words, roughly speaking, if X is the first few digits

of the dividend and Y the appropriate number of digits of the divisor (in the

example above, X = 10615 and Y = 1769), then perhaps X / Y >= X / (Y + 1)

>= (X / Y) - 1, where by division here, I mean integer division. Let's call

this condition R.

Let's suppose now that we are working on integers written as binary words

instead of decimal digits. Let's also suppose that X / Y < B. Let's call

this condition S.

Well, it is easy to generate a counterexample to condition R. Let Y = 2,

X = B. Clearly R is not satisfied, even though S is.

But this seems to only be a problem because Y is so small. So, what if we

constrain Y to be at least B/2?

It turns out that condition R still doesn't hold. A counterexample is

Y=B/2, X = B*Y - 2.

However, I claim that X / (Y + 1) >= (X / Y) - 2 does hold, so long as

condition S holds. Let us call this inequality T.

Clearly there does not exist a counterexample to T with X <= Y.

Let us suppose that for a certain Y >= B/2, there is a counterexample to T.

Clearly if X0 is the first such counterexample, then X0 is a multiple of Y,

but not of Y + 1. Nor is X0 - 1 a multiple of Y + 1 (else X0 - 2 would have

been a counterexample).

It is also clear that every multiple of Y from X0 on is a counterexample, as

the left hand side can never "catch up" again.

Thus we have the following result: if there exists a counterexample to T for

some Y >= B/2 and X < BY, then X = (B - 1)*Y is a counterexample.

Substituting this value of X into T yields that T holds if and only if

(B - 1)*Y / (Y + 1) >= B - 3 for all Y >= B/2. Thus, if we can show that this

last inequality holds for all Y >= B/2 then we have proven T.

Note that this inequality is equivalent to (B - 1)*Y >= (Y + 1)*(B - 3), i.e.

2*Y >= B - 3. However, as Y >= B/2, this holds under our hypotheses.

So putting all of the above together, suppose we are dividing A by D and that

Y >= B/2 is the leading word of the divisor D and B*Y > X >= Y is the first

word or two words of the numerator A (whichever is appropriate). Then if Q0

is the leading word of the quotient Q = A / D, then we have shown that

X / Y >= Q0 >= (X / Y) - 2.

In other words, X / Y can be no more than 2 away from the first word of the

quotient Q = A / D that we are after.

This leads us immediately to an algorithm for computing a quotient of two

multiprecision integers.

(i) Let Y be the first WORD_BITS bits of the divisor D, so that

B > Y >= B/2.

(ii) Let X be the first word or two words (appropriately shifted as per Y) of

A such that B*Y > X >= Y.

(iii) Let Q0 = X/Y.

(iv) Set R = A - D*Q0 * B^m (for the appropriate m), to get a "remainder".

While R < 0, subtract 1 from Q0 and add D * B^m to R.

(v) Write down Q0 as the first word of the quotient A / D and continue on by

replacing A by R and returning to step (i) to compute the next word of the

quotient, etc.

Another algorithm can be derived as follows. Instead of using Y >= B/2 in the

above algorithm, let's choose Y >= B^2/2. A similar argument to the above

shows that X / (Y + 1) >= (X / Y) - 1 for Y >= B^2/2 and B*Y > X >= Y. It boils

down to showing that (B - 1)*Y / (Y + 1) >= B - 2 for Y >= B^2/2, i.e. that

Y >= B - 2, which is clearly satisfied under our hypotheses.

The algorithm is precisely the same, but at step (iv) we can replace the while

statement with an if statement and perform at most one adjustment to our

quotient Q0 and to R.

Now we return to step (v), in which we said that we could just continue

on from step (i) to compute the next word of the quotient Q. If we do this

and set A = R then compute the new X, what we would like is something like

the divrem1 algorithm we implemented, where, (after possibly some kind of

initial iteration that we handled specially), it is always true that the new

X is two words and has its first word reduced mod Y.

However, this does not follow from 0 <= R < D*B^m, and it may be that the

first word of the remainder is equal to Y! This is due to the truncation of

R to get Y.

It is clear from 0 <= R < D*B^m that if X is the first two words of the

remainder that X <= B*Y. So to make the algorithm continue in the way we'd

like, we only have to deal with the special case where X = B*Y.

We know that in the first algorithm above, the next word of the quotient may

be B - 1 or B - 2, since we know already that it is not B. We must multiply

out and check which it is.

In the second algorithm above, where Y >= B^2/2, the only possibility for

the next quotient word is B - 1, as we know it is not B.

In the next article we will look at implementing the first of the two

algorithms above, leveraging the code we already have for computing and using

a precomputed inverse.

Previous article: v0.13 - muladd1, muladd

Next article: v0.14 - divrem_classical

## No comments:

## Post a Comment