Thursday 11 November 2010

BSDNT - v0.22 Windows support

Today's update is a massive one, and comes courtesy of Brian Gladman. At last we add
support for MSVC 2010 on Windows.

In order to support different architectures we add architecture specific files in the arch
directory. There are three different ways that architectures might be supported:

* Inline assembly code

* Standalone assembly code (using an external assembler, e.g. nasm)

* Architecture specific C code

Windows requires both of the last two of these. On Windows 64 bit, MSVC does not support
inline assembly code, and thus it is necessary to supply standalone assembly code for this
architecture. This new assembly code now lives in the arch/asm directory.

On both Windows 32 and 64 bit there is also a need to override some of the C code from the base
bsdnt library with Windows specific code. This lives in the arch directory.

Finally, the inline assembly used by the base bsdnt library on most *nix platforms is now in the
arch/inline directory.

In each case, the os/abi combination is specified in the filenames of the relevant files. For
example on Windows 32, the files overriding code in nn_linear.c/h are in arch/nn_linear_win32.c/h.
(Note win32 and x64 are standard Windows names for 32 and 64 bit x86 architectures, respectively.)

If the code contains architecture specific code (e.g. assembly code) then the name of the file
contains an architecture specifier too, e.g. arch/inline/nn_linear_x86_64_k8.h for code specific
to the AMD k8 and above.

It's incomprehensible that Microsoft doesn't support inline assembly in their 64 bit compiler
making standalone assembly code necessary. It would be possible to use the Intel C compiler on
Windows 64, as this does support inline assembly. But this is very expensive for our volunteer
developers, so we are not currently supporting this. Thus, on Windows 64, the standalone
assembly is provided in the arch/asm directory as just mentioned.

Brian has also provided MSVC build solution files for Windows. These are in the top level source
directory as one might expect.

There are lots of differences on Windows that requires functions in our standard nn_linear.c,
nn_quadratic.c and helper.c files to be overridden on Windows.

The first difference is that on 64 bit Windows, the 64 bit type is a long long, not a long. This
is handled by #including a types_arch.h file in helper.h. On most platforms this file is empty.
However, on Windows it links to an architecture specific types.h file which contains the
requisite type definitions. So a word_t is a long long on Windows.

Also, when dealing with integer constants, we'd use constants like 123456789L when the word type
is a long, but it has to become 123456789LL when it is a long long, as on Windows 64. To get
around this, an architecture specific version of the macro WORD(x) can be defined. Thus, when
using a constant in the code, one merely writes WORD(123456789) and the macro adds the correct
ending to the number depending on what a word_t actually is.

Some other things are different on Windows too. The intrinsic for counting leading zeroes is
different to that used by gcc on other platforms. The same goes for the function for counting
trailing zeroes. We've made these into macros and given them the names high_zero_bits and
low_zero_bits respectively. The default definitions are overridden on Windows in the architecture
specific versions of helper.h in the arch directory.

Finally, on Windows 64, there is no suitable native type for a dword_t. The maximum sized
native type is 64 bits. Much of the nn_linear, and some of the nn_quadratic C code needs to
be overridden to get around this on Windows. We'll only be using dword_t in basecase algorithms
in bsdnt, so this won't propagate throughout the entire library. But it is necessary to
override functions which use dword_t on Windows.

Actually, if C++ is used, one can define a class called dword_t and much of the code can
remain unchanged. Brian has a C++ branch of bsdnt which does this. But for now we have C code
only on Windows (otherwise handling of name mangling in interfacing C++ and assembly code
becomes complex to deal with).

Brian has worked around this problem by defining special mul_64_by_64 and div_128_by_64
functions on 64 bit Windows. These are again defined in the architecture specific version of
helper.h for Windows 64.

Obviously some of the precomputed inverse macros need to be overridden to accomodate these
changes, and so these too have architecture specific versions in the Windows 64 specific version
of the helper.h file.

In today's release we also have a brand new configure file for *nix. This is modified to handle
all the changes we've made to make Windows support easy. But Antony Vennard has also done
some really extensive work on this in preparation for handling standalone assembly on arches
that won't handle our inline assembly (and for people who prefer to write standalone assembly
instead of inline assembly).

The new configure file has an interactive mode which searches for available C compilers (e.g.
gcc, clang, icc, nvcc) and assemblers (nasm, yasm) and allows the user to specify which to use.
This interactive feature is off by default and is only a skeleton at present (it doesn't actually
do anything). It will be the subject of a blog post later on when the configure file is finished.

The code for today is found at: v0.22

Previous article: v0.21 - PRNGs

No comments:

Post a Comment