COLUMBIA version 1.2 (c) 1995 CRTC of Systeme D
======== for Robot PD
QUESTIONS ABOUT COLUMBIA
1. "So what does Columbia do?"
Columbia is a file compressor. Feed in a file - program, word-processor
document, whatever - and it will produce a new, smaller file which still
manages to contain all the information of the original.
2. "How do I use the smaller files?"
You can't just load a "compressed" file into your word-processor: you need
to restore the original file first. To do this, you load Columbia, and tell
it to "decompress" the compressed file. It will then restore the file just
as it was before you first compressed it.
3. "Why should I want to compress files?"
Say, for example, that you store all your correspondence on disc. It's
unlikely that you'll need to read it frequently, but it's useful to have it
there to refer to. Compress all the files with Columbia, and you could save
around 50% of your disc space: the files can be restored any time you like.
If you regularly swap programs with CPC-owning friends, you can fit more
onto a disc by compressing them with Columbia: when they receive your disc,
they just load up their copy of Columbia and decompress the files.
4. "Can the files decompress automatically when I load them?"
Not with word-processor documents or BASIC programs, unfortunately.
However, if you have some technical knowledge, you can compress machine
code programs in such a way that they restore themselves to their original
state when loaded.
5. "How does it do it?"
Columbia works on the principle that in almost any file, there is a certain
amount of repetition. For example, in a file containing pure text, only a
certain number of characters are used: letters, punctuation, numbers. Yet
the CPC stores each of these in a byte, which can hold any of 256 different
values. So, for example, by replacing each occurrence of the word "the"
with the otherwise unused value 128, you will save 2 characters (bytes) per
occurrence of the word.
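The "the" example above can be tried out in a few lines of Python. This is
only a sketch of the substitution idea, not Columbia's actual method; the
names TOKEN, WORD, compress and decompress are invented for illustration.

```python
# Illustrative sketch only: replaces the word "the" with the single byte 128.
# This is NOT Columbia's real algorithm, just the principle described above.

TOKEN = 128      # a byte value that plain ASCII text never uses
WORD = b"the"    # the repeated sequence we want to shorten


def compress(data: bytes) -> bytes:
    """Replace every occurrence of WORD with the one-byte TOKEN."""
    if bytes([TOKEN]) in data:
        # The trick only works if the token is otherwise unused.
        raise ValueError("input already uses byte value 128")
    return data.replace(WORD, bytes([TOKEN]))


def decompress(data: bytes) -> bytes:
    """Put WORD back wherever TOKEN appears."""
    return data.replace(bytes([TOKEN]), WORD)


text = b"the cat sat on the mat by the door"
packed = compress(text)

assert decompress(packed) == text    # nothing is lost
print(len(text), "->", len(packed))  # 34 -> 28: 2 bytes saved per "the"
```

Three occurrences of "the" save six bytes here. A real compressor finds its
own repeated sequences rather than relying on one fixed word.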
DIFFERENT FILES
From your original file, Columbia can produce three types of compressed file.
These are:
- ARCHIVE: the standard type of compressed file.
- EXECUTABLE: a machine code program that can be loaded as usual.
If the original file is a machine code program, you can tell Columbia to
produce an executable file. This means that you can run the compressed
program like the original, and it will automatically decompress itself on
loading. Some technical knowledge may be required to create executable
files.
- MULTI-RECORD: a set of original files, all compressed into one.
You can create a single compressed, "multi-record" file which contains
compressed versions of more than one original file. Say, for example, that
you had written a game which was made up of four files: the BASIC program,
some machine code, a hi-score table file, and a screen. To save space when
sending this program to a friend, you could combine these four into one
"multi-record" file: when you tell Columbia to decompress this file, it
will automatically recreate all the original files.
HOW TO CREATE ARCHIVE FILES
Once Columbia has loaded, you will see five icons at the foot of the screen.
You can select one of these with cursor keys and SPACE, COPY or ENTER. In
order, they are:
- Compress files
- Decompress files
- Set options
- Display file information
- Columbia copyright notice
To compress files, select the first icon. An up/down scrolling list of all the
files on the disc will be shown. Press COPY or SPACE to highlight the ones you
want to compress, and then press ENTER. You will be asked for a new filename
for each archive file which is created. We recommend that you use a file
extension such as .CMP, which will remind you that the file is compressed.
Make sure that you have enough space on the disc to store the compressed
files.
Decompression, using the second icon, works in the same way.
The options icon lets you select source and destination drives if you have two
disc drives: use cursor keys (up, down, left, right) to change the options,
and COPY, SPACE or ENTER to finish.
- Note for 64k users: it is strongly recommended that you change the
"compression type" option here to "archive". The standard setting is
"automatic": this will create executable files from any binary data you
feed into Columbia, including screens and other graphics. Without 128k,
this is a slow process that requires reading the file in twice.
HOW TO CREATE EXECUTABLE FILES
If a machine code file is compressed when Columbia is set to "executable" or
"automatic" compression, Columbia will create a file which can be RUN directly
and decompresses into memory. The result of this on most machine code programs
will be a smaller file which works exactly the same way as before, except that
immediately after loading, a few seconds elapse while the program
decompresses. If you plan to use the executable file option, the following
notes may help.
- First of all, just because a file has been compressed by Columbia does not
necessarily mean that it will RUN correctly! As standard, an executable
file is set to load as high in memory as possible, taking up memory to
&A200 (near HIMEM). However, if your original program loaded near this
address, the file may not run correctly (as the decompression code will be
overwritten by the program code). To solve this, you can change the
"memory limit" before compressing a file, so that it loads lower or higher
in memory accordingly.
- Decompression requires just over 12k of workspace in memory. When loading
an executable file, you will usually see junk on the screen for a few
seconds: this is where the workspace is located by default. However, if
this is not suitable (for example, if you have compressed a screen!), you
can move the workspace somewhere else in memory using the "buffer
location" option. If the "buffer location" is set to "screen", the buffer
will be located in screen memory, and the screen will be cleared after
decompression.
- A file with an execution address of 0 will return after the decompression
routine is called, rather than jumping to the execution address.
- You may also find the "file information" icon useful when creating
executable files.
HOW TO CREATE MULTI-RECORD FILES
These files contain compressed versions of two or more original files. To
create one, set "multi-record" as the compression type (using the options
icon). You can then select the "compress files" icon, and select every file
which you want included in the multi-record file. You only have to enter one
filename, for the single file that all the originals are compressed into.
Decompression is easy: all the original filenames are restored. To list these
names without decompressing the file, use the "file information" option.
It is recommended that you name your multi-record files with an .MRC (Multi-
Record Columbia) extension. This way, they will be instantly recognisable as
such.
COLUMBIA'S STRENGTHS AND WEAKNESSES
Columbia uses an improved version of the advanced compression algorithm known
as LZW (Lempel-Ziv-Welch), which is similar to that used on PCs and
Macintoshes for hard disc compression. Columbia's advantages are:
- Good savings should be achieved on almost all files: a 20k BASIC program
was reduced to 12k with Columbia. Graphics should also show a significant
improvement. Savings on machine code programs are less spectacular,
although still respectable.
- Text files compress particularly well - an average of 50% is saved.
These are the disadvantages:
- Compression is not particularly fast: it is on a par with programs such as
Crown's Cruncher (although this version is around 25% quicker than before).
- You can only create executable files from machine code programs - not from
BASIC ones.
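For readers curious about the LZW principle mentioned above, here is a short
Python sketch of the textbook algorithm. It is an assumption-laden
illustration: Columbia's own improvements, its code widths and its file
format are not described in this document, and the function names
lzw_compress and lzw_decompress are invented for the example.

```python
# Textbook LZW, for illustration only. Columbia uses an improved variant
# whose details are not documented here.
from typing import List


def lzw_compress(data: bytes) -> List[int]:
    """Turn bytes into a list of codes, learning repeated sequences as it goes."""
    table = {bytes([i]): i for i in range(256)}  # start with every single byte
    next_code = 256
    current = b""
    codes = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate                  # keep extending the match
        else:
            codes.append(table[current])         # emit the longest known match
            table[candidate] = next_code         # remember the new sequence
            next_code += 1
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: List[int]) -> bytes:
    """Rebuild the original bytes, reconstructing the same table on the fly."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    out = bytearray(previous)
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # The code refers to the sequence that is being defined right now.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out += entry
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return bytes(out)


sample = b"TOBEORNOTTOBEORTOBEORNOT"
codes = lzw_compress(sample)
assert lzw_decompress(codes) == sample
print(len(sample), "bytes ->", len(codes), "codes")
```

Note that each code needs more than 8 bits to store, so the real saving is
smaller than the raw count of codes suggests. The more repetition a file
contains, the longer the sequences in the table become, which fits the
observation above that text compresses better than machine code.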
INCORPORATING DECOMPRESSION ROUTINES INTO YOUR OWN PROGRAMS
Machine code programmers may want to use the decompression routine in their
own programs, so that they can access archive files without having to use
Columbia. Fully documented source code (Maxam format) is provided for this
purpose.
COLUMBIA REVISION HISTORY
- v1.0 (February 1995): original release.
- v1.1 (March 1995): multi-record support added, and some bugs removed.
- v1.2 (July 1995): speed improvements.
Richard Fairhurst (CRTC)
Robot PD Library
July 1995