The unofficial MO3 format description

open source MO3 decompression and decoding support (but not playback)




Version 0.92
May 18th, 2014

0. Credits

1. Introduction

This description is applicable to mo3 encoder v1.8, v2.1, v2.2 and v2.4.

1.1 Overview

The name MO3 stands for "MOdule with MP3": the original idea was to reduce the size of a module (MOD, IT, XM) by compressing its samples with MPEG audio layer 3.

This format was created by Ian Luck (http://www.un4seen.com/).

Not only the samples are compressed: the music data (mainly notes, instrument numbers and effects) and the instrument information are compressed too.
A lot of effort has gone into reducing the size of this part. A dedicated lossless compression is then applied to the music data.

The samples can be compressed using OGG, MP3, or one of two dedicated lossless algorithms.

We can define "compression" as a scheme that detects all kinds of repetition in some data, then encodes these repetitions in a more compact way. MO3 applies this principle very effectively to the music data encoding.

1.2 The music data size reduction

The first step of the MO3 encoder is to parse the music data, detect unused samples and remove them from the module.

When composing a module, musicians often copy and paste the content of a channel (called a "voice" from here on) from one pattern to another, for example drums and bass. The encoder detects this kind of repetition: it builds a list of unique voices, then encodes the patterns as pointers to these unique voice data. The same idea is used in the MTM format.
This is especially efficient for the "empty" voice: in a module with 4 channels, for example, often only 3 voices are used per pattern, so the "empty" voice is repeated many times.

Within a voice, one row can be repeated several times in succession. This is particularly true of the empty row. This kind of repetition is detected and stored in a compact way.

The last size optimisation is the way each row of a voice is encoded, as a list of type/value items.

1.3 The sample delta encodings

Digitized audio data are generally signed values stored in 8 or 16 bits. When compressed directly with a general-purpose lossless algorithm, the best results are about a 10% reduction, which is poor.
Audio data are roughly sine waves (or sums of them), so they contain few exact repetitions.
But successive values are close to each other, so a good idea is to first encode them as delta values (the first being 0 for example) before compression. This is done in the 'delta' version of the MO3 lossless packers, and in XM modules.
A smarter scheme exists, because the slope of audio data is often nearly constant: encode the error between the predicted delta and the actual one, instead of the delta itself. The prediction is then adjusted with the error, so that it converges towards the correct next delta value.
This method is more efficient than simple delta encoding, especially on 16-bit data.
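
As a rough illustration of the two ideas above, the following C sketch implements a plain delta coder and the simplest possible predictor, which assumes that the next delta equals the previous one and stores only the error. It works on 8-bit samples treated as raw bytes with wrap-around arithmetic. The function names are hypothetical, and this is neither the exact predictor nor the compact bit packing used by the MO3 lossless packers.

```c
#include <stddef.h>
#include <stdint.h>

/* Plain delta: d[i] = x[i] - x[i-1], taking the sample before the first as 0. */
void delta_encode(const uint8_t *x, uint8_t *d, size_t n) {
    uint8_t prev = 0;
    for (size_t i = 0; i < n; i++) {
        d[i] = (uint8_t)(x[i] - prev);
        prev = x[i];
    }
}

void delta_decode(const uint8_t *d, uint8_t *x, size_t n) {
    uint8_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc = (uint8_t)(acc + d[i]);
        x[i] = acc;
    }
}

/* Prediction (assumed predictor: "next delta == previous delta").
   Only the prediction error e[i] is stored. */
void predict_encode(const uint8_t *x, uint8_t *e, size_t n) {
    uint8_t prev = 0, prev_delta = 0;
    for (size_t i = 0; i < n; i++) {
        uint8_t delta = (uint8_t)(x[i] - prev);
        e[i] = (uint8_t)(delta - prev_delta);
        prev_delta = delta;
        prev = x[i];
    }
}

void predict_decode(const uint8_t *e, uint8_t *x, size_t n) {
    uint8_t acc = 0, delta = 0;
    for (size_t i = 0; i < n; i++) {
        delta = (uint8_t)(delta + e[i]);  /* correct the prediction */
        acc = (uint8_t)(acc + delta);     /* rebuild the sample */
        x[i] = acc;
    }
}
```

On a constant slope such as 10, 12, 14, 16, 18 the plain deltas are 10, 2, 2, 2, 2, while the prediction errors settle to 0 after the second value, which is what makes the second scheme compress better.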

2. The file format

2.1 Compressed form

0x10 is the notation for a hexadecimal value (16 in decimal).
short (2 bytes) and long (4 bytes) values are stored in the file in little-endian order (Intel x86).

Address    Length     Type     Description
0x0000     3          char     "MO3"
0x0003     1          byte     version, related to sample compression: 0 = with mp3 and lossless, 1 = with ogg, 3 = v2.2, 4 = v2.1 (4 should mean "with no LAME header"), 5 = v2.4
0x0004     1          long     uncompressed length of header (music data)

Encoder version 2.2 and earlier (version == 0, 1, 3 or 4):
0x0008     computed   byte[]   compressed header (see sections 2.2 and 2.3)
computed   computed   byte[]   samples, compressed or not, using lossless, mp3 or OGG

Encoder version 2.4 (version == 5):
0x0008     1          long     data offset in compressed data after decompression
0x000c     computed   byte[]   compressed header (see sections 2.2 and 2.3)
computed   computed   byte[]   samples, compressed or not, using lossless, mp3 or OGG
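
The layout above could be read with a few lines of C. This is only a sketch: the struct and function names are hypothetical, and the version 5 field is simply stored without being interpreted.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint8_t  version;       /* byte at 0x0003 */
    uint32_t header_len;    /* uncompressed length of the music data */
    uint32_t data_offset;   /* extra long at 0x0008, version 5 only (else 0) */
    size_t   packed_start;  /* file offset of the compressed header */
} Mo3FileHeader;

/* Little-endian 32-bit read, independent of the host byte order. */
static uint32_t read_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the buffer is too short or not an MO3 file. */
int mo3_parse_file_header(const uint8_t *buf, size_t len, Mo3FileHeader *h) {
    if (len < 8 || memcmp(buf, "MO3", 3) != 0)
        return -1;
    h->version = buf[3];
    h->header_len = read_le32(buf + 4);
    h->data_offset = 0;
    h->packed_start = 8;
    if (h->version == 5) {          /* encoder v2.4 adds one long */
        if (len < 12)
            return -1;
        h->data_offset = read_le32(buf + 8);
        h->packed_start = 12;
    }
    return 0;
}
```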

2.2 The music data decompression algorithm

Here is Matthew's explanation:

"The first byte is always uncompressed. After that, you've got two interleaved streams of control bytes and data bytes. The control bytes are read by the shift_dl routine.
In the unpack routine, the control bits are read most-significant first.
A zero bit indicates "uncompressed byte". A one bit indicates compressed data.
The next two control bits control which kind of compression
-- if they are '00' it's LZ with the same (relative) pointer as a previous LZ.
The next two bits of the control stream are the length, unless they are both zero.

If they are both zero, the true length minus 2 is encoded in the control stream, two bits per bit.
The first bit in each pair is the actual data, the second bit is 0 on the last pair.

If the first control bits are '11', '10' or '01', then the LZ pointer is in the control stream.
The most significant bit of the pointer is a '1', then the next most significant bits of the pointer are read from the control stream two bits at a time as described above (including the initial 11 or 01 or 10). Then 3 is subtracted from that value and it is shifted left by 8 bits, and the 8 least significant bits if the pointer are taken from the data stream. The one's-complement of the result is taken.
The length adjustment for -1280 and -32000 is saved and added back in later (it's always at least one). Then it goes into the same LZ as before, with the next two bits of the control stream being the length unless they are both zero, etc.

Example:

64 6d 08 69 61

64 = 01100100
0 = next byte is literal 0x6d
1 = compressed data
10 = LZ with MSB of pointer zero after subtracting 3

08 -- byte from data stream, pointer to -9 bytes back (points to the 'a' in Danny)

01 -- from control stream, a length of 1, plus the adjustment 1 from earlier = 2.
0 -- indicates a literal 69
0 -- indicates a literal 61. "

For more details, look in the source code here.

This algorithm is copyrighted by Ian Luck, and is also used in PEtite.
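
The explanation in section 2.2 can be turned into the following C sketch. It is written from the description only and is not the unmo3 source. All names are hypothetical, and two points are assumptions that go beyond the text: a new control byte is fetched only when a control bit is needed and none is left, and the length adjustment is 0 when the previous pointer is reused.

```c
#include <stddef.h>
#include <stdint.h>

/* Reader over the interleaved stream of control bytes and data bytes. */
typedef struct {
    const uint8_t *src;
    size_t len, pos;
    unsigned ctrl;    /* current control byte */
    int bits_left;    /* control bits not yet consumed */
    int error;        /* set when the input runs out */
} Mo3Stream;

static unsigned next_byte(Mo3Stream *s) {
    if (s->pos >= s->len) { s->error = 1; return 0; }
    return s->src[s->pos++];
}

/* Control bits are consumed most-significant first. */
static long next_bit(Mo3Stream *s) {
    if (s->bits_left == 0) { s->ctrl = next_byte(s); s->bits_left = 8; }
    s->bits_left--;
    return (long)((s->ctrl >> s->bits_left) & 1u);
}

/* "Two bits per bit": implicit leading 1, then pairs of
   (data bit, continuation bit) until the continuation bit is 0. */
static long read_pairs(Mo3Stream *s) {
    long v = 1, more;
    do {
        v = (v << 1) | next_bit(s);
        more = next_bit(s);
    } while (more && !s->error && v < (1L << 24));
    return v;
}

/* Decompresses exactly dst_len bytes. Returns 0 on success, -1 on error. */
int mo3_unpack(const uint8_t *src, size_t src_len, uint8_t *dst, size_t dst_len) {
    Mo3Stream s = { src, src_len, 0, 0, 0, 0 };
    size_t out = 0;
    long prev_off = 0;                      /* last LZ pointer (negative) */

    if (dst_len == 0) return 0;
    dst[out++] = (uint8_t)next_byte(&s);    /* first byte is never compressed */

    while (out < dst_len && !s.error) {
        long hi, off, n, adjust = 0;
        if (!next_bit(&s)) {                /* 0: literal byte */
            dst[out++] = (uint8_t)next_byte(&s);
            continue;
        }
        hi = read_pairs(&s) - 3;
        if (hi < 0) {                       /* pair '00': reuse previous pointer */
            off = prev_off;
        } else {                            /* pointer given in the stream */
            off = ~((hi << 8) | (long)next_byte(&s));
            adjust = 1 + (off < -1280) + (off < -32000);
            prev_off = off;
        }
        n = next_bit(&s) << 1;              /* two-bit length ... */
        n |= next_bit(&s);
        if (n == 0) n = read_pairs(&s) + 2; /* ... or a longer, pair-coded one */
        n += adjust;
        if (s.error || off >= 0 || (size_t)(-off) > out) return -1;
        for (; n > 0 && out < dst_len; n--, out++)
            dst[out] = dst[out - (size_t)(-off)];
    }
    return s.error ? -1 : 0;
}
```

Traced by hand on the example bytes 64 6d 08 69 61, this sketch follows the same steps as the explanation: a literal 0x6d, then a pointer of -9 with a length of 1 + 1 = 2, then the literals 0x69 and 0x61.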

2.3 The music data, after decompression

2.3.1 General data

Address Length Type Description
0x0000 variable char[] song name (terminated by 00)
computed variable char[] message (in IT, terminated by 00)

then, 0x1a6 bytes :

0x0000 1 byte number of channels (for example 04 for .mod, 0x20 for .xm)
0x0001 1 short song length (at 0x3b6 in .mod, at 0x40 in .xm)
0x0003 1 short restart position
0x0005 1 short number of patterns
0x0007 1 short number of unique voices
0x0009 1 short number of instruments
0x000b 1 short number of samples
0x000d 1 byte ticks/row
0x000e 1 byte initial tempo (default = 125)
0x000f 1 long flags. The original module type can be deduced from them:
    if (mo3Hdr->flags & 0x0100)
        printf("IT");
    else if (mo3Hdr->flags & 0x0002)
        printf("S3M");
    else if (mo3Hdr->flags & 0x0080)
        printf("MOD");
    else if (mo3Hdr->flags & 0x0008)
        printf("MTM");
    else
        printf("XM");

bit#0 : 1 means linear frequency table, 0 means Amiga table (cleared in .mod)
bit#14: (0x00004000) currently used
bit#17: (0x00020000) always set
examples: 0x00024001 for .xm, 0x00020088 for .mod
0x0013 1 byte global volume
0x0014 1 byte pan separation
0x0015 1 byte internal volume (could be ignored)
0x0016 64 byte[] default channel volumes (for 64 channels).
0x0056 64 byte[] default channel panning (for 64 channels).
0x0096 16 byte[] IT MIDI macros : SF0-SFF settings (equate to "F0F0<value-1>z" in IT)
0x00a6 128*2 byte IT MIDI macros : Z80-ZFF settings (2 bytes each, equate to "F0F0<value1><value2>")
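
As an illustration, the leading fields of this 0x1a6-byte block could be read as follows. This is a sketch with hypothetical names, and the channel volume, panning and MIDI macro tables are deliberately left out. The module type test mirrors the flag checks listed in the table.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { MO3_TYPE_IT, MO3_TYPE_S3M, MO3_TYPE_MOD, MO3_TYPE_MTM, MO3_TYPE_XM } Mo3Type;

typedef struct {
    uint8_t  channels;
    uint16_t song_len, restart_pos, patterns, voices, instruments, samples;
    uint8_t  ticks_per_row, tempo;
    uint32_t flags;
} Mo3SongInfo;

static uint16_t le16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* p points just past the song name and message. Returns 0, or -1 if short. */
int mo3_read_song_info(const uint8_t *p, size_t len, Mo3SongInfo *out) {
    if (len < 0x1a6)
        return -1;
    out->channels      = p[0x00];
    out->song_len      = le16(p + 0x01);
    out->restart_pos   = le16(p + 0x03);
    out->patterns      = le16(p + 0x05);
    out->voices        = le16(p + 0x07);
    out->instruments   = le16(p + 0x09);
    out->samples       = le16(p + 0x0b);
    out->ticks_per_row = p[0x0d];
    out->tempo         = p[0x0e];
    out->flags         = le32(p + 0x0f);
    return 0;
}

/* Same order of tests as in the format description. */
Mo3Type mo3_module_type(uint32_t flags) {
    if (flags & 0x0100) return MO3_TYPE_IT;
    if (flags & 0x0002) return MO3_TYPE_S3M;
    if (flags & 0x0080) return MO3_TYPE_MOD;
    if (flags & 0x0008) return MO3_TYPE_MTM;
    return MO3_TYPE_XM;
}
```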

then :

2.3.2 Song and pattern data

Address Length Type Description
0x0000 songlen byte[] song sequence (pattern #)
computed nb pattern * nb channel short[] voice seq (for each pattern and each channel, the number of the voice data to play); identical voices are detected and factorized at compression.
computed nb pattern short[] pattern length table (size = nb_patt*2 bytes)
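
A minimal sketch of how the voice seq table might be indexed, assuming one little-endian short per (pattern, channel) pair, stored pattern by pattern. The function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the unique voice number used by `channel` in `pattern`,
   or -1 if the indices are out of range. voice_seq points to the raw table. */
long mo3_voice_for(const uint8_t *voice_seq, size_t n_patterns, size_t n_channels,
                   size_t pattern, size_t channel) {
    const uint8_t *p;
    if (pattern >= n_patterns || channel >= n_channels)
        return -1;
    p = voice_seq + 2 * (pattern * n_channels + channel);
    return (long)(p[0] | (p[1] << 8));
}
```

A pattern is then rebuilt by decoding, for each channel, the voice data with the returned number and placing its rows in that channel's column.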

In .mod each pattern has 64 rows and 4 channels (for ProTracker). Each row is coded using 4 bytes, so a voice takes 4*64 bytes and a pattern 4*64*4 bytes.
In MO3, the number of rows per pattern is variable (as in XM) and stored in the 'pattern length table'. To rebuild a pattern you have to use the voice seq table. The voice data are stored as described below:

The following is repeated "nb unique voice" times, once per voice data:

0x0000 1 long length in bytes of the voice data, followed by the voice data encoded as a type/value list
(with v1.8 and v2.1, one empty voice is encoded "len=7, 10 f0 f0 f0 f0 30 00" or "len=5, 10 f0 f0 10 00" depending on the pattern length;
with the v2.2 encoder, the empty voice is really empty: only the ending 00)

The first byte of each row entry encodes both the length of the type/value list (in its 4 least significant bits) and the number of times the row is repeated (in its 4 most significant bits).
For example 0x30 means "3 times an empty row", and 0x32 means "this row has 2 type/value items and is repeated 3 times".

Type   Meaning      Value                 Description
1      note         note number           0=C-0, 1=C#0, 2=D-0, 3=D#0, 4=E-0, ..., 0x58=E-7, 0x59=F-7, 0x5a=F#7, 0x5b=G-7.
                                          0xff means "note off, ==", 0xfe means "^^"
2      instrument   instrument number-1   instrument 1 is coded "0".

MOD/MTM effects
Type        Effect   Value                         Description
3           0        effect parameter              arpeggio
4           1        effect parameter              portamento up
5           2        effect parameter              portamento down
6           3        effect parameter              tone portamento
7           4        effect parameter              vibrato
8           5        effect parameter              volume slide + tone portamento
9           6        effect parameter              volume slide + vibrato
0xc         9        effect parameter              set offset
0xd         A        effect parameter              volume slide
0xf         C        effect parameter              set volume
0x10        D        effect parameter              pattern break
0x11        E        effect + effect parameter     extended effect (E0->EF)
0x12        F        effect parameter              set speed

One good example (almost all effects in a great music) is Danny Elfman by Moby

XM effects
Type        Effect   Value                         Description
4           1        effect parameter              portamento up
6           3        effect parameter              tone portamento
7           4        effect parameter              vibrato
0xb         p        effect parameter              set panning. 'p02' is coded '0b 00', 'p62' '0b f0', 'p10' '0b 20'
0xd         A        effect parameter              volume slide
0xf         v        effect parameter              set volume
0x10        D        effect parameter              pattern break
0x11        E        effect parameter              pattern delay
0x12        F        effect parameter              set speed
0x14        c        effect parameter              volume slide up. 'c03' is coded "14 30"
0x15        b        effect parameter              fine volume down
0x16        G        effect parameter              set global volume

IT effects
Type        Effect   Value                         Description
6           G        effect parameter              tone portamento
7           H        effect parameter              vibrato
0xb         X        effect parameter              set panning
0xf         v        effect parameter              set volume
0x22        D        effect parameter              volume slide
0x22        K        effect parameter              volume slide + vibrato
0x28        M        effect parameter              set channel volume
0x30        a        effect parameter              fine volume up
0x30        b        8 + effect parameter          fine volume down
0x30        d        (effect parameter)<<4 + 0xf   volume slide down. 'd01' is '0x30 0x1f'

S3M effects
Type        Effect   Value                         Description
6           G        effect parameter              tone portamento
7           H        effect parameter              vibrato
7 and 0x22  K        effect parameter              vibrato + volume slide (K 01 is coded "07 00 22 01")
0xa         R        effect parameter              tremolo
0xc         O        effect parameter              set offset
0xf         v        effect parameter              set volume
0x10        C        effect parameter              pattern break
0x12        T        effect parameter              set tempo
0x16        V        effect parameter              set global volume
0x21        A        effect parameter              set speed
0x22        D        effect parameter              volume slide
0x23        E        effect parameter              portamento down
0x26        Q        effect parameter              retrigger note
0x2b        S        effect parameter              set high offset

Example (in hexa): 13 01 38 02 0d 0f 20

13 : 1 row of 3 type/value
01 38 : note is G#5
02 0d : instrument is 14
0f 20 : effect : C20

Data for one voice is terminated with 00
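
The row header byte and the 00 terminator are enough to walk a voice without interpreting its effects. The following sketch (hypothetical function name) counts the rows a voice data block expands to. It assumes each type/value item is exactly 2 bytes, as in the example above.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the number of rows described by one voice data block,
   or -1 if the block is truncated or has no 00 terminator. */
long mo3_voice_rows(const uint8_t *p, size_t len) {
    long rows = 0;
    size_t i = 0;
    while (i < len) {
        unsigned head = p[i++];
        unsigned items, repeat;
        if (head == 0)
            return rows;              /* end of this voice */
        items = head & 0x0f;          /* number of type/value items */
        repeat = head >> 4;           /* how many times the row occurs */
        if (len - i < 2u * items)
            return -1;
        i += 2u * items;              /* skip the type/value items */
        rows += (long)repeat;
    }
    return -1;
}
```

Applied to the v1.8/v2.1 empty voice "10 f0 f0 f0 f0 30 00" this gives 1 + 4*15 + 3 = 64 rows, and the example "13 01 38 02 0d 0f 20" followed by 00 gives a single row.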

2.3.3 Instruments data

Instrument data takes 0x33a bytes, after the instrument name (0 terminated)

Address Length Type Description
? char[] instrument name (0 terminated)
0x0000 1 long flags : 1 = play on MIDI, 2 = mute. These are hardly ever used
0x0004 10*12*4 byte sample map : 10 octaves * 12 notes * 4 bytes (1 byte, 1 byte, 1 short = sample number)
0x01e4 1 byte volume envelope : flags
0x01e5 1 byte volume envelope : number of node points
0x01e6 1 byte volume envelope : loop beginning
0x01e7 1 byte volume envelope : loop end
0x01e8 1 byte volume envelope : sustain loop beginning
0x01e9 1 byte volume envelope : sustain loop end
0x01ea 25*2 short volume envelope, 25 nodes : position (short), value 0->64 (short)
0x024e 1 byte panning envelope : flags
0x024f 1 byte panning envelope : number of node points
0x0250 1 byte panning envelope : loop beginning
0x0251 1 byte panning envelope : loop end
0x0252 1 byte panning envelope : sustain loop beginning
0x0253 1 byte panning envelope : sustain loop end
0x0254 25*2 short panning envelope, 25 nodes : position (short), value +32/-32 (short)
0x02b8 1 byte pitch envelope : flags
0x02b9 1 byte pitch envelope : number of node points
0x02ba 1 byte pitch envelope : loop beginning
0x02bb 1 byte pitch envelope : loop end
0x02bc 1 byte pitch envelope : sustain loop beginning
0x02bd 1 byte pitch envelope : sustain loop end
0x02be 25*2 short pitch envelope, 25 nodes : position (short), value +32/-32 (short)
0x0322 1 byte vibrato type (0=sine, 1=Ramp down, 2=square, 3=random)
0x0323 1 byte vibrato sweep
0x0324 1 byte vibrato depth
0x0325 1 byte vibrato rate
0x0326 1 short fade out
0x0328 1 byte midi channel
0x0329 1 byte midi bank
0x032a 1 byte midi patch
0x032b 1 byte midi bend
0x032c 1 byte global volume *2
0x032d 1 short panning
0x032f 1 byte New Note Action [IT]
0x0330 1 byte Pitch Pan Separation [IT]
0x0331 1 byte Pitch Pan Center [IT]
0x0332 1 byte Duplicate Check Type [IT]
0x0333 1 byte Duplicate Check Action [IT]
0x0334 1 short Random Volume variation (%) [IT]
0x0336 1 short Random Panning variation [IT]
0x0338 1 byte Initial Filter Cutoff [IT]
0x0339 1 byte Initial Filter Resonance [IT]

The number of samples used by a given instrument is deduced from the sample map.

2.3.4 Samples data

Samples data takes 0x29 bytes after the sample name, which is 0 terminated.

Address Length Type Description
? char[] sample name (0 terminated)
? char[] sample filename (0 terminated)
Update: regarding the v2.4 encoder, Johannes Schultz pointed out to me: "this 0-byte comes from the fact that v2.4 can also store sample filenames right after sample names, so a double-zero simply means that the sample filename is empty."
0x0000 1 long finetune (0x00 in file = -128, 0x80 = 0, 0x76 = -10, 0x90 = 16) [MOD,MTM, XM]
or "C4/5 speed" for S3M and IT (with Amiga slides), unless linear bit is also set.
0x0004 1 byte transpose
0x0005 1 byte volume (max 64)
0x0006 1 short panning
0x0008 1 long size (in bytes for 8bits, in short for 16bits).
if size==0 and end!=0 : means removed sample (not used)
0x000c 1 long start
0x0010 1 long end
0x0014 1 short flags
bit #0 (0x0001): 1=16bits, 0=8bits
bit #4 (0x0010): 1=loop
bit #5 (0x0020): 1=bi-loop (set together with bit #4, giving 0x0030) [IT]
bit #12 (0x1000): 1=lossy compression (1=mp3 0x1000, together with bit#13 (0x3000) means ogg)
bit #13 (0x2000) : lossless compression 'delta' (with bit#12 cleared)
bit #14 (0x4000) : lossless compression 'delta prediction'
0x0000 means "not compressed" if size!=0
0x0016 1 byte vibrato type (0=sine, 1=ramp down, 2=square, 3=random) [IT]
0x0017 1 byte vibrato sweep [IT]
0x0018 1 byte vibrato depth [IT]
0x0019 1 byte vibrato rate [IT]
0x001a 1 byte global volume [IT]
0x001b 1 long sustain loop start [IT]
0x001f 1 long sustain loop end [IT]
0x0023 1 long compressed size (lossless, mp3 or ogg) 
0x0027 1 short encoder delay
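
The sample flags at 0x0014 can be mapped to a storage type as sketched below. The enum and function names are hypothetical. A size of 0 is reported as "no data", which covers the removed-sample case (size==0 and end!=0).

```c
#include <stdint.h>

typedef enum {
    MO3_SMP_NONE,        /* size == 0: removed or empty sample */
    MO3_SMP_RAW,         /* not compressed */
    MO3_SMP_MP3,         /* bit #12 only */
    MO3_SMP_OGG,         /* bits #12 and #13 */
    MO3_SMP_DELTA,       /* bit #13 with bit #12 cleared */
    MO3_SMP_DELTA_PRED   /* bit #14 */
} Mo3SampleCodec;

Mo3SampleCodec mo3_sample_codec(uint16_t flags, uint32_t size) {
    if (size == 0)
        return MO3_SMP_NONE;
    if ((flags & 0x3000) == 0x3000)
        return MO3_SMP_OGG;
    if (flags & 0x1000)
        return MO3_SMP_MP3;
    if (flags & 0x2000)
        return MO3_SMP_DELTA;
    if (flags & 0x4000)
        return MO3_SMP_DELTA_PRED;
    return MO3_SMP_RAW;
}

/* Decoded length in bytes: `size` counts shorts when bit #0 (16 bits) is set. */
uint32_t mo3_sample_bytes(uint16_t flags, uint32_t size) {
    return (flags & 0x0001) ? size * 2u : size;
}
```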

3. Samples

The lossless sample algorithms are here (mo3_unpack.c).
There is one version for 8-bit samples and another one for 16-bit samples.
There are 2 kinds of algorithm: the first is based on delta encoding (as in XM), the second on delta prediction encoding. In both cases the resulting delta values are then stored in a compact way.

4. Source code

The source code version 0.6 is available here. Here is an extract of the 'readme' file:

"The piece of code has been written as a companion (validation code) of the document "the unofficial MO3 specification".

It is targeted at developers and technical people, not at end users. It can be used by IT/XM/S3M module specialists (developers of tracker editors or module players) to write an MO3 import loader, or more generally to handle MO3 modules in any way.

The MO3 format has been created by Ian Luck (http://www.un4seen.com). If you are looking for a good encoder and decoder (but without the source code) and a good module player, Ian's web site is the right place to go.

The features of unmo3 (opensource version) are:

"

You can see output of the 'demo' here and the auto tests here. For tests, a second archive 'unmo3_test.zip' is required (has to be uncompressed in the same place as the 'unmo3_src.zip' archive).

The Win32 binary is here unmo3.exe.

5. Other information

unmo3 Win32 executable is compressed with PEtite, and Linux executable with UPX.

Other modules file formats are available here (from Wotsit):

Previous work on Amiga module compression is available here (Sylvain Chipaux and Gryzor).

You can find on Exotica a huge collection of Amiga music formats descriptions.

State of the art waveform compression is explained here : TTA, Shorten, AudioPak, and FLAC.

  • DUMB is a free opensource library to replay XM, IT, MOD and S3M modules.
  • XMP is a portable and opensource module player.

    -end of the document-