Channel Formats

An ambisonic recording may be represented in a number of different ways by a set of signals. In general, these sets of signals can be transformed from one to another, theoretically without loss - though some of the transformations are easier to achieve accurately than others.

The discussion on this page is mainly about the different ways of handling first-order ambisonic signals, though higher-order signals do get mentioned.

A-format

In the beginning was the microphone... In this case, anyway, because A-format is the term used for the signals from the four capsules of a tetrahedral soundfield microphone. Because the characteristics of these capsules may vary between different microphone designs, the exact meaning of the A-format signals is not fixed. Each microphone system has a procedure, whether implemented in hardware or software, for converting from A-format to B-format for further processing.
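
As a rough illustration of such a conversion, the sketch below applies the idealised sum-and-difference matrix often quoted for a tetrahedral microphone. The capsule names and their order (front-left-up, front-right-down, back-left-down, back-right-up), the function name `a_to_b`, and the overall gain of 0.5 are assumptions made for illustration. A real microphone system also applies calibration and frequency-dependent correction for the spacing between capsules, none of which is modelled here.

```python
def a_to_b(flu, frd, bld, bru):
    """Idealised A-format to first-order B-format conversion for one sample.

    Assumes coincident, perfectly matched capsules (hypothetical names):
    flu = front-left-up, frd = front-right-down,
    bld = back-left-down, bru = back-right-up.
    """
    gain = 0.5  # assumed overall gain; real systems calibrate this
    w = gain * (flu + frd + bld + bru)  # omnidirectional component
    x = gain * (flu + frd - bld - bru)  # front minus back
    y = gain * (flu - frd + bld - bru)  # left minus right
    z = gain * (flu - frd - bld + bru)  # up minus down
    return w, x, y, z


# Equal signals on all four capsules leave only the omnidirectional component.
print(a_to_b(1.0, 1.0, 1.0, 1.0))  # (2.0, 0.0, 0.0, 0.0)
```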

Although A-format is generally taken to mean the signal set from a tetrahedral microphone, I believe that the intention was that it simply means the physical signals before conversion to B-format, and so also encompasses the signals from a higher-order microphone.

Some researchers have found that certain mathematical tasks are better handled in A-format than in B-format - reverberation has been tried this way, for instance. In this case, a standardised A-format signal set is generated from B-format, rather than being based on microphone signals that need correction in their processing.
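
The idea of processing in a standardised A-format can be sketched as follows: convert B-format to four tetrahedral signals, process each one independently, then convert back. The particular matrix used here (a sum-and-difference matrix scaled by 0.5, which happens to be its own inverse) is only one possible choice of standardised A-format and is an assumption, as are the function names. The per-sample `process` callback is a placeholder; a real reverberator would keep state from one sample to the next.

```python
def tetra_matrix(a, b, c, d):
    """Apply the 4x4 sum-and-difference matrix, scaled by 0.5.

    With this scaling the matrix is its own inverse, so the same function
    maps B-format (W, X, Y, Z) to the assumed A-format and back again.
    """
    return (
        0.5 * (a + b + c + d),
        0.5 * (a + b - c - d),
        0.5 * (a - b + c - d),
        0.5 * (a - b - c + d),
    )


def process_in_a_format(w, x, y, z, process):
    """Convert one B-format sample to A-format, process each channel, convert back."""
    a_channels = tetra_matrix(w, x, y, z)
    processed = [process(sample) for sample in a_channels]
    return tetra_matrix(*processed)


# A placeholder per-channel process (halving the level) stands in for a
# reverberator or other effect.
print(process_in_a_format(1.0, 0.5, -0.25, 0.125, lambda s: 0.5 * s))
```

Because the placeholder process is linear and identical on every channel, the result is simply the input at half level; the approach only becomes interesting when the four channels are treated differently.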

B-format

The basic format that is used for the storage and manipulation of ambisonics is B-format. This format consists essentially of the spherical harmonics of the sound field up to the order being considered. For first-order ambisonics there is one signal of 0th order known as W, and three of 1st order known as X, Y, and Z. It happens that these signals correspond conceptually with the outputs of one omnidirectional microphone and three orthogonal figure-of-eight microphones placed at the same point.
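
The correspondence with one omnidirectional and three figure-of-eight microphones can be made concrete by encoding a single mono source into W, X, Y and Z. The sketch below assumes a common convention (azimuth measured anticlockwise from straight ahead, X pointing forward, Y left, Z up) and applies the square-root-of-two scaling of W discussed further down this page; the function name is hypothetical.

```python
import math


def encode_first_order(sample, azimuth, elevation=0.0):
    """Encode one mono sample into first-order B-format (W, X, Y, Z).

    Assumed convention: angles in radians, azimuth measured anticlockwise
    from straight ahead, X pointing forward, Y left, Z up.
    """
    w = sample / math.sqrt(2.0)                           # omnidirectional, scaled W
    x = sample * math.cos(azimuth) * math.cos(elevation)  # front-back figure-of-eight
    y = sample * math.sin(azimuth) * math.cos(elevation)  # left-right figure-of-eight
    z = sample * math.sin(elevation)                      # up-down figure-of-eight
    return w, x, y, z


# A source straight ahead excites only W and X.
print(encode_first_order(1.0, azimuth=0.0))
```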

With this set of signals, generating speaker signals, rotating the sound field, and various other transformations can all be performed with the simplest possible mathematics, and so it is also the set of signals that is used for both storage and transmission of ambisonic material. (However, note that some researchers have found a standardised A-format to be easier to use for certain purposes - e.g. reverberation.)
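
As an example of how simple these operations are, rotating a first-order sound field about the vertical axis is just a two-dimensional rotation of X and Y, with W and Z left alone. The sketch below assumes X points forward, Y left and Z up, with positive angles turning anticlockwise when viewed from above; the function name is hypothetical.

```python
import math


def rotate_about_vertical(w, x, y, z, angle):
    """Rotate a first-order B-format sample anticlockwise about the vertical axis.

    Assumes X points forward, Y left and Z up, with the angle in radians.
    W and Z are unaffected by a rotation about the vertical axis.
    """
    c, s = math.cos(angle), math.sin(angle)
    return w, x * c - y * s, x * s + y * c, z


# A source straight ahead (all horizontal directional signal in X) moves to
# the left (Y) after a quarter-turn anticlockwise.
w, x, y, z = rotate_about_vertical(0.7, 1.0, 0.0, 0.2, math.pi / 2)
print(round(x, 6), round(y, 6))
```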

Higher-order ambisonics uses additional spherical harmonics. For second and third orders, these have been given additional letter names, but for general consideration of higher orders, other more flexible naming schemes are required - the choice of scheme to be used, and even the order in which the higher-order signals are listed and recorded, is still a matter of debate in the ambisonics community. There is also some dispute over whether the term B-format should be limited to first-order signal sets.

The other complication in B-format is the matter of scaling. W is defined as the 0th order spherical harmonic divided by the square root of two. This was originally done for a practical reason: in a typical sound field, an unscaled W would have a higher level than X, Y and Z, so with recording levels set to accommodate W, the signal-to-noise ratios of X, Y and Z would be reduced, typically by 3 dB. In the days of tape recording, this was considered a serious matter. However, the scaling has created, and continues to create, much confusion - not least because the equations used for various transformations have been presented in different contexts both with and without the scaling factor; in other words, in terms of spherical harmonics, or in terms of B-format. The application of scaling factors to the signals of higher orders is also a matter of debate, with at least three different schemes being considered. Unfortunately, other disciplines which use spherical harmonics have not all chosen the same scheme, so it is not possible to appeal to standard practice.
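
When moving between equations written in terms of spherical harmonics and equations written in terms of B-format, the only first-order difference is this factor on W. The helper names below are hypothetical, and the sketch assumes the unscaled 0th order component has unit gain for a unit source.

```python
import math

W_SCALE = 1.0 / math.sqrt(2.0)  # B-format W relative to the unscaled component


def harmonic_to_b_format_w(sh0):
    """Scale the unscaled zeroth-order component down to B-format W."""
    return sh0 * W_SCALE


def b_format_w_to_harmonic(w):
    """Undo the scaling to recover the unscaled zeroth-order component."""
    return w / W_SCALE


# The scaling expressed in decibels: about 3 dB.
scale_db = 20.0 * math.log10(math.sqrt(2.0))
print(round(scale_db, 2))  # 3.01
```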

C-format

The consumer distribution format. In the early days of ambisonics, it was not practical to consider distributing the 4-channel B-format signals. So an alternative representation of those signals was designed, called UHJ, which provided two channels (L, R) which were stereo compatible, and whose sum was a well-behaved mono signal. L and R were so designed that full horizontal surround could be generated from them, though at reduced resolution. A third signal, T, could be combined with L and R to regenerate the original W, X and Y signals, while a fourth, Q, carried height information from the original Z.

Proposals were made for transmitting various combinations of this signal set (L, R, T, Q) on FM radio, but they were not taken up by the industry, and so as a complete signal set, UHJ died. However, the use of L and R to carry a surround signal (universally referred to as UHJ Stereo, although it originally had the designation BHJ) was taken up enthusiastically by the Nimbus record company, whose entire catalogue (with trivial exceptions) is recorded ambisonically and released in UHJ Stereo. Smaller quantities of such recordings were also released by a number of other record companies.

D-format

When an ambisonic signal set has been decoded into feeds for a particular array of speakers, this signal set is called D-format. However, the term is little used (see G-format, below).
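
A very simple way of producing such speaker feeds for a regular horizontal array is to treat each feed as a virtual cardioid microphone pointed at its speaker. The sketch below does this for a square of four speakers. It is only one basic decoding approach, not a psychoacoustically optimised decoder, and the function name, the angle convention (azimuth anticlockwise from straight ahead) and the 1/N gain normalisation are assumptions.

```python
import math


def decode_horizontal(w, x, y, speaker_azimuths):
    """Decode one horizontal B-format sample to feeds for a regular speaker ring.

    Each feed is a virtual cardioid microphone pointed at its speaker.
    Assumes W carries the 1/sqrt(2) B-format scaling, angles are in radians,
    and azimuth is measured anticlockwise from straight ahead.
    """
    gain = 1.0 / len(speaker_azimuths)  # assumed normalisation
    return [
        gain * (math.sqrt(2.0) * w + x * math.cos(az) + y * math.sin(az))
        for az in speaker_azimuths
    ]


# Square array: front-left, back-left, back-right, front-right.
square = [math.radians(a) for a in (45, 135, 225, 315)]

# A unit source at front-left, encoded with the assumed convention.
source_az = math.radians(45)
w, x, y = 1.0 / math.sqrt(2.0), math.cos(source_az), math.sin(source_az)

feeds = decode_horizontal(w, x, y, square)
print([round(f, 3) for f in feeds])
```

With this normalisation the speaker nearest the source receives the largest feed, the diagonally opposite speaker receives essentially nothing, and the four feeds sum to the original source level.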

E-format

The first part of decoding from C-format (UHJ) may be to recover something like B-format. However (considering the horizontal case), if the T signal is absent, or not of full bandwidth, the recovery is imperfect. Michael Gerzon referred to the recovered signals as E-format; I don't know if anyone else ever does so.

G-format

Once it became practical to transmit four channels or more, the obvious way to distribute ambisonic recordings was to use B-format. However, virtually no one has the necessary decoders to play such recordings (I believe that currently only Meridian sell equipment with that capability). Geoffrey Barton proposed that instead efforts should be concentrated for the time being on distributing speaker feeds, decoded for the currently popular 5.1 speaker layout. Michael Gerzon teasingly referred to this as Geoffrey's format, which became G-format. In practice, the term G-format is now used in place of D-format for any set of speaker feeds.

Nimbus have decoded some of their recordings and sold them on DVD-A to be played through a square of four loudspeakers; however, the failure of the DVD-A disc format to take off has severely limited their appeal and impact.