Recording Direction

What to do?

We have seen that we use a number of aspects of a sound to determine its direction, so this should tell us what we need to reproduce to give a listener the same experience - right?

Sorry - it's not as simple as that. Let us start by considering two idealised situations:

We record the sound entering each ear canal, and play it back into the ear canals of the listener. At first this might seem ideal; but it has some obvious flaws:
- There is no way to introduce the effect of head movement.
- It has been found that pinna effects are quite strongly individualised, as we all have slightly different pinnae and have learnt the effects of our own.

We record the "sound field" within a region large enough to enclose the head, and then reproduce this around the head of the listener - head movement is now allowed (within the defined region) and timing, level and tonal effects should be correctly generated by the listener's own head. However, this again is not practical:
- Collecting sufficient information about the sound field within a finite volume (as opposed to at a point) is not physically practical.
- We have no means to reproduce the sound field.
- We have not taken account of the interference to the sound field by the listener's head.
- We have not taken account of the ambience of the reproducing venue, which will be part of what is heard even though it is not part of what we wish to reproduce.

Neither of these gives us a real solution, so what can we do?

Approaching the ideal - Binaural

The technique of recording the sound entering the ears, and playing it back, is approximated by "binaural" techniques. The recording may use a dummy head, or even torso, of varying detail, or may use the recordist's own head. The recordist may move while recording, but dummy heads are generally kept stationary. The listener uses headphones.

Remarkable realism can be achieved in this way; but it is limited by the restriction on head movement, the enclosing effect of headphones, and the mismatch between the tonal effects generated by the recording setup and those generated by the listener's own body. The lucky listener will get a good match and impressive realism, but others - probably the majority - will fail to get a really good impression.

Approaching the ideal - Ambisonics

Ambisonics is a technique that describes the three-dimensional sound field at a point, and defines the means to reproduce that sound field at a point within a matrix of loudspeakers. For my purposes, the source of the description of the sound field is a "soundfield microphone", but others may wish to emphasise that synthetic sounds can also be generated in this format.
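
To make "describing the sound field at a point" a little more concrete, here is a minimal sketch of placing a single mono sample into first-order B-format, the four-channel signal (W, X, Y, Z) derived from a soundfield microphone. The function name and the angle convention (azimuth anticlockwise from straight ahead, elevation upwards from horizontal) are assumptions made for this example, and W is scaled by 1/sqrt(2) as in the traditional convention; other normalisations are in use.

```python
import math

def encode_b_format(sample, azimuth_deg, elevation_deg=0.0):
    """Encode one mono sample from a given direction as (W, X, Y, Z).

    Assumed convention (not taken from the text above): azimuth is
    measured anticlockwise from straight ahead, elevation upwards from
    the horizontal, and W carries a 1/sqrt(2) scaling.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)                    # omnidirectional pressure
    x = sample * math.cos(az) * math.cos(el)       # front-back component
    y = sample * math.sin(az) * math.cos(el)       # left-right component
    z = sample * math.sin(el)                      # up-down component
    return (w, x, y, z)

# A sample arriving from 30 degrees to the left, in the horizontal plane
w, x, y, z = encode_b_format(1.0, 30.0)
```

W is the same whatever the direction, while X, Y and Z behave like three figure-of-eight microphones pointing forwards, leftwards and upwards.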

The problem with this technique is that it is only defined at a point, which is not sufficient for a listener to gain a fully convincing experience. However, the more detailed the description of the sound field at the single point, the more closely the reproduction at surrounding points will approximate the correct value. In real life, the simplest ambisonic technique ("first order") can be remarkably effective, but as with binaural techniques, not all will agree. Higher order versions which use more channels are harder to work with, but they can give a larger "sweet spot"; increasing computer power is enabling them to approach practicality.
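
As a rough illustration of reproduction at a point, the sketch below derives loudspeaker feeds from the horizontal part of a first-order signal (W, X, Y) for a regular ring of equally spaced loudspeakers. It is one simple form of decode, not the only one: it assumes at least three (preferably four or more) loudspeakers at equal angles, W scaled by 1/sqrt(2), and azimuths measured anticlockwise from straight ahead. The function name is hypothetical, and practical decoders are considerably more refined.

```python
import math

def decode_horizontal(w, x, y, speaker_azimuths_deg):
    """Return one feed per loudspeaker for a regular horizontal ring.

    Assumes equally spaced loudspeakers and W scaled by 1/sqrt(2).
    The feeds are chosen so that pressure and velocity at the centre
    of the ring match those described by W, X and Y.
    """
    n = len(speaker_azimuths_deg)
    feeds = []
    for az_deg in speaker_azimuths_deg:
        az = math.radians(az_deg)
        directional = x * math.cos(az) + y * math.sin(az)
        feeds.append((math.sqrt(2.0) * w + 2.0 * directional) / n)
    return feeds

# A unit source at 45 degrees left, decoded to a square of loudspeakers
source_az = math.radians(45.0)
feeds = decode_horizontal(1.0 / math.sqrt(2.0),
                          math.cos(source_az),
                          math.sin(source_az),
                          [45.0, 135.0, 225.0, 315.0])
```

Even with the source exactly in the direction of one loudspeaker, the others still contribute, and the opposite one is fed in inverted polarity. The result is only correct at the centre, which is one way of seeing why the "sweet spot" is small at first order.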

Some work has also been done using an ambisonic source to generate binaural signals for headphones. As a refinement, head-tracking has been used, in which the decode from B-format (the standard first-order ambisonic signal format) to binaural is rotated appropriately, in real time, as the head is turned.
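
The rotation involved in head-tracking is simple at first order. The sketch below handles only turning the head left or right (yaw), and assumes the convention that X points forwards, Y points left, and a positive angle means the head has turned to the left; the function name is hypothetical. When the head turns one way, the sound field must be turned the other way relative to it.

```python
import math

def rotate_for_head_yaw(w, x, y, z, head_yaw_deg):
    """Rotate a first-order signal to compensate for the head turning.

    Assumed convention: X forwards, Y left, positive yaw = head turned
    to the left. W (pressure) and Z (height) are unaffected by yaw.
    """
    yaw = math.radians(head_yaw_deg)
    x_rot = x * math.cos(yaw) + y * math.sin(yaw)
    y_rot = -x * math.sin(yaw) + y * math.cos(yaw)
    return (w, x_rot, y_rot, z)

# A source straight ahead (X = 1, Y = 0); the head then turns 90 degrees left,
# so relative to the head the source should now lie directly to the right.
w, x, y, z = rotate_for_head_yaw(1.0 / math.sqrt(2.0), 1.0, 0.0, 0.0, 90.0)
```

The rotated signal is then passed to the binaural decode as usual, so the decode itself need not change as the head moves.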

Compromise - Mono, Stereo and Surround

In real life, the practicalities of introducing loudspeakers into domestic spaces, and the number of channels of audio signal that have been practical to handle, have together determined the dominant means of reproduction in general use. So recording techniques have had to be constructed to make the best use of the available deployed technology.

Mono is the use of a single channel, generally, though not always, reproduced from a single loudspeaker. Because the ear is missing useful cues relating to direction, the sound can be confused, and so techniques of mixing the sound from different microphones were developed to help improve clarity.
Stereo is the use of two channels, reproduced from two spaced speakers in front of the listener. Many different microphone arrangements are used, often trying to record cues that will then be confused on reproduction, because each speaker is heard by both ears. Because stereo is still dominant, some stereo recording techniques are described in more detail.
Surround is the use of more channels (typically six) to add generalised ambience to a stereo signal. Only rarely is there any attempt to reproduce a true ambience rather than merely hinting at it, and there is no attempt to go outside the horizontal plane. Surround techniques are at present dominated by the supposed requirements of film theatre rather than sound reproduction for its own sake; in case this seems dismissive, it is worth remembering that this was also the motivation for Blumlein's early experiments in stereo.