We have seen that there are a number of aspects of a sound that we use to determine its direction, so this should tell us what we need to reproduce to enable a listener to get the same experience - right?
Sorry - it's not as simple as that. Let us start by considering two idealised situations:
Neither of these gives us a real solution, so what can we do?
The technique of recording the sound entering the ears, and playing it back, is approximated by "binaural" techniques. The recording may use a dummy head, or even torso, of varying detail, or may use the recordist's own head. The recordist may move while recording, but dummy heads are generally kept stationary. The listener uses headphones.
Remarkable realism can be achieved in this way, but it is limited by the restriction on head movement, the enclosing effect of headphones, and the differences between the tonal effects generated by the recording setup and those generated by the listener's own body. The lucky listener will get a good match and impressive realism, but others - probably the majority - will fail to get a really good impression.
Ambisonics is a technique that describes the three-dimensional sound field at a point, and defines the means to reproduce that sound field at a point within a matrix of loudspeakers. For my purposes, the source of the description of the sound field is a "soundfield microphone", but others may wish to emphasise that synthetic sounds can also be generated in this format.
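To make the idea of a description "at a point" a little more concrete, here is a minimal sketch of how a synthetic mono source might be placed into first-order B-format. It assumes the traditional convention in which W carries a gain of 1/sqrt(2), azimuth is measured anticlockwise from straight ahead, and elevation is measured upwards from the horizontal; the function name is my own invention for illustration and is not taken from any particular library.

```python
import math

def encode_first_order(sample, azimuth_deg, elevation_deg=0.0):
    """Encode one mono sample as first-order B-format (W, X, Y, Z).

    Assumed convention: W scaled by 1/sqrt(2); X points forward,
    Y points left, Z points up.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)                   # omnidirectional component
    x = sample * math.cos(az) * math.cos(el)      # front-back component
    y = sample * math.sin(az) * math.cos(el)      # left-right component
    z = sample * math.sin(el)                     # up-down component
    return w, x, y, z
```

Calling `encode_first_order(1.0, 90.0)` places a unit sample directly to the left, so nearly all of the directional signal ends up in Y. A soundfield microphone arrives at the same four signals acoustically rather than by calculation.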
The problem with this technique is that it is only defined at a point, which is not sufficient for a listener to gain a fully convincing experience. However, the more detailed the description of the sound field at the single point, the more closely the reproduction at surrounding points will approximate the correct value. In real life, the simplest ambisonic technique ("first order") can be remarkably effective, but as with binaural techniques, not all will agree. Higher order versions which use more channels are harder to work with, but they can give a larger "sweet spot"; increasing computer power is enabling them to approach practicality.
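As a rough illustration of the reproduction side at first order, the sketch below derives horizontal loudspeaker feeds by pointing a "virtual cardioid microphone" at each loudspeaker. This is only one simple approach under stated assumptions (horizontal-only layout, W carrying a gain of 1/sqrt(2), azimuth measured anticlockwise from the front); practical decoders are generally more elaborate, and the function name is hypothetical.

```python
import math

def decode_virtual_cardioids(w, x, y, speaker_azimuths_deg):
    """Return one feed per loudspeaker from horizontal B-format (W, X, Y).

    Each feed is a virtual cardioid aimed at the loudspeaker:
    0.5 * (sqrt(2) * W + X * cos(az) + Y * sin(az)).
    """
    feeds = []
    for az_deg in speaker_azimuths_deg:
        az = math.radians(az_deg)
        feed = 0.5 * (math.sqrt(2.0) * w
                      + x * math.cos(az)
                      + y * math.sin(az))
        feeds.append(feed)
    return feeds

# A unit source at 45 degrees, already in B-format, sent to a square layout.
source_az = math.radians(45.0)
b_format = (1.0 / math.sqrt(2.0), math.cos(source_az), math.sin(source_az))
square_feeds = decode_virtual_cardioids(*b_format, [45.0, 135.0, 225.0, 315.0])
```

In this example the loudspeaker in the source direction receives the largest feed and the opposite one receives none, with the two neighbours sharing a reduced level. The breadth of that spread is one way of seeing why a first-order description is only approximate away from the central point.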
Some work has also been done using an ambisonic source to generate binaural signals for headphones. As a refinement, head-tracking has been used, in which the decode from B-format to binaural is rotated appropriately, in real time, as the head is turned.
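The rotation step in head-tracking can be sketched quite simply for first-order B-format, at least for turning the head about the vertical axis. The sketch below assumes X points forward, Y points left, and head yaw is positive when the head turns to the left; the sound field is then rotated by the opposite angle so that sources stay fixed in the room. The function name is hypothetical, and the subsequent decode to binaural is not shown.

```python
import math

def rotate_for_head_yaw(w, x, y, z, head_yaw_deg):
    """Counter-rotate first-order B-format about the vertical axis.

    W (omnidirectional) and Z (vertical) are unaffected by a yaw
    rotation; only X and Y are mixed.
    """
    a = math.radians(-head_yaw_deg)   # rotate the field opposite to the head
    x_rot = x * math.cos(a) - y * math.sin(a)
    y_rot = x * math.sin(a) + y * math.cos(a)
    return w, x_rot, y_rot, z
```

For a source straight ahead (X = 1, Y = 0) and a head turned 90 degrees to the left, the rotated signal has X near zero and Y equal to -1, which places the source to the listener's right, as it should be.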
In real life, the practicalities of introducing loudspeakers into domestic spaces, and the number of channels of audio signal that have been practical to handle, have together determined the dominant means of reproduction in general use. So recording techniques have had to be constructed to make the best use of the available deployed technology.