Discusses the design of Ambisonic decoders, and the speaker positions that should be used via dwChannelMask, which is sent to a soundcard to tell it which channels go to which speakers.
Ambisonic Surround Decoder - Richard Lee 20feb07
This page is for designers of the ASD and assumes a certain facility with Ambisonics, electronics, programming & DSP. The Ambisonic Surround Decoder is a vital component of our Ambisonic Strategy to promote True Ambisonic Surround sound with the Home Theatre public.
It covers some issues affecting ease of implementation and, more importantly, how to ensure the novice trying True Ambisonic Surround for the first time gets the best results. Comments on performance apply mainly to 1st order systems. However, the implementation issues apply to 2nd and 3rd order systems too.
They also apply to generating Nimbus 4.0 from B Format. The final ASD could have an option to generate Nimbus 4.0 files.
It is NOT the programmer's page. A dinosaur C man, I'm incompetent to pontificate over this GUI stuff.
Etienne has registered a sourceforge project at
http://sourceforge.net/projects/ambiprocessor/ and sees the project in
[However, no activity has been recorded in this project.]
To take its full role in the Ambisonic Strategy, the Ambisonic Surround Decoder needs to be a supa dupa Dolby / DTS film player too. This is best achieved by making it part of a popular multi-media DVD player such as VideoLAN.
More than 20 yrs ago, a performance standard for Ambisonic Decoders was established by the Ambisonic team of Fellgett, Barton & Gerzon, which even today, is probably unmatched for its main purpose; the reproduction of music in the home for one listener.
In the intervening years, Ambisonics has been used and developed for many other purposes, especially for large audiences (see ref [7] Malham). Modifications, mainly to do with shelf filters, optimise these applications. Richard Furse's popular Ambiplayer family of decoders favours "controlled opposite" feeds without Shelf filters. He reports
I find the `controlled opposite' equations produce a larger listening area at the expense of some directional information.
Some modern DAW tools like the Waves convolver, CoolEdit Pro (now Adobe Audition) and even Nero 7 use very similar techniques in their panning equations and have a very homogeneous 'Ambisonic sound' to their output. They may pan with the Ambisonic "Controlled Opposites" model using all 'speakers' instead of crude pairwise panning which draws attention to the speakers.
For classic Ambisonic systems with Shelf filters, Bruce Wiggins reports
Four speakers arranged in a square works well, but the sweet area is very small. The same goes for a cube of eight speakers for with-height material. Moving your head even slightly off centre can produce some very 'phasy' effects and the decode does not degrade gracefully as is normally the case with Ambisonics. A realistic minimum of 6 is better for this reason.
My experience is
This effect is only for the centre listener who is moving his head. If you move a bit further away eg the 2 seats to the side or the 3 seats behind centre, the effect disappears so I would say we have "graceful degradation" again. Some of this is simply comb filtering off the exact centre. Angelo Farina convolves a different random phase shift with each speaker at HF to alleviate this. Predictive solutions, eg Hilbert transforms may also work. Six speakers are much less of a problem.
(The Wiggins regular decoders of that period, used minimum phase Shelf filters which may not be optimum. see SHELF FILTERS for Ambisonic Decoders)
But there is still a place for old style Ambisonic Decode with Shelf filters especially in a domestic environment with just 4 speakers. Shelf filters help distinguish Ambisonic systems from the 'new' panning solutions described above. They allow optimisation of both HF & LF and are especially beneficial in small spaces. Without shelf Filters, Dr Geoff Barton reports
... you end up with something which is a bit 'quad' like. The shelf filters help to make the speakers less audible as direct sources.
Classic Ambisonic systems also do other things better than dedicated systems. eg Bill Sommerwerck reports
If you want to hear just how good Ambisonic playback can be with theatrical soundtracks, try the "campfire" scene from "Temple of Doom" with both a Pro Logic decoder and a UHJ decoder. I did this about 20 years ago (1985?) when I was reviewing surround processors for Stereophile. The Dolby decoder was the Shure HTS-5300, which had outstanding decoding action and highly transparent sound.
With Dolby decoding, you hear (more or less) four speakers. With UHJ decoding, the sound is spread all around you -- even outside the front speakers -- without any sense of speakers. You're immersed in the sound -- without losing imaging specificity -- in a way you aren't with "discrete" systems (at least for movie soundtracks).
[Rolv-Karsten Ronningstad's page http://www.geocities.com/ambinutter/UHJ_and_Ambisonic_equations.html had info on how to use a UHJ decoder to play Dolby Surround matrix recordings, but is no longer available on the Internet.]
The ASD must provide at least this level of performance and take advantage of developments like 2nd & 3rd order encoding and the DSP power available on even modest computer systems today.
Classic Ambisonic Decoders are well described in references [1] to [4] which have necessary and sufficient information for design. They cover the following speaker layouts
These 20+ yr. old designs are still the best for any given number of speakers. Easiest to design.
Listener in the centre, equidistant from all speakers. 21st century DSP allows 'equidistant' to be relaxed at the price of much greater complexity but the angular positions of the speakers must be maintained.
Apart from the Rectangle, the common Home Theatre buff is unlikely to have a layout even remotely resembling any of these. Shelf Filters for best results.
The others are Audiophile layouts for the enthusiast who is Lord and Master of his women & chattels. Let's do these properly as they represent "State of the Art". Instructions to the ASD should encourage these.
Practical Periphonic (with height) systems (and also 2nd and 3rd order) are easiest to implement with these.
HEIGHT
Periphonic systems in the foreseeable future are likely to be Diametric Opposite. These include cuboid, bi-rectangular, double hexagon & octagons. Periphonic systems are hungry for system resources. eg the simplest, cuboid or bi-rectangular, use 8 speakers. But being Diametric Opposite layouts, we can use the system I described for a SIMPLE REGULAR HEXAGON in oct05 to derive the Diametrically Opposite speaker feed.
There are two useful methods. For a hexagon :
Method 2, using a single 7 channel soundcard for double hexagons, would be easier and more efficient than getting two 6 channel soundcards to work. A 5.1 (6 channel) soundcard could do a double pentagon Periphonic decoder but the pentagons would point in opposite directions for Diametric Opposition.
Method 1 is described in the Integrex decoder article.
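To illustrate why Diametric Opposite layouts are economical, here is a minimal Python sketch. It is not a reproduction of Method 1 or Method 2. It assumes a regular hexagon with the first speaker at a hypothetical azimuth of 30 degrees, B-format signals W, X (forward) and Y (left), and an arbitrary velocity gain `k`. The function name and parameters are made up for the example.

```python
import math

def hexagon_feeds(w, x, y, k=1.0, first_azimuth_deg=30.0):
    """Return six speaker feeds for a regular hexagon, anticlockwise.

    Only three velocity terms are computed. Each diametrically opposite
    speaker reuses its partner's term with the sign flipped.
    """
    feeds = [0.0] * 6
    for i in range(3):
        az = math.radians(first_azimuth_deg + 60.0 * i)
        v = k * (x * math.cos(az) + y * math.sin(az))  # velocity term for speaker i
        feeds[i] = w + v        # speaker i
        feeds[i + 3] = w - v    # speaker diametrically opposite speaker i
    return feeds

# A source panned straight at the first speaker (hypothetical test values).
az0 = math.radians(30.0)
example = hexagon_feeds(1.0, math.cos(az0), math.sin(az0))
```

With these test values the first speaker gets the largest feed and its opposite partner gets none, while the six feeds always sum to six times W.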
WAVE_FORMAT_EXTENSIBLE is a Microsoft definition for multi-channel soundfiles and soundstreams. A confusing Microsoft document, mulchaud.htm/pdf which may still be at http://www.microsoft.com/whdc/device/audio/default.mspx purports to describe this.
A much better description is by McGill University at http://www-mmsp.ece.mcgill.ca/Documents/AudioFormats/
Windoz programs from W98SE and later are supposed to use this but many are still unaware of it and use the old CANONICAL WAV mentioned in SHELF FILTERS for Ambisonic Decoders or worse, some bastard hybrid.
Our own Richard Dobson of the Composers Desktop Project was instrumental in defining WAVE_X for short and has used it to define our Ambisonic *.AMB format.
MP4 *.AAC may have similar multichannel support in Apple Quicktime from v7.0
This bla is cos WAVE_X defines a large number of formal speaker positions using dwChannelMask. This will be the Windoz standard for speaker / channel assignment and other OS will probably follow suit.
The importance of dwChannelMask is that it is used in audio datastreams as well as file formats. Your computer sends it to your soundcard to tell it which channel goes to which speaker.
[Table: the WAVEFORMATEXTENSIBLE bit masks for speaker positions.] This mask is sent with a DirectSound stream to a DirectSound multichannel device. I include a little sketch as I have problems looking at the above table.
You'll see that hex & octagon in both orientations are supported and also many periphonic arrangements though only 1st order Z. Only FRONT_LEFT_OF_CENTER, FRONT_RIGHT_OF_CENTER, SIDE_LEFT & SIDE_RIGHT haven't got corresponding TOP speakers.
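For programmers who prefer to see the speaker positions as code, the following Python sketch lists the dwChannelMask bits as Microsoft defines them for WAVEFORMATEXTENSIBLE. The helper names `mask_for` and `channel_order` are invented for this example. Channels in a stream or file appear in ascending bit order, which is what `channel_order` returns.

```python
# Speaker position bits for WAVEFORMATEXTENSIBLE.dwChannelMask.
SPEAKER_BITS = {
    "FRONT_LEFT": 0x1,
    "FRONT_RIGHT": 0x2,
    "FRONT_CENTER": 0x4,
    "LOW_FREQUENCY": 0x8,
    "BACK_LEFT": 0x10,
    "BACK_RIGHT": 0x20,
    "FRONT_LEFT_OF_CENTER": 0x40,
    "FRONT_RIGHT_OF_CENTER": 0x80,
    "BACK_CENTER": 0x100,
    "SIDE_LEFT": 0x200,
    "SIDE_RIGHT": 0x400,
    "TOP_CENTER": 0x800,
    "TOP_FRONT_LEFT": 0x1000,
    "TOP_FRONT_CENTER": 0x2000,
    "TOP_FRONT_RIGHT": 0x4000,
    "TOP_BACK_LEFT": 0x8000,
    "TOP_BACK_CENTER": 0x10000,
    "TOP_BACK_RIGHT": 0x20000,
}

def mask_for(names):
    """Combine speaker names into a single dwChannelMask value."""
    mask = 0
    for name in names:
        mask |= SPEAKER_BITS[name]
    return mask

def channel_order(mask):
    """Speaker names present in the mask, lowest bit (first channel) first."""
    ordered = sorted(SPEAKER_BITS, key=SPEAKER_BITS.get)
    return [name for name in ordered if mask & SPEAKER_BITS[name]]

# The usual 5.1 assignment with surrounds on the BACK positions.
MASK_5_1 = mask_for(["FRONT_LEFT", "FRONT_RIGHT", "FRONT_CENTER",
                     "LOW_FREQUENCY", "BACK_LEFT", "BACK_RIGHT"])
```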
We should support these positions as Ambisonics will only conquer the world when we play films better than native Zillion.1 decoders. Windoz will use these speaker/channel assignments and it is likely that other OS will follow suit.
If you don't use dwChannelMask and assign speakers eg in anti-clockwise order, this may be OK for professional soundcards with sockets marked with Channel 1 , 2 .. etc.
But my Audigy 2NX has little diagrams for LF & RF, LB & RB, CF & LFE, and Sides !
So I know your Supa Octagon Decoder is working but haven't a clue what's coming from what !
There is a logical (?) progression on most domestic soundcards.
LF & RF usually on stereo 3.5mm jacks
LB & RB
CF & LFE: these do 5.1
L & R sides: these upgrade 5.1 to 7.1
Hence my plea for walkaround test signals in sursound.
If you use the proper Channel Masks, you only have to tell the punter the LFE channel is CB for Octagon 1 & Hex 1
Non-Ambisonic formats
There's a gotcha when dealing with the Dolby EX and DTS EX standards. For these 7.1 formats, surround L & R are assigned to SIDE_LEFT & SIDE_RIGHT and the extra channels are assigned BACK_LEFT & BACK_RIGHT. dwChannelMask = 3F 06 00 00 expressed as a "string"
But their normal 5.1 formats assign surround L & R to BACK_LEFT & BACK_RIGHT. dwChannelMask = 3F 00 00 00
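The two masks above are written as little-endian byte "strings". A short Python sketch shows how they turn into the usual hex values and where they differ. The helper name `parse_mask` is made up for the example. The four bit values are the standard WAVEFORMATEXTENSIBLE ones.

```python
BACK_LEFT, BACK_RIGHT = 0x10, 0x20
SIDE_LEFT, SIDE_RIGHT = 0x200, 0x400

def parse_mask(byte_string):
    """Convert a little-endian byte string such as '3F 06 00 00' to an int."""
    return int.from_bytes(bytes.fromhex(byte_string), "little")

mask_7_1 = parse_mask("3F 06 00 00")   # the EX style 7.1 mask
mask_5_1 = parse_mask("3F 00 00 00")   # the normal 5.1 mask

# Bits present in one mask but not the other.
difference = mask_7_1 ^ mask_5_1

print(hex(mask_7_1), hex(mask_5_1), hex(difference))
```

The difference is exactly the two SIDE bits. Both masks contain BACK_LEFT and BACK_RIGHT, so what those two channels carry (surrounds in 5.1, the extra channels in 7.1) changes even though the bits do not.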
More reason for walkaround test signals in sursound.
Faulty Microsoft WAVE_X demo files
The McGill site also has a number of WAVE_X test files (from Microsoft) which are faulty. Don't expect your soundcard and software to play these files properly, eg 8_channel_ID.wav might play with LF RF C LB RB all mixed to stereo.
This has dwChannelMask = 3F 00 00 00 which is the same as ITU 5.1. Many soundcards will not handle this properly cos a 5.1 mask with 8 channels is undefined.
Setting dwChannelMask = 3F 06 00 00 corrects it to a common 7.1 which will play on most 7.1 soundcards
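A player or decoder could flag this kind of fault by comparing the channel count with the number of bits set in the mask. The Python sketch below is one simple way to do it. The function name is hypothetical, and a real player might choose to warn rather than refuse to play.

```python
def mask_matches_channels(n_channels, mask):
    """True when the mask names exactly as many speakers as there are channels."""
    return bin(mask).count("1") == n_channels

# The faulty demo file: 8 channels but a 5.1 mask (6 bits set).
faulty = mask_matches_channels(8, 0x3F)

# After the correction suggested above: 8 channels and a 7.1 mask.
corrected = mask_matches_channels(8, 0x63F)
```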
Complain to Microsoft
I always thought Ambisonics would have to ride on the back of Home Theatre as we have trouble persuading the wife to more speakers. Still true but things have improved over the last 20 yrs. Ray Dolby & Hollywood have persuaded her that 5 speakers are a good thing. The next stage is getting the speakers in sensible positions. Eric Benjamin, (our secret insider at Dolby Labs) and Professore Angelo Farina of U of Parma both report on Real World Systems ...
What is really the standard, however, is a rectangle, the video screen at the centre of one of the long sides, the listeners at the centre of the opposite long side (with a reflective wall just behind their heads), 4 loudspeakers at the corners (so the "surround" loudspeakers are actually more or less at the sides of the listeners). And the center channel, of course, is below or above the screen...
Fig 1 : 5.1 Systems shows the Vienna decoders (ref [6]) to be essentially Rectangular layouts with a CF speaker. They bear little resemblance to Real World Systems. The Regular Pentagon is also unlikely, with the listener too far forward. Only the ITU 5.1 layout with speakers at +-30 and +-110 is anywhere near a Real World System. http://dolby.com/consumer/home_entertainment/roomlayout.html has four recommended layouts, some of which are replicated in "Home Theatre for Dummies" by Briere & Hurley. The 5.1 layout is similar to the Real World Systems that Angelo & Eric report, with one important difference. In Real World Systems the rear speakers are not equidistant (wrt LF RF) from the listening position but much closer. LF RF RB LB are nearly square with the listener close to the back boundary. Only for Dolby's corner layout are LB RB anywhere near equidistant. But for most 5.1 systems with LB RB at 110, they are likely to be much too close, too loud and too soon.
If the speakers are in a "square" (1.05 x 1.00 wide) with LF RF at +-30 and LB RB at +-110 (probably the nearest to a universal arrangement), the rears are 5.48dB too loud (assuming small speakers obeying inverse square law) and arrive in 0.532 of the time. And because the distances are so different, we MUST use different and exact LF filters for "Distance Compensation" for front & rear.
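The figures above can be checked with a little trigonometry. This Python sketch assumes the listener sits on the centre line, the speakers are on the side walls of a layout 1.00 wide, and level follows the inverse square law (6dB per halving of distance). The function name is invented for the example.

```python
import math

def distance_to_wall_speaker(azimuth_deg, half_width):
    """Distance from the listener to a side-wall speaker seen at this azimuth."""
    return half_width / math.sin(math.radians(abs(azimuth_deg)))

HALF_WIDTH = 0.5                                      # layout is 1.00 wide
front = distance_to_wall_speaker(30, HALF_WIDTH)      # LF / RF
rear = distance_to_wall_speaker(110, HALF_WIDTH)      # LB / RB

time_ratio = rear / front                             # rear arrival time re front
rear_excess_db = 20 * math.log10(front / rear)        # rear level re front

# Front-to-back depth of the layout: forward reach of the fronts plus
# backward reach of the rears.
depth = (front * math.cos(math.radians(30))
         - rear * math.cos(math.radians(110)))

print(round(time_ratio, 3), round(rear_excess_db, 2), round(depth, 2))
```

Under these assumptions the script reproduces the 0.532 time ratio, the 5.48dB excess and the 1.05 depth quoted above.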
We need to tell the ASD where the speakers are if we expect 5.1 arrangements to give at least AS GOOD results as 4.0 with the listener in the centre. If we don't do this, we are better off sticking with 4.0 and telling the punter to sit in the centre of a square.
The centre listener 4.0 layout can get away with an average "Distance Compensation". As this is the same for all speakers, the error if this distance is wrong is small.
On the "ITU" layout, we can't do this as the two filters are nearly an octave out. And while we could do a general 5.5dB level adjust, we can't do this for the delay.
The computational cost of allowing the listener to sit near the boundary is considerable but justified on the ASD as this WILL ask the punter where the speakers are.
But what all this bla also says is that for G Format ITU 5.0 to give AS GOOD results as Nimbus 4.0, it must either specify or assume LOTs about the final layout. Not only speaker angles but specific distances. And to recover B format from G Format 5.0, you need to take all these into account.
Bruce Wiggins gives one of his Tabu decodes (ref [8]) in his WAD Ambisonic plugin for an ITU 4.0 layout (without CF), speakers equidistant at +-30 & +-110. Available from http://sparg.derby.ac.uk/SPARG/Staff_BW.asp
The coefficients are also in http://www.e-wig.co.uk/sursound/gformat_ac3.jpg and here.
Speaker | W      | X       | Y
L       | 0.5018 | 0.6218  | 0.4406
R       | 0.5018 | 0.6218  | -0.4406
SL      | 0.8392 | -0.3692 | 0.5757
SR      | 0.8392 | -0.3692 | -0.5757
SL'     | 0.4465 | -0.1964 | 0.3063
SR'     | 0.4465 | -0.1964 | -0.3063
SL' & SR' are the new coefficients assuming the layout in Fig 1 Real 5.1 30 & 110 with speakers in a square but the listener near the back.
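A decode matrix like this is applied by taking a weighted sum of W, X and Y for each speaker. The Python sketch below does that for one B-format sample. It also checks an observation from the numbers, which the text does not state explicitly: the SL' and SR' rows look like the SL and SR rows scaled by the rear/front distance ratio of about 0.532 for the +-30 / +-110 square. Function and variable names are made up for the example.

```python
import math

# Coefficients (W, X, Y) copied from the table above.
DECODE = {
    "L":   (0.5018,  0.6218,  0.4406),
    "R":   (0.5018,  0.6218, -0.4406),
    "SL":  (0.8392, -0.3692,  0.5757),
    "SR":  (0.8392, -0.3692, -0.5757),
    "SL'": (0.4465, -0.1964,  0.3063),
    "SR'": (0.4465, -0.1964, -0.3063),
}

def speaker_feeds(w, x, y, speakers):
    """Weighted sum of one B-format sample for each named speaker."""
    return {name: DECODE[name][0] * w + DECODE[name][1] * x + DECODE[name][2] * y
            for name in speakers}

# Rear/front distance ratio for side-wall speakers at 110 and 30 degrees.
ratio = math.sin(math.radians(30)) / math.sin(math.radians(110))

# SL scaled by that ratio, for comparison with the SL' row.
scaled_sl = tuple(g * ratio for g in DECODE["SL"])

# Feeds for the real-world layout (listener near the back), W only.
feeds = speaker_feeds(1.0, 0.0, 0.0, ["L", "R", "SL'", "SR'"])
```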
In addition, if the square is 4m, LF & RF are 4m away but SL' & SR' are 1.872m closer and need to be delayed by 5.4ms.
"Distance Compensation" (ref [3]) should be set to 13.7Hz for L R and 25.7Hz for SL' SR'. Important cos the speaker distances are so different.
The delay and the "Distance Compensation" will change with the size of the square.
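To see how these numbers scale with the size of the square, here is a Python sketch. It assumes a speed of sound of 344 m/s (chosen because it reproduces the figures quoted above) and takes the "Distance Compensation" frequency as c / (2 * pi * d), the usual first-order near-field corner. Check that formula against ref [3] before relying on it. All names are hypothetical.

```python
import math

SPEED_OF_SOUND = 344.0  # m/s, an assumed value

def layout_for_square(side_m):
    """Distances, rear delay and compensation frequencies for a square of this side.

    Assumes speakers on the side walls at +-30 (front) and +-110 (rear)
    as seen from the listener.
    """
    half = side_m / 2.0
    front = half / math.sin(math.radians(30))
    rear = half / math.sin(math.radians(110))
    closer = front - rear
    return {
        "front_m": front,
        "rear_m": rear,
        "closer_m": closer,
        "rear_delay_ms": 1000.0 * closer / SPEED_OF_SOUND,
        "front_comp_hz": SPEED_OF_SOUND / (2 * math.pi * front),
        "rear_comp_hz": SPEED_OF_SOUND / (2 * math.pi * rear),
    }

four_metre = layout_for_square(4.0)
for key, value in four_metre.items():
    print(key, round(value, 3))
```

For a 4m square this gives roughly 1.872m, 5.4ms, 13.7Hz and 25.7Hz. Halving the square halves the delay and doubles both frequencies.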
It's also much more computationally intensive to have Shelf Filters in an ITU 5.0 or 4.0 layout as instead of shelving just XY & Z, they would have to be twiddled for each speaker pair.
Although Bruce is probably the world expert on ITU 5.1 Ambisonic systems, he avoids using CF. Most CF speakers are very different from L R & SL SR, at a different height and draw attention to themselves.
6.1 & 7.1 systems are Audiophile layouts in the same category as Regular Hex and Octal arrangements. If Ray & Hollywood have another 2 decades, they might make this domestically acceptable ... but I doubt it. For the average Home Theatre buff, rear speakers are unlikely to be in the correct place.
LB RB at +-90 are probably more common than +-110 but even Bruce Wiggins' Tabu optimisation would have problems with that. His PhD is probably "State of the Art" for irregular speaker layouts though it raises more questions than it answers.
Shelf filters optimise the ASD to give the best results both at Low Frequencies, where "phase" is thought to be important, and at "High" Frequencies, where "amplitude" is thought important. This is done by changing the ratio of W to the Velocity signals.
Detailed "HOW TO for DSP gurus" in SHELF FILTERS for Ambisonic Decoders.
The ratios are in "Practical Periphony" - M Gerzon AES preprint 1571, 1980
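As a rough guide to the size of the change, the Python sketch below uses the HF gains commonly quoted for classic first-order decoders: W x sqrt(3/2) and XY x sqrt(3)/2 for horizontal layouts, W x sqrt(2) and XYZ x sqrt(2/3) for periphonic ones, all relative to the LF gains. These values are not given on this page and should be checked against the preprint before use. The names are made up for the example.

```python
import math

# (W gain, velocity gain) at HF, relative to LF. Assumed values, see above.
HF_GAINS = {
    "horizontal": (math.sqrt(3.0 / 2.0), math.sqrt(3.0) / 2.0),
    "periphonic": (math.sqrt(2.0), math.sqrt(2.0 / 3.0)),
}

def to_db(gain):
    """Convert a linear amplitude gain to decibels."""
    return 20.0 * math.log10(gain)

def velocity_to_w_change_db(layout):
    """How far the velocity/W ratio drops from LF to HF, in dB."""
    w_gain, v_gain = HF_GAINS[layout]
    return to_db(v_gain / w_gain)

for layout, (w_gain, v_gain) in HF_GAINS.items():
    print(layout, round(to_db(w_gain), 2), round(to_db(v_gain), 2),
          round(velocity_to_w_change_db(layout), 2))
```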
http://www.geocities.com/ambinutter/UHJ_and_Ambisonic_equations.html gives a slightly different set with 0.45dB more HF. [Site no longer available]
Shelf filtering should be encoded into Nimbus 4.0 material to give the best results on non Ambisonic systems. B Format material should be played at home on the ASD with Shelf filters "ON".
Ambisonic Shelf Filters have differing Amplitude for W and Velocity but Phase should match. All hardware decoders, including the Wireless World 1977 Integrex and the Geoff Barton designs (eg Minim and A+D), as well as research prototypes used by the Ambisonics project in its heyday, used first order shelves centred at 380Hz, but had first order all-pass phase response.
380Hz Shelf Filters have been tested. In theory, moving up to 700Hz would improve localisation for central listeners at the expense of a more critical sweet spot.
Today with 21st century digits we have slightly more options.
1) We can make XYZ minimum phase. The advantages are
BUT W would have to be non-minimum phase and non-causal to get phase to match but a different amplitude response.
Hence a FIR and delay for XYZ to match.
2) We can match the old 380Hz all-pass networks with digital IIR equivalents.
3) 'Linear Phase' filters are favoured by Eric Benjamin, our insider at Dolby Labs.
Easiest to design but as FIRs, the computational cost is high
So for the least computing, Type 2) wins. But ..
INVERTIBILITY of Shelf Filters in Nimbus 4.0 recordings
This is mostly irrelevant but may still be interesting to students of the arcane arts. It may be needed for the ASD to play BHJ derived Nimbus 4.0 on non-regular, non-rectangular layouts but otherwise doesn't come into the AMBISONIC STRATEGY at all.
Type 1) Shelf filters
Type 2) Would need FIRs to invert everything as the inverse filters are non-causal. An FIR here is NOT exact as Fons points out but see later
If invertibility is a concern, then Type 1) would be 'exact' and use less computational power in the end. (we can use FIR for Type 2 and get 'exact' invertibility with more FIRs but the price in computing power is high)
I should point out that Bob Stuart's comments about 'lossless reconstruction' apply to ALL DSP operations. Any non-trivial operation adds noise (and distortion if not done properly) and reduces resolution. A level change or addition is non-trivial in this context. Even IIR minimum-phase shelf & invert will add noise.
Additional errors apply to FIRs. A real response has an Infinite Impulse Response and a FIR is only an approximation to it.
But a 1024 pt FIR can get very close though you have to know what you are doing. If you use naive methods of inverting a response you get windowing effects and these are AUDIBLE. The "non est tantum facile" part of dreaming up good FIRs is tweaking it so the errors are somewhere you can't see or won't hear for the application. You have to do this even with 1024 pt FIRs for eg loudspeaker EQ. And FIRs have to be explicitly dithered to sound good.
Why not use an IIR? But there are many functions which IIRs can't do. One of them is non-causal responses and hence we have to use FIRs for at least one channel of the Type 1) filter and all of the Type 2) inverters.
There are also hybrid FIR/IIR filters which stick a long FIR onto the coefficients of a simple IIR and these have advantages and disadvantages of both types.
The filter sets here are meant for 'exact' recovery. I would not expect you to see any remaining phase shift with 1024 pt. And the Type 2) inverting filters will invert the present analogue all-pass shelf filters.
Does the phase shift in the IIR shelf filters come anywhere close to causing audible artifacts?
See "Is Linear Phase Worthwhile" - Lee, AES preprint 1732 Hamburg (1981) and a John V & Stan L paper around the same time "Audibility of Midrange Phase" (I think) As with everything of John & Stan's, once they have dealt with a subject, there is nothing left to say.
Large audiences : Dave Malham (ref [7] and M Gerzon) suggest it is more sensible not to Shelf and just use an Energy optimisation which may be further modified to ensure no out of phase signals from any speakers. eg Richard Furse's "Controlled Opposite" feeds. Bear in mind this only works if you have enough speakers (roughly) evenly distributed round the audience. With only 5 speakers in ITU 5.1, "Controlled Opposite" gives VERY poor results for everyone
BHJ derived Nimbus 4.0 will have different Shelf filters (http://www.geocities.com/ambinutter/UHJ_and_Ambisonic_equations.html [link no longer available]) and also "Forward Preference" (ref [3]). Unfortunately "Forward Preference" depends on speaker layout, so BHJ derived Nimbus 4.0 format will not be optimal if used on a non-diametric opposite / non-regular layout. It is difficult to compensate for this so the best approach may be not to try.
Simply use the derived Shelf filtered pseudo B format signals as they are to feed the new speaker matrix.
Decoding Nimbus 4.0
In fact this might be the best strategy for the ASD to decode Nimbus 4.0 whatever the origin, whether proper B Format or a BHJ master. These should always have appropriate Shelf filters as the criterion is the best results in the home with minimum hassle.
The only casualty would be the user wanting to use the material for a large audience. eg playing a G-format piece as part of the sound effects for a theatre production. For this, Type 2) inversion would be exact for proper B Format and probably close enough for BHJ originals.
The first four are sufficient for designing traditional regular polygon and rectangular speaker array decoders. Read the others for ITU 5.1 and other irregular decoders.
Suggests surround distribution is via DVD-V (AC3 & DTS), DVD-A (as DVD-V plus PCM and MLP) and SACD for disc. WMA, AAC & RealAudio for net files.
[N.B. This project has not taken off. Could you be the one to make it happen?]
4 parts of the Sourceforge Ambiprocessor project Please contact Etienne if you can contribute. See also http://www.ambisonicbootlegs.net/Members/etienne/AmbisonicPlayerSpecification/
The API and API implementation will be bundled as 1 library.... 'libambisonicprocessor.*'
The first 3 bits can be easily incorporated into an existing cross-platform media player like VideoLAN.
Ambisonic Strategy, how we promote True Ambisonic Surround, has a general list of ASD requirements.
Nimbus 4.0 is the format for distributing True Ambisonic Surround recordings to Home Theatre buffs.
http://dream.cs.bath.ac.uk/researchdev/wave-ex/bformat.html describes .AMB our future proof Ambisonic format. Presently an extension of Microsoft's WAVE_X format. Possible FLAC (lossless) compression in future. Ogg want to incorporate B format in their PCM and Vorbis formats but don't have any Ambisonic gurus. Please contact Oliver Oli if you can help.
http://www.ambisonicbootlegs.net/Members/etienne/ambisonic-software [link no longer available] Etienne's page of software Ambisonic Decoders.
Jan Jacob Hofmann attempts to keep an up to date list of Software Decoders on his web site. If you find something outdated or missing, please contact him. He also composes in 2nd order B format and plays it back on the JJ.
Permanent Ambisonic Systems maintained by members of sursound; the Surround Sound forum which has been running since 1995.