Windows audio

Revision as of 04:59, 27 April 2013 by UNiversal (Talk)



1 XBMC & the Windows Audio APIs

XBMC supports both DirectSound and WASAPI output modes.

Because WASAPI, as used by XBMC, performs no mixing or resampling, it is the preferred mode for the best audio quality.

XBMC uses WASAPI only in exclusive mode, which gives the application sole access to the audio buffers while it plays audio streams, to the exclusion of all other sounds and players. When using WASAPI, ensure that Windows is configured to allow XBMC to run in exclusive mode; refer to the Configure Windows Sound Settings section below.

In addition, XBMC uses the more modern event-driven mode of WASAPI, so both the audio hardware and the audio driver need to support event mode.

Note: This is a change from the old audio subsystem used in Eden (Version 11) and earlier, in which XBMC operated in push mode.

Because of these two changes in XBMC Frodo (WASAPI now operates only in exclusive mode and only in event mode), a configuration that worked previously may stop working after upgrading to Frodo: Windows may not be configured to allow exclusive mode, or the audio driver or hardware may not support event mode.

2 Check drivers

3 Configure Windows Sound Settings

4 Windows Audio APIs - Background

Since Windows Vista SP1 there have been two primary audio interfaces, DirectSound and WASAPI (Windows Audio Session Application Programming Interface), with WASAPI being a replacement for Windows XP's Kernel Streaming mode.

4.1 DirectSound

DirectSound acts as a program-friendly middle layer between the program and the audio driver, which in turn speaks to the audio hardware. With DirectSound, Windows controls the sample rate, channel layout and other details of the audio stream via an Audio Mixer. Every program using sound passes its data to DirectSound and the Audio Mixer, which then resamples as required so that it can mix audio streams from any program together with system sounds.
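
The resample-then-mix step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `resample_linear` and `mix` are made up for this example, linear interpolation stands in for whatever resampler Windows actually uses, and samples are floats in the range -1.0 to 1.0.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Resample by linear interpolation (a crude stand-in for a real resampler)."""
    if src_rate == dst_rate or len(samples) < 2:
        return list(samples)
    out_len = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(out_len):
        pos = i * src_rate / dst_rate          # position in the source stream
        left = int(pos)
        right = min(left + 1, len(samples) - 1)
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out


def mix(streams, mixer_rate):
    """Resample every (samples, rate) stream to mixer_rate, then sum and clip."""
    resampled = [resample_linear(s, rate, mixer_rate) for s, rate in streams]
    length = max(len(s) for s in resampled)
    mixed = []
    for i in range(length):
        total = sum(s[i] for s in resampled if i < len(s))
        mixed.append(max(-1.0, min(1.0, total)))   # clip to the valid range
    return mixed


music = ([0.0, 0.5, 1.0, 0.5], 44100)   # program audio at 44.1 kHz
beep = ([0.2] * 8, 48000)               # system sound already at the mixer rate
output = mix([music, beep], mixer_rate=48000)
```

Note that none of the music samples after the first reach the output with their original values: they are interpolated, summed with the beep, and in one case clipped.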

The advantages are that programs don't need resampling code or other complexities, and any program can play sounds at the same time as others, or the same time as system sounds, because they are all mixed to one format.

The disadvantages are that other programs can play at the same time, and that a program's output is mixed to whatever the system's settings are. This means the program cannot control the sampling rate, channel count, format, etc. More importantly here, encoded formats cannot be passed through: DirectSound will not decode them, and its processing would otherwise bit-mangle them. There is also a loss of sonic quality involved in the mixing and resampling.
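
To see why a mixing stage is fatal for encoded formats, consider what happens when bytes that are not PCM are treated as PCM samples. The sketch below is illustrative only: `apply_gain` is a hypothetical helper, and `encoded_frame` is a placeholder byte string, not a real DTS or AC3 frame.

```python
import struct


def apply_gain(pcm_bytes, gain):
    """Treat bytes as little-endian signed 16-bit samples and scale them."""
    count = len(pcm_bytes) // 2
    samples = struct.unpack("<%dh" % count, pcm_bytes)
    scaled = [max(-32768, min(32767, int(round(s * gain)))) for s in samples]
    return struct.pack("<%dh" % count, *scaled)


# Placeholder payload standing in for an encoded frame (an assumption for this sketch).
encoded_frame = bytes([0x12, 0x34, 0x56, 0x78, 0x9A, 0xBC, 0xDE, 0xF0])

untouched = apply_gain(encoded_frame, 1.0)   # what a bit-exact path delivers
mangled = apply_gain(encoded_frame, 0.8)     # what a volume or mix stage delivers
```

Only the bit-exact path leaves the payload identical to the input. Any scaling, mixing or resampling changes the bytes, and a receiver can no longer decode them.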


4.2 WASAPI

Partly to allow for cleaner, uncompromised or encoded audio, and to meet low-latency requirements such as mixing and recording, Microsoft revamped its Kernel Streaming mode after XP and came up with WASAPI for Vista.

WASAPI itself has two modes, Shared and Exclusive.

Shared mode is in many ways similar to DirectSound, as it allows other sounds to be mixed into the currently playing stream. However, this mode is not supported by XBMC, so it is not covered any further here.

WASAPI Exclusive mode bypasses the Audio Mixer, and thus the mixing and resampling layers of DirectSound, so audio is passed through as-is. This is why WASAPI should be used for encoded formats like DTS, so that they reach the receiver unchanged for decoding there.

WASAPI Exclusive mode allows the application to interrogate the capabilities of the audio driver. Because the application presents audio directly to the audio driver, it must send that audio in a format compatible with the driver's capabilities, as there is no DirectSound layer in between to convert it. This interrogation is a two-way process that often involves some back-and-forth, depending on the format specified and the device's capabilities. Once the application and the audio driver have agreed upon a set of compatible formats, the application decides how it will present the audio stream to the audio driver.
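
The back-and-forth described above might look roughly like the following sketch. Everything here is hypothetical: `FakeDevice`, `candidate_formats` and `negotiate` are invented for illustration, and the fallback order is an assumption rather than the order XBMC uses. The yes-or-no query loosely mirrors the kind of check that WASAPI exposes as `IAudioClient::IsFormatSupported`.

```python
from collections import namedtuple

AudioFormat = namedtuple("AudioFormat", "sample_rate bit_depth channels")


class FakeDevice:
    """Hypothetical stand-in for an audio driver's capability check."""

    def __init__(self, supported):
        self._supported = set(supported)

    def is_format_supported(self, fmt):
        # No mixer sits in between, so the exact format is accepted or rejected.
        return fmt in self._supported


def candidate_formats(source):
    """Yield the source format first, then progressively simpler fallbacks."""
    seen = set()
    for bit_depth in (source.bit_depth, 24, 16):
        for rate in (source.sample_rate, 48000, 44100):
            for channels in (source.channels, 2):
                fmt = AudioFormat(rate, bit_depth, channels)
                if fmt not in seen:
                    seen.add(fmt)
                    yield fmt


def negotiate(device, source):
    """Return the first candidate the device accepts, or None if none match."""
    for fmt in candidate_formats(source):
        if device.is_format_supported(fmt):
            return fmt
    return None


device = FakeDevice([
    AudioFormat(48000, 16, 2),
    AudioFormat(48000, 24, 2),
    AudioFormat(44100, 16, 2),
])
source = AudioFormat(96000, 24, 6)
chosen = negotiate(device, source)
```

If the chosen format differs from the source format, as it does here, the application itself has to convert the audio before handing it to the driver.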

In addition to Shared and Exclusive modes, there are two modes for how data is passed from the application to the audio driver.

The normal manner is push mode: a buffer is created which the audio device draws from, and the application pushes in as much data as it can to keep that buffer full. To do this it must constantly monitor the level of the buffer, with short "sleeps" in between to allow other threads to run.
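
A small tick-based simulation shows the trade-off in push mode. It is a sketch under simplifying assumptions: `run_push_mode` is a made-up function, time advances in abstract ticks rather than milliseconds, and the device drains a fixed number of frames per tick.

```python
def run_push_mode(total_frames, capacity, drain_per_tick, sleep_ticks):
    """Simulate an application polling a device buffer and topping it up.

    Returns (polls, underruns): how often the application woke up, and how
    often the device found too little data while audio was still pending.
    """
    buffered = 0              # frames waiting in the device buffer
    remaining = total_frames  # frames the application has not pushed yet
    polls = 0
    underruns = 0
    tick = 0
    while remaining > 0 or buffered > 0:
        if tick % sleep_ticks == 0:
            # The application wakes up, checks the level and pushes what fits.
            polls += 1
            pushed = min(capacity - buffered, remaining)
            buffered += pushed
            remaining -= pushed
        # The device drains the buffer while the application sleeps.
        taken = min(drain_per_tick, buffered)
        if taken < drain_per_tick and remaining > 0:
            underruns += 1
        buffered -= taken
        tick += 1
    return polls, underruns


short_sleep = run_push_mode(1920, capacity=480, drain_per_tick=160, sleep_ticks=2)
long_sleep = run_push_mode(1920, capacity=480, drain_per_tick=160, sleep_ticks=4)
```

With the short sleep the application polls more often and the buffer never runs dry. With the long sleep it polls less, but the buffer empties before each refill, which would be heard as dropouts.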

WASAPI, and most modern sound devices, also support a "pull" or "event-driven" mode. In this mode two buffers are used. The application gives the audio driver a call-back address or function, fills one buffer and starts playback, then goes off to do other processing. It can forget about the data stream for a while. Whenever one of the two buffers is empty, the audio driver "calls you back", and gives you the address of the empty buffer. You fill this and go your way again. Between the two buffers there is a ping-pong action: one is in use and draining, the other is full and ready. As soon as the first is emptied the buffers are switched, and you are called upon to fill the empty one. So audio data is being "pulled" from the application by the audio driver, as opposed to "pushed" by the application.
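
The ping-pong action can be sketched as follows. This is a sequential simulation, not real driver code: `FakeDriver` and `make_filler` are hypothetical names, and in a real system one buffer plays while the other is refilled concurrently.

```python
class FakeDriver:
    """Hypothetical driver that plays two buffers alternately."""

    def __init__(self, on_buffer_empty):
        self.buffers = [[], []]
        self.on_buffer_empty = on_buffer_empty   # the application's call-back
        self.played = []

    def run(self):
        active = 0
        while True:
            idle = 1 - active
            if not self.buffers[idle]:
                # Call the application back with the index of the empty buffer.
                self.on_buffer_empty(self, idle)
            if not self.buffers[active]:
                break                            # nothing left to play
            # "Play" the active buffer until it is drained, then switch.
            self.played.extend(self.buffers[active])
            self.buffers[active] = []
            active = idle


def make_filler(frames, chunk):
    """Build a call-back that copies the next chunk into the given buffer."""
    calls = []
    position = 0

    def fill(driver, index):
        nonlocal position
        calls.append(index)
        driver.buffers[index] = frames[position:position + chunk]
        position += chunk

    return fill, calls


frames = list(range(10))
fill, calls = make_filler(frames, chunk=4)
driver = FakeDriver(fill)
fill(driver, 0)      # the application fills one buffer, then starts playback
driver.run()
```

After the initial fill, the application never checks the buffer level itself. Every later fill happens because the driver asked for it, alternating between buffer 1 and buffer 0.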
