1. Introduction
Almost everything we do in today’s wireless world involves audio. Listening to music, watching videos, dictating messages, and talking with friends and coworkers all require audio, which is, in many cases, transmitted through a wireless headset or speakers via Bluetooth. Additionally, many wireless medical devices require high-quality audio. According to the World Health Organization, nearly 2.5 billion people worldwide are projected to have some degree of hearing loss by 2050, and at least 700 million will require hearing rehabilitation. Hearing aids can potentially see vast improvements due to advancements in wireless audio transmission technology.
Bluetooth is the de facto method of wirelessly connecting audio peripherals to devices such as laptops, mobile phones, and tablets. The versions of Bluetooth that predate Bluetooth Low Energy (Bluetooth LE) are now known as Bluetooth Classic. Compared with Bluetooth LE, Bluetooth Classic can reach a higher throughput because it keeps its radio active more often and for longer. Bluetooth LE, on the other hand, uses its radio for the shortest possible time in order to conserve power.
With the introduction of Bluetooth LE Audio, manufacturers can now take advantage of the more advanced capabilities of Bluetooth LE to create the next generation of audio devices. In this learning module, you’ll find out more about Bluetooth LE Audio, its improvements over Classic Audio, and how various applications can take advantage of its features.
2. Objectives
Upon completion of this module, you will be able to:
- Describe Bluetooth LE Audio
- Understand the new features in LE Audio
- Explain the differences between Classic Audio and LE Audio
- Discuss how the features of LE Audio can benefit various audio applications
3. Basic Concepts
Wireless audio devices must support two basic functions: playing audio and telephony. For playing music and other audio tracks, two-channel stereo audio must be supported. During a call, the simultaneous transmission and reception of audio data must be supported. Additionally, many speakers and headsets now support voice assistant functionality, one of the primary functions of which is responding to wake words. Finally, multiple connections must be possible, as many users connect their headsets to more than one device, for example, connecting to both a computer and a cell phone.
One of the limitations in Bluetooth Classic Audio is the inability to send synchronized audio from one point to multiple points. Manufacturers have gotten around this limitation by clever engineering. For example, untethered earbuds are technically two devices; however, two connections are not supported by the Classic Audio standard. Manufacturers were able to solve this problem by transmitting to one earbud via Bluetooth, and using a proprietary connection to connect to the other earbud.
What are Bluetooth Profiles?
Bluetooth uses profiles to support different types of functionality. A profile is a set of specifications that builds on the basic Bluetooth standard to define communication protocols and the type of data being transferred. For two devices to work together, both must support the same profiles. Different tasks use different profiles; for example, connecting headphones uses a different profile than transferring files.
There are three profiles in Classic Audio:
- A2DP (Advanced Audio Distribution Profile) – handles stereo multimedia audio streaming from one device to another
- HFP (Hands-free Profile) – provides two-way audio (at a lower quality) for hands-free calls and other functions
- AVRCP (Audio/Video Remote Control Profile) – provides audio remote control functionality (play/pause and volume)
For most applications, the profiles in Classic Audio work well; however, in many cases, they need to be combined to deliver the full functionality of a device. As an example, a headset is used for both listening to music and phone calls, requiring the A2DP and HFP profiles (in addition to AVRCP if the headset has audio remote control). After installation, many headsets will appear as two devices: a hands-free and a stereo headset. If a user is listening to music, the A2DP profile is used, but if a call comes in, the device has to switch to the HFP profile with its lower quality audio.
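The profile-matching rule described above can be sketched in a few lines of Python. This is only an illustration: the activity-to-profile mapping and the device names are hypothetical, chosen to mirror the headset example, and are not taken from any Bluetooth specification.

```python
# Hypothetical mapping from a user activity to the Classic Audio
# profiles it relies on (illustrative, not from a specification).
ACTIVITY_PROFILES = {
    "music": {"A2DP"},
    "call": {"HFP"},
    "remote_control": {"AVRCP"},
}


def shared_profiles(device_a, device_b):
    """Profiles that both devices support; empty means no interoperability."""
    return set(device_a) & set(device_b)


def can_do(activity, device_a, device_b):
    """True when every profile the activity needs is supported on both sides."""
    return ACTIVITY_PROFILES[activity] <= shared_profiles(device_a, device_b)


phone = {"A2DP", "HFP", "AVRCP"}
headset = {"A2DP", "HFP", "AVRCP"}
music_only_speaker = {"A2DP"}

print(can_do("call", phone, headset))             # True
print(can_do("call", phone, music_only_speaker))  # False
```

The second result shows why a device that only implements A2DP cannot take a call: the two sides share no telephony profile.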
Bluetooth LE Audio is the newest generation of Bluetooth audio, providing solutions to many of the limitations that Classic Audio has. Many new features are built into LE Audio, including:
- Multi-stream audio support – allows multiple audio streams to be sent between an audio source device, such as a smartphone, and multiple audio sink devices, such as wireless headsets. Audio sink devices can also connect to multiple audio source devices, such as laptops and smartphones.
- Broadcast Audio (Auracast) and Audio Sharing – Broadcast Audio enables an audio source device to broadcast one or more audio streams simultaneously to an unlimited number of audio sink devices. Streams can be open or closed; closed streams require a passcode for access. Personal and location-based audio sharing is also supported.
- LC3 (Low Complexity Communication Codec) – a high quality, low power audio codec that compresses audio data for wireless transmission. Compared to SBC (Subband Codec, currently used in Classic Audio), LC3 is capable of better audio quality using less data.
These new features enable new functionality in many audio devices. Hearing aids, in particular, will see numerous improvements. LE Audio creates a common standard that enables hearing aids to have low power and high quality digital streaming, including the ability to receive audio streams based on location.
Analysis
Bluetooth LE Audio differs from Classic Audio in that the audio data plane is now separated from the control plane. The control plane carries instructions for specific applications; control data might include commands such as answering or hanging up a call, or starting or stopping a song. The actual audio data now has its own path: audio to be transmitted is compressed with the LC3 codec, which is new in LE Audio, and received audio is decompressed with the same codec. Transmission of the compressed audio data takes place over Isochronous Channels.
Figure 1: Bluetooth LE Audio architecture
Generic Audio Framework
New to LE Audio, the Generic Audio Framework (GAF) is a set of specifications that form a middleware containing features that are common to many audio applications. Many audio devices can implement full functionality with just the definitions in the GAF. The individual specifications define a base level of interoperability, allowing two Bluetooth LE Audio devices to transfer audio to each other. Top level specifications build on the specifications in the GAF and add features for specific audio applications.
The Basic Audio Profile (BAP) is used to manage unicast and broadcast audio streams. The BAP is typically implemented on the audio source side, and works with three services:
- PACS (Published Audio Capabilities Service) – determines the capabilities of a device. This service is used to declare supported audio configurations and audio context; for example, determining if the audio is a ringtone, music, or conversation.
- ASCS (Audio Stream Control Service) – manages information at audio stream endpoints, such as determining if the device is a sink or a source. For example, an earbud is in general a sink, but earbuds with microphones would also be considered sources. Additionally, the ASCS determines what state the device is in (whether it is streaming or not).
- BASS (Broadcast Audio Scan Service) – manages the process of discovering and connecting to broadcast audio streams, as well as distributing broadcast encryption keys.
The specifications in BAP can be used to develop an LE Audio product; for example, a unicast application (one source and one sink) uses BAP, ASCS, and PACS, and a broadcast application uses BAP, PACS, and BASS. One of the disadvantages of Classic Audio is the incompatibility between two devices when they have no common audio profile. With LE Audio, even if two devices have different top level profiles, they will still be capable of setting up an audio stream because both are compatible with BAP.
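The unicast and broadcast combinations listed above can be captured as a small lookup table. The sketch below is illustrative only; the function name and the example product are hypothetical.

```python
# Specifications needed per use case, as listed in this module.
REQUIRED_SPECS = {
    "unicast": {"BAP", "ASCS", "PACS"},
    "broadcast": {"BAP", "PACS", "BASS"},
}


def missing_specs(use_case, implemented):
    """Return the specifications a product still lacks for the use case."""
    return REQUIRED_SPECS[use_case] - set(implemented)


# Hypothetical product that was designed for unicast only.
earbud_stack = {"BAP", "ASCS", "PACS"}

print(missing_specs("unicast", earbud_stack))    # set()
print(missing_specs("broadcast", earbud_stack))  # {'BASS'}
```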
The specifications in the Rendering and Capture Control area of Figure 2 define what happens after an audio stream is set up. This includes volume control, control of multiple audio streams, and the pick-up of microphones. Managing gain on the audio sink device is defined by the Volume Control Profile (VCP), and the state of the gain is defined in the Volume Control Service (VCS). With so many connected devices and the potential for multiple audio streams, volume control is a complex topic. The Volume Offset Control Service (VOCS) acts as a balance control for adjusting the volume of multiple devices relative to one another. The Audio Input Control Service (AICS) allows multiple audio streams to be mixed together and rendered for playback.
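To make the balance idea concrete, the sketch below combines one shared volume setting with a per-device offset. It assumes an 8-bit volume scale of 0 to 255 and a simple additive, clamped offset; neither the scale nor the arithmetic is taken from this module, and a real product may combine the two values differently.

```python
VOLUME_MIN, VOLUME_MAX = 0, 255  # assumed 8-bit volume scale


def rendered_volume(volume_setting, offset):
    """Apply a per-device offset to the shared volume and clamp the result."""
    return max(VOLUME_MIN, min(VOLUME_MAX, volume_setting + offset))


shared_volume = 128                      # one setting for the whole set
offsets = {"left": -10, "right": 10}     # hypothetical balance adjustment

levels = {side: rendered_volume(shared_volume, off)
          for side, off in offsets.items()}
print(levels)  # {'left': 118, 'right': 138}
```

Changing `shared_volume` moves both earbuds together, while the offsets preserve the balance between them.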
The Content Control section in Figure 2 contains the specifications that define starting, stopping, answering, pausing, and selecting audio streams, the functionality necessary for the control of streaming audio. While they were embedded into HFP and AVRCP in Classic Audio, in LE Audio they are separated into two types: control for telephony and control for all other types of media. The Media Control Service (MCS) provides all the functions that are typical in content players today, as well as higher level functions, such as searching for tracks, modifying playing order, setting up groups, and adjusting playback speed. Telephony is handled by the Telephone Bearer Service (TBS), which supports multiple calls, call joining, caller ID, ringtone selection, and exposing call information.
The Transition and Coordination Control group in Figure 2 serves to tie the other specifications together. In cases where two or more Bluetooth LE Audio devices need to be used together, such as a left and right earbud, they are called a Coordinated Set. Members of a Coordinated Set always react together. The Coordinated Set Identification Profile (CSIP) and the Coordinated Set Identification Service (CSIS) work in conjunction to manage Coordinated Sets.
The Common Audio Profile (CAP) is the specification for starting, stopping, and updating unicast and broadcast audio streams. CAP is also able to use CSIS and CSIP in order to group devices.
Top Level Profiles are shown at the top of Figure 1. Top Level Profiles provide additional requirements for specific audio use cases. The Hearing Access Profile (HAP) and Hearing Access Service (HAS) define functionality for applications in the hearing aid ecosystem. The Telephony and Media Audio Profile (TMAP) specifies the use of high quality codec settings and more complex media and telephony control. The Public Broadcast Profile (PBP) standardizes the broadcast feature to improve interoperability.
Isochronous Channels
One of the key features introduced in the Bluetooth Core 5.2 specification, and the backbone of LE Audio, is LE Isochronous Channels (ISOC). “Isochronous” means “occurring at regular, equal time intervals”. In Bluetooth, it refers to data that is time-bound and needs synchronized processing, such as the data in an audio stream. Audio data is essentially a stream of packets that must be rendered at specific times, and each packet is only valid for a certain amount of time. If the audio data doesn’t arrive in time, there will be a gap in the buffer at the receiving end.
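The idea of time-bound data can be illustrated with a short sketch that checks each packet against its render deadline. The arrival times, the 10 ms spacing, and the function name are hypothetical values chosen for the example.

```python
def find_gaps(arrival_ms, first_deadline_ms, interval_ms):
    """Return indexes of packets that miss their render deadline.

    arrival_ms holds one arrival time per packet, or None if the
    packet was never received.
    """
    gaps = []
    for index, arrived in enumerate(arrival_ms):
        deadline = first_deadline_ms + index * interval_ms
        if arrived is None or arrived > deadline:
            gaps.append(index)
    return gaps


# Deadlines fall at 10, 20, 30, 40, and 50 ms.
arrivals = [8, 19, 33, None, 47]
print(find_gaps(arrivals, first_deadline_ms=10, interval_ms=10))  # [2, 3]
```

Packet 2 arrives 3 ms late and packet 3 never arrives, so both leave a gap in the receive buffer even though packet 2 was eventually delivered.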
ISOC supports both connection-oriented and connectionless (broadcast) communication. With connection-oriented communication, each stream is termed a Connected Isochronous Stream (CIS). CISes that are synchronized, such as a left and right stereo stream, are set up as a single group called a Connected Isochronous Group (CIG). CIGs support bidirectional data transfer. For connectionless communication, or broadcasts, data is streamed in a synchronized manner from a single source to multiple sinks. Each stream is called a Broadcast Isochronous Stream (BIS). A group of BISes is called a Broadcast Isochronous Group (BIG).
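The stream and group terminology can be summarized as a tiny data model. This is a naming aid only, with hypothetical class names; it does not represent any real Bluetooth stack API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IsoStream:
    name: str  # for example "left" or "right"


@dataclass
class IsoGroup:
    connected: bool  # True: CIG made of CISes, False: BIG made of BISes
    streams: List[IsoStream] = field(default_factory=list)

    @property
    def kind(self) -> str:
        return "CIG" if self.connected else "BIG"

    @property
    def stream_kind(self) -> str:
        return "CIS" if self.connected else "BIS"

    def describe(self) -> str:
        names = ", ".join(s.name for s in self.streams)
        return f"{self.kind} with {len(self.streams)} {self.stream_kind} stream(s): {names}"


earbuds = IsoGroup(connected=True, streams=[IsoStream("left"), IsoStream("right")])
print(earbuds.describe())  # CIG with 2 CIS stream(s): left, right
```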
The ISO Interval is the interval at which isochronous events occur, where each event is split into multiple subevents. The ISO Interval can range from 5 milliseconds to 4 seconds. For connection-oriented communication, the follower (the Peripheral in Bluetooth terminology) answers each data packet from the controller (the Central) with a packet of its own. With connectionless communication, only the broadcasting device sends packets.
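As a worked example of the range above, the sketch below converts an ISO Interval in milliseconds into a whole number of 1.25 ms units. The 1.25 ms unit is an assumption brought in from the Bluetooth Core Specification rather than from this module, and the function name is hypothetical.

```python
ISO_INTERVAL_UNIT_MS = 1.25    # assumed unit size (not stated in this module)
ISO_INTERVAL_MIN_MS = 5.0      # lower bound given in the text
ISO_INTERVAL_MAX_MS = 4000.0   # upper bound given in the text (4 seconds)


def iso_interval_units(interval_ms):
    """Convert an ISO Interval in milliseconds to 1.25 ms units."""
    if not ISO_INTERVAL_MIN_MS <= interval_ms <= ISO_INTERVAL_MAX_MS:
        raise ValueError("ISO Interval must be between 5 ms and 4 s")
    units, remainder = divmod(interval_ms, ISO_INTERVAL_UNIT_MS)
    if remainder:
        raise ValueError("ISO Interval must be a multiple of 1.25 ms")
    return int(units)


print(iso_interval_units(10))    # 8
print(iso_interval_units(7.5))   # 6
print(iso_interval_units(4000))  # 3200
```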
ISOC is capable of supporting data retransmission. With connection-oriented communication, data is retransmitted when the follower does not respond. With connectionless communication, the controller sends retransmissions without consideration of the follower’s action or inaction.
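The two retransmission behaviors can be contrasted with a simplified model. The function names, the attempt limit, and the response pattern are hypothetical; real link-layer scheduling involves subevents and timing that this sketch leaves out.

```python
def connected_transmissions(follower_responds, max_attempts):
    """Connection-oriented: resend until the follower responds or the
    attempt limit is reached. Returns the number of packets sent."""
    sent = 0
    for responded in follower_responds[:max_attempts]:
        sent += 1
        if responded:
            break
    return sent


def broadcast_transmissions(repetitions):
    """Connectionless: the sender has no feedback, so it always sends
    the same fixed number of copies."""
    return repetitions


# The follower misses the first two packets and answers the third.
print(connected_transmissions([False, False, True, True], max_attempts=4))  # 3
# A follower that never answers stops the sender at the limit.
print(connected_transmissions([False] * 6, max_attempts=4))                 # 4
print(broadcast_transmissions(3))                                           # 3
```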
LC3 Codec
Bluetooth LE Audio features a new low-latency codec, LC3 (Low Complexity Communication Codec); LC3plus, a higher quality codec, is also available for licensing. Both codecs were developed by the Fraunhofer Institute. LC3 and LC3plus are significant updates to SBC and aptX, codecs used in Classic Audio whose designs date back to the 1980s and 1990s. LC3 supports sampling rates of up to 48 kHz, while LC3plus can reach 96 kHz, both at bit depths of 16, 24, or 32 bits per audio sample. Both use much less energy than current Bluetooth codecs. As illustrated in Figure 3, subjective listening tests conducted by the Bluetooth SIG and the Fraunhofer Institute showed LC3 to be nearly indistinguishable from the uncompressed reference audio. LE Audio also supports alternative codecs; SBC and aptX can still be used, as well as proprietary codecs from various manufacturers.
Figure 3: Subjective audio quality listening test results. Source: Fraunhofer Institute
What is a codec?
A codec reduces the size of data, enabling it to be transferred more easily (or to take up less space in storage). An audio codec compresses the audio data when it is transmitted or saved, and decompresses it for playback. Along with reducing the size of the data, a good audio codec maintains sound quality and keeps the computational load low.
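A quick calculation shows how much data a codec has to remove. The sketch below computes the bitrate of uncompressed PCM audio and the compression ratio for a given encoded bitrate. The 192 kbps encoded figure is a hypothetical value chosen for illustration, not a rate quoted in this module.

```python
def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample, channels):
    """Bitrate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000


def compression_ratio(raw_kbps, encoded_kbps):
    """How many times smaller the encoded stream is than the raw stream."""
    return raw_kbps / encoded_kbps


raw = pcm_bitrate_kbps(48_000, 16, 2)  # 48 kHz, 16-bit, stereo
print(raw)                             # 1536.0
print(compression_ratio(raw, 192))     # 8.0 (192 kbps is an assumed encoded rate)
```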
Figure 4: LC3 codec block diagram. Source: soundonsound.com
LC3 and LC3plus are frame-based codecs, meaning that they analyze the audio in small sections (7.5 or 10 milliseconds, with 2.5 ms also available for LC3plus) and calculate a method of compression for each individual section. At the start of the algorithm, the uncompressed audio is processed by the Low Delay Modified Discrete Cosine Transform (LD-MDCT), which converts the signal into a time-frequency representation. The resulting signal is passed on to two noise shaping tools: the Spectral Noise Shaper (SNS), which shapes the quantization noise so that it is minimally perceived by the human ear, and the Temporal Noise Shaping module (TNS), which reduces pre-echo artifacts on signals with sharp transients. After noise shaping, a Spectral Quantizer maps the spectrum onto a finite number of levels, estimating the number of bits required to encode it. Any artifacts generated during quantization are reduced with the Noise Level/Filling algorithm, which uses a pseudo-random noise generator to mask holes in the spectrum, and a Long Term Post Filter, which filters out any leftover coding noise.
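Because the codec works frame by frame, the size of each frame follows directly from the sampling rate, the frame duration, and the target bitrate. The sketch below shows the arithmetic; the 96 kbps bitrate is a hypothetical example value, and the function names are illustrative.

```python
def samples_per_frame(sample_rate_hz, frame_ms):
    """Number of PCM samples per channel covered by one codec frame."""
    return round(sample_rate_hz * frame_ms / 1000)


def bytes_per_frame(bitrate_bps, frame_ms):
    """Size of one encoded frame in bytes at the given bitrate."""
    return round(bitrate_bps * frame_ms / 8000)


print(samples_per_frame(48_000, 10))   # 480
print(samples_per_frame(48_000, 7.5))  # 360
print(bytes_per_frame(96_000, 10))     # 120 (96 kbps is an assumed bitrate)
```

Shorter frames lower the latency but leave the encoder fewer samples to analyze per frame.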
Broadcast Audio (Auracast)
Broadcast Audio is a new feature introduced in Bluetooth LE Audio. Based on Isochronous Channels working in connectionless communication mode, Broadcast Audio enables audio data to be broadcast to an unlimited number of receiving devices. Broadcasts can be public or private. An example of a private broadcast is personal audio sharing: one user shows a movie on a smartphone with friends watching, all listening on headsets via LE Audio. User authentication is handled by LE Audio. An example of a public broadcast would be the TV at a gym, where sound is usually muted. With LE Audio, users who want to listen to the sound can tune in to the TV’s broadcast via Bluetooth.
Figure 5: Hearing Loop in an auditorium. Source: hearinglink.org
A hearing loop, or audio induction loop, is a sound system designed for use with hearing aids. A hearing loop is essentially an induction loop; wire is placed around the perimeter of a specific area, such as a church, conference room, or auditorium, which acts as an antenna and radiates a magnetic signal. The signal is picked up by hearing aids that are set to ‘T’ or ‘Telecoil’. With Broadcast Audio, expensive setups such as these can be eliminated, replaced with a simple Bluetooth LE transmitter and Bluetooth LE hearing aids, yielding higher audio quality at a much lower cost and offering additional features that cannot be provided by an induction loop.
nRF5340 SoC from Nordic Semiconductor
Figure 6: nRF5340 block diagram
The nRF5340 is Nordic’s flagship SoC (system-on-chip) for Bluetooth LE applications. The nRF5340 is a flexible platform with unique features well-suited to audio applications. The application processor runs at 128 MHz, powerful enough to run the LC3 codec at the highest quality level, and able to handle a sufficient number of streams for most use cases. It features 1 MB Flash, 512 kB RAM, a floating-point unit (FPU), and DSP instruction capabilities. The network processor has its own core, clocked at 64 MHz. The nRF5340 supports Bluetooth LE and LE Audio as well as mesh protocols such as Bluetooth mesh, and it can run Thread and Zigbee concurrently with Bluetooth LE.
Because one of the most important components of digital audio is the clock, the nRF5340 features a new audio PLL (phase-locked loop), designed with LE Audio in mind. The clock can be tuned to receive RF audio packets, including Isochronous Channel anchor points, which are used to synchronize multiple receivers. This means that two receiving devices can play synchronized digital audio with extremely low jitter. This capability is a requirement for true wireless stereo devices, such as a pair of earbuds.
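A rough calculation shows why resynchronizing to a shared timing reference matters. The sketch below estimates how far two free-running clocks drift apart, using the fact that a 1 ppm error equals 1 microsecond per second. The ±20 ppm tolerances and the 10 ms resynchronization period are hypothetical values, and the model ignores everything except linear clock drift.

```python
def drift_us(elapsed_s, ppm):
    """Clock error in microseconds after elapsed_s seconds (1 ppm = 1 us/s)."""
    return elapsed_s * ppm


def worst_case_skew_us(elapsed_s, ppm_a, ppm_b):
    """Timing difference between two receivers with different clock errors."""
    return abs(drift_us(elapsed_s, ppm_a) - drift_us(elapsed_s, ppm_b))


# Two earbuds with assumed clock errors of +20 ppm and -20 ppm.
free_running = worst_case_skew_us(60, +20, -20)   # one minute without resync
resynced = worst_case_skew_us(0.010, +20, -20)    # resync every 10 ms

print(free_running)        # 2400
print(round(resynced, 3))  # 0.4
```

Left uncorrected for a minute, the two earbuds would be 2.4 ms apart; realigning to a common reference every 10 ms keeps the skew below a microsecond in this simplified model.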
nRF5340 Audio DK (Development Kit)
Figure 7: Nordic nRF5340 Audio DK
The nRF5340 Audio DK is a single board development kit for the nRF5340 SoC, containing all of the necessary features to develop and test a wireless audio device. Audio line-in is based on the CS47L63 from Cirrus Logic. The CS47L63 also includes a mono DAC, which provides a direct headphone output. The board is compatible with the Arduino UNO Rev3 and contains an SWF RF connector for direct RF measurements, NFC antenna, SEGGER J-Link OB programmer/debugger, user-programmable LEDs and buttons, and pins for measuring power consumption.
4. Glossary
- Bluetooth: a technology for short-range wireless data exchange, using radio waves in the 2.4 to 2.4835 GHz frequency range.
- Bluetooth LE: a technology for short-range wireless data exchange which provides significantly reduced power consumption when compared with Classic Bluetooth.
- Bluetooth mesh: a mesh networking standard that enables connectivity for large-scale device networks.
- Thread: an IPv6-based networking protocol, designed for low-power operation.
- Zigbee: a networking technology developed to enable low-cost, low-power machine-to-machine connectivity.
- Jitter: a time distortion in the clock signal causing the intervals between samples to vary in length, potentially degrading the original sound.
- Bluetooth Profile: a specification for wireless Bluetooth-based communication between devices. For two devices to be compatible with each other, both must support at least one of the same profiles.
- Word Clock (Clock): an electrical pulse that lets each device know when a sample occurs during A/D and D/A conversion.
- PLL (phase-locked loop): an electronic circuit with an oscillator that adjusts itself so that the phase of its output stays locked to the phase of an input signal.
- Codec: a device or algorithm that encodes and decodes a data stream or signal.
- LC3 (Low Complexity Communication Codec): an audio codec specified by the Bluetooth SIG for Bluetooth LE Audio.
- Generic Audio Framework: the part of the Bluetooth LE Audio architecture that contains the profiles and services that enable audio functionality.
- Isochronous Channels: a key feature introduced in Bluetooth Core 5.2 that enables the transmission of time-bound data, and any data needing synchronized processing, such as audio.
- System-on-a-Chip (SoC): a single integrated circuit containing all of the required parts and functionality for a specific device, such as a smart phone or wearable.
The latest generation of Bluetooth Audio, called Bluetooth LE Audio, improves on its predecessor in almost every way, including a more efficient codec, longer range, and lower power consumption. Additionally, LE Audio adds new functionality, such as Auracast Broadcast Audio, which opens the door to many new types of audio applications.
Related Components
nRF5340: dual-core Bluetooth SoC supporting Bluetooth LE, Bluetooth mesh, NFC, Thread, and Zigbee