This glossary contains a compilation of terms related to the reproduction of audio through the source chain, other audio and hearing-related terms, and audio gear terms. While there may be different definitions for studio engineers, musicians, and audiologists, this glossary was designed to help audiophiles and audio enthusiasts better understand the terminology used within reviews on THL and other audio review sites.
#
3D: A recreation of sound that has a well-defined space with a balance between the Soundstage width, depth, and height. For headphones, earphones, and speakers, 3D represents the ability to place the point source of various sounds/instruments/notes accurately within space.
A
Accurate: The gear does not impart bias on the sound; ideally, the output sounds identical to the original music. Usually reserved for studio equipment for sound mixing. The opposite of colored.
ADSR: ADSR stands for Attack, Decay, Sustain, and Release and is used to describe musical notes. Different instruments have different ADSR contours, so they sound different when playing the same note. In THL product reviews of headphones, earphones, earbuds, IEMs, and speakers, ADSR refers to the driver’s ability to convert the electrical signal supplied by the source equipment into a sound wave. Producers can change the ADSR to increase track coherency and adjust the loudness of a track, among other changes (see more here). The contour of an ADSR envelope is specified using four parameters:
- Attack Time: The time taken for the leading edge of the note to rise to full volume. From a driver’s perspective, the response time once the electrical signal has been provided. A quick attack enables a driver to keep up with fast notes and sound immediate, while a slow attack can sound muddy and reduce detail.
- Decay Time: Once the note reaches the peak, it transitions to decay. Decay time is the time taken for the note to reach the sustain level. Too short of decay leads to a thinner, more analytical sound and is usually combined with a quick attack. When a bass driver has too short of a decay it will lack the ability to deliver the power of deep bass notes. Too long of a decay and the sound can sound smeared, slow, and unnatural.
- Sustain Level: The level during the main sequence of the sound’s duration. From a driver’s perspective, the ability to sustain notes while the electrical signal continues and to recreate multiple notes simultaneously. A driver poor at sustaining notes will lack deep bass reverberation and will become muddy.
- Release Time: The time taken for the level to decay from the sustain level to zero after the electrical signal stops. This can be called driver damping. Too quick of a release causes notes to lack body, while too slow of a release causes notes to become muddy and bloated.
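The four parameters above can be sketched as a simple piecewise-linear envelope. This is only an illustration of the general ADSR concept, not a model of any particular driver or of how THL evaluates gear. The function name `adsr_envelope`, the `hold` parameter (how long the note is held at the sustain level), and the sample rate are assumptions made for this example.

```python
def adsr_envelope(attack, decay, sustain_level, release, hold, sample_rate=1000):
    """Return amplitude values (0.0 to 1.0) tracing a linear ADSR envelope.

    attack, decay, release, and hold are durations in seconds.
    sustain_level is a fraction of peak amplitude (0.0 to 1.0).
    hold is a hypothetical parameter: how long the note stays at the sustain level.
    """
    def ramp(start, end, seconds):
        # Linear ramp from start toward end; the end value itself is
        # supplied by the first sample of the next stage.
        n = int(round(seconds * sample_rate))
        return [start + (end - start) * i / n for i in range(n)]

    env = []
    env += ramp(0.0, 1.0, attack)                 # attack: silence up to peak
    env += ramp(1.0, sustain_level, decay)        # decay: peak down to sustain level
    env += [sustain_level] * int(round(hold * sample_rate))  # sustain: held level
    env += ramp(sustain_level, 0.0, release)      # release: sustain level down to zero
    env.append(0.0)                               # final sample at silence
    return env


# Example: a fast attack and short decay versus a slow attack and long release.
snappy = adsr_envelope(attack=0.01, decay=0.05, sustain_level=0.6, release=0.1, hold=0.2)
slow = adsr_envelope(attack=0.10, decay=0.20, sustain_level=0.6, release=0.5, hold=0.2)
print(len(snappy), len(slow))
```

Plotting the two lists side by side shows the difference in contour: the first reaches peak level almost immediately and dies away quickly, while the second rises and falls gradually.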
Aggressive: Forward and bright sonic character with a quick attack and/or decay.
Airy: Spacious and open, sounding like a large space. Typically represents extended and balanced treble.
Analytical: Fast note decay that accentuates details within the music.
Articulate: Intelligibility of voice(s) and instruments and the interactions between them.
B
Balanced: One aspect of the sonic spectrum is not emphasized above the rest.
Bass Enhanced (Basshead): The bass region is emphasized in comparison with the midrange and treble regions of the frequency spectrum.
Bass Impact: Highly dynamic bass notes with a quick leading edge and high amplitude that provide a perception of power and feel.
Bass Heavy: Emphasized bass that overtakes the rest of the spectrum, often drowning out the midrange and treble.
Boomy: Excessive bass and/or mid-bass, poorly damped low frequencies, and/or low-frequency resonances.
Bright: The upper midrange/lower treble is emphasized in comparison with the midrange.
C
Clear: A combination of articulation, detail, and air representing accurate ADSR.
Clinical: Clean, clear sound, often analytical but uninvolving and lacking dynamics.
Coherent: How well the sound is integrated across the frequency spectrum. All gear reviewed is expected to have a high level of coherence, but a lack of coherence between the drivers in multi-driver speaker, headphone, and earphone systems can occur.
Cold: Prominent or emphasized higher frequencies lacking bass and/or mid-bass, often combined with lower dynamics; lacking warmth.
Congested: Slower driver response leading to smeared, muddy, and flat sounds that lack transparency.
Consumer Sound: Sound signature with exaggerated bass combined with low treble or an accentuated treble with little detail. Many mainstream lower-end audio products use a consumer sound as it can be pleasing to the ear and hide flaws in music tracks, sources, and drivers. Due to the prevalence of consumer sound, it may also be familiar and how many think audio should sound.
Crisp: Clear sound that has well-defined edges and detail articulation with an analytical emphasis.
D
Dark: Attenuated high frequencies that accentuate lower frequencies. The opposite of bright.
Detail Articulation: Presentation characteristic that emphasizes the details within the music over the overall presentation. Usually one of the characteristics of an analytical sound signature, and it can be the result of quicker note decay and release (see ADSR).
Detailed: Increased perception of the nuances and subtleties of the audio, increasing emphasis on the details and textures within the music or audio content. This can be achieved via frequency response tuning, such as an increase in the upper midrange, or through higher resolution with a well-defined presentation. Detail is an aspect of the presentation not to be confused with Resolution, which is the actual level of detail recreated.
Digital Audio Player (DAP): An all-in-one piece of equipment that incorporates digital storage, a DAC, and an amplifier (amp) along with a user interface that enables the user to control playback and features.
Digital to Analog Converter (DAC): Equipment that converts digital information to analog signals so drivers can convert the analog signal to sound waves. DACs can utilize different methods to convert digital information to analog electrical waveforms.
Dynamic: How well transients (changes in volume) are handled and how “punchy” the presentation can be. For speaker, headphone, and earphone drivers, dynamics are dependent on ADSR characteristics.
E
Edgy: Excessive, often harsh treble and/or too short of note decay and release times. Can result in sibilance. The opposite of Smooth.
Engaging: Feeling of being connected to the sound recreation. A result of sound characteristics that synergistically fit together and work well for the music genre.
Enveloping: When earphones or headphones seamlessly integrate you into the headspace presentation.
F
Fast: Highly responsive recreation of a note’s leading edge, providing a crisp, articulate recreation of fast transients.
Fluid: The feel of smooth, flowing sound that blends and mixes all the frequency spectrums and notes together seamlessly. See Liquid.
Forward: Presentation characteristic providing the perception of being close to the performance. Often the result of mid-forward sound signatures. Opposite of Laid-Back.
Full: Synonymous with warm, but tending to have additional weight to notes.
I
Imaging: The depth, separation, and localization of individual sounds within the presentation. When listening to recorded audio, the imaging is dependent upon the spatial cues captured within the recording.
L
Laid-Back: Presentation characteristic giving the perception that the listener is further back from the performance, often combined with a smooth delivery. Often the result of a V-shaped sound signature. Opposite of Forward.
Liquid: Overall presentation that is smooth, coherent, and free from edginess with excellent transient response.
Listening Location: Where the listener is placed relative to the soundstage presentation. For example, a component may put the listener on the stage with the performers, in the front row of the audience, in the back of the venue, etc.
Lush: Rich, warm, and full presentation.
M
Mid-Centric: Emphasis on the midrange of the frequency spectrum with relatively recessed bass and treble.
Mid-Forward: Presentation characteristic where the midrange is brought closer to the listener. For example, the perception that you are right in front of the singer while the instrument players are further away. Different from Mid-Centric in that mid-centric sound characteristics don’t always have a mid-forward presentation.
N
Nasal: Frequency response, usually a bump at 600 Hz followed by a dip, that causes vocalists to sound like they have a stuffy nose.
Natural: Reproduction of sound that matches real-life sounds taking into account tone, timbre, frequency response, and dynamics. See Organic.
Neutral: Free from coloration. For earphones, headphones, and speakers, neutrality also includes the ability to accurately keep up with the source signal’s ADSR.
O
Open: Combination of large soundstage and airy sound.
Organic: Rich, liquid sound with warmth, accurate tone, and free of distortion and artifacts. See Natural.
P
Pace, Rhythm, and Timing (PRaT): Represents the most basic building blocks of musical performance.
Powerful: Ability of earphones, headphones, and speakers to produce high sound pressure levels (SPL) with a clean, dynamic sound that includes ample bass delivered with high impact.
Presentation: The overall space recreated by the headphone/earphone/speaker driver(s) from a size and shape standpoint. For headphones and earphones, presentation is also called headstage as it is often contained in and around your head. The presentation from low-quality speakers will emanate from a point source while ultra-high-end speakers will present musicians and instruments at specific locations in space relative to you and other instruments.
Punchy: Good reproduction of dynamics. A driver’s ability to keep up with the attack and possibly have a shorter-than-ideal decay, lower-than-ideal sustain, and quick release. Good transient response, with strong impact.
R
Reference: Sound signature designed with a balanced frequency response, or the perception of “flat” in the design of the equipment, with emphasis on detail enabling the presentation of details used for mixing and track mastering, often resulting in a somewhat analytical or clinical sound.
Resolving: Ability to present a high level of detail within the presentation, including notes and the space between notes that define the presentation space. “Detail” and “resolution” differ in that detail is the psychological effect, which can be falsely presented by accentuated treble, while resolution is the actual creation of information. In video, a screen can resolve 4K, but the source may be upscaled HD, or an HD screen can use video processing to improve the sharpness of the signal.
Resolution: The actual information recreated by the gear. For video there are set resolution levels such as HD, 4K, and 8K, but there is no set resolution in audio gear except for ADCs and DACs. For speaker, headphone, and earphone drivers, resolution is dependent upon too many factors to list, but they include diaphragm material, diaphragm travel, diaphragm damping, enclosure pressure, etc.
Rich: Additional weight to notes that provides a warmer feel across the frequency spectrum, often due to slightly elongated decay and/or release time (see ADSR).
S
Separation: How well sounds are isolated from other sounds within the overall presentation. For example, notes can be separated from each other without great detail, or can be highly detailed without clear distinction between the notes, making it difficult to clearly “see” where they are placed within the presentation. See Resolving.
Smooth: Free of the harshness that quick transient response can cause, at the expense of detail articulation. Similar to Liquid, but liquidity doesn’t reduce the resolution and/or detail levels.
Sound Levels: Pressure amplitudes of sounds. The lowest sound level humans can hear is typically called the Threshold of Hearing. It differs based on the person and any hearing loss, but the recognized level is a pure tone at 1000 Hz that has a pressure amplitude of 20 μPa and an intensity of 1 pW/m². Sound levels are typically expressed in decibels (dB). See below for a sound chart with sound levels for common sounds.
| Sound¹ | Sound Level (dB) | Sound Intensity (pW/m²) | Pressure Amplitude (μPa) |
| --- | --- | --- | --- |
| Hearing threshold, 1,000 Hz (healthy) | 0 | 1 | 20 |
| Normal breathing, quiet forest | 10 | 10 | 60 |
| Rustle of leaves, watch ticking | 20 | 100 | 200 |
| Soft whisper, recording studio | 30 | 1,000 | 600 |
| Light rain, library, quiet office | 40 | 10,000 | 2,000 |
| Moderate rainfall, large office, refrigerator | 50 | 100,000 | 6,000 |
| Normal conversation, sewing machine | 60 | 1,000,000 | 20,000 |
| TV audio, coffee grinder, freeway traffic | 70 | 10,000,000 | 60,000 |
| Busy traffic, ringing phone, kettle whistle | 80 | 100,000,000 | 200,000 |
| Shouting, hair dryer, blender | 90 | 1,000,000,000 | 600,000 |
| Helicopter, snow blower, loud radio | 100 | 10,000,000,000 | 2,000,000 |
| Baby crying, leaf blower, symphony concert | 110 | 100,000,000,000 | 6,000,000 |
| Police/ambulance siren, thunder | 120 | 1,000,000,000,000 | 20,000,000 |
| Live rock band, jet engine, jackhammer | 130 | 10,000,000,000,000 | 60,000,000 |
| Threshold of pain², airplane takeoff at 100′ | 140 | 100,000,000,000,000 | 200,000,000 |

¹ For reference purposes. Actual measurements may vary based on distance.
² The threshold of pain can start at 125 dB.
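The columns of the chart above are related by the standard decibel formulas: sound level is 20·log10 of the pressure ratio, or equivalently 10·log10 of the intensity ratio, each taken against the Threshold of Hearing reference (20 μPa and 1 pW/m²). The following is a minimal sketch of those conversions; the function and constant names are made up for this example.

```python
import math

P_REF_UPA = 20.0  # reference pressure amplitude in μPa (Threshold of Hearing)
I_REF_PW = 1.0    # reference intensity in pW/m² (Threshold of Hearing)


def level_from_pressure(p_upa):
    """Sound level in dB from a pressure amplitude in μPa."""
    return 20.0 * math.log10(p_upa / P_REF_UPA)


def level_from_intensity(i_pw):
    """Sound level in dB from an intensity in pW/m²."""
    return 10.0 * math.log10(i_pw / I_REF_PW)


def pressure_from_level(db):
    """Pressure amplitude in μPa for a given sound level in dB."""
    return P_REF_UPA * 10 ** (db / 20.0)


def intensity_from_level(db):
    """Intensity in pW/m² for a given sound level in dB."""
    return I_REF_PW * 10 ** (db / 10.0)


# Recompute the chart's numeric columns in 10 dB steps.
for db in range(0, 141, 10):
    print(f"{db:>3} dB  {intensity_from_level(db):>22,.0f} pW/m²  "
          f"{pressure_from_level(db):>14,.0f} μPa")
```

Each 10 dB step multiplies intensity by 10 and pressure amplitude by the square root of 10 (about 3.16), so the 60, 600, 6,000 entries in the chart are rounded from roughly 63, 632, 6,325.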
Sound Signature: A classification of the sound characteristics of a piece of gear indicating the flavor or coloration. Sound signatures can be likened to flavors. For example, chocolate ice cream from two different brands will taste like chocolate but will usually have differences to their flavor makeup (taste). The below sound signatures are used on this site to classify gear reviewed:
- Analytical: Emphasis on bringing details to the forefront of the presentation. Typically accompanied by a brighter perception and faster response. The opposite of warm & smooth.
- Balanced: No emphasis on any part of the frequency spectrum. Also can be thought of as “flat,” but since there is no true flat sound signature, the tuning is meant to provide a balance across the frequency spectrum.
- Basshead: Emphasis on the bass region of the frequency spectrum, enhancing the bass over other frequencies. Also referred to as consumer sound, as the general consumer prefers enhanced bass and many western mainstream listeners have come to expect enhanced bass, believing that is how music should sound.
- Bright: Emphasis on the treble and/or upper midrange region of the frequency spectrum. The bright sound signature is relatively rare as bright presentations are usually due to an analytical sound signature, but gear can be tuned to have a smoother presentation while still having a treble emphasis.
- Flat: Not used as THL believes Balanced is a better term for this sound signature.
- Mid-Centric: Emphasis on the midrange, usually recessing the bass and treble at least slightly. Mid-centric sound signatures usually emphasize vocals and bring the listener closer to the performer.
- Neutral: This sound signature is defined primarily by the transparency of the gear, which lets the perceived sound signature change significantly with the mastering of the source material. Neutral gear can still have select sound signature characteristics of its own, and it is rare. For example, Neutral gear could sound warm and smooth on one track and analytical on another depending on the mastering. Another way to think of this gear is that it acts like a chameleon and can change its ‘color’ based on the environment.
- Reference: Neutral frequency response designed for studio equipment, typically having a slightly analytical sound vs. warm and smooth. Designed to not add coloration and can sometimes be considered bland.
- V-Shaped: Emphasis on the bass and treble. This is an offshoot of the Basshead consumer sound where the treble emphasis provides the perception of additional detail and clarity.
- W-Shaped: Tuning that adds color to emphasize bass, midrange, and treble, differing from the balanced sound signature which will more closely approximate “flat.”
- Warm & Smooth: Fuller sound that lacks sharp edges. The result of emphasis on the mid-bass region of the frequency spectrum and/or driver tuning that takes the edge off the attack and has a slightly longer-than-typical decay and release (see ADSR).
Sound Characteristics: Traits of the sound recreation for the gear. Sound characteristics are defined throughout this glossary.
Soundstage: The recreation of a space including width, depth, and height and the proportions between them. For headphones and earphones, soundstage can also be called headspace.
Source Chain: All of the components in the reproduction of sound including:
- Source: Sources take digital or analog recorded media and convert it to analog electrical signals. For example, a record player translates information from the record grooves to an electrical signal. A Digital-to-Analog Converter (DAC) converts a digital track from zeros and ones to an analog signal. The source can be as simple as a single Digital Audio Player (DAP) with built-in DAC and amplifier or multiple separate components such as a CD player, DAC, pre-amp, and amp.
- Cable: Cables are needed to transmit the signal from one component to another. In the simple setup above, a single cable would work while in the separate component example, both digital and analog interconnect cables would be needed as well as a cable to connect to speakers or headphones.
- Sound wave recreation: Driver(s) that convert the electrical signal to sound waves, including headphones, earphones, and speakers.
Spacious: Presenting with a large soundstage or headstage.
Sterile: Lacking dynamics.
T
Textured: The ability to hear the fine details within a sound, similar to fine texturing within ancient Roman buildings. See Resolving and Detailed.
Thick: Enhanced mid-bass and/or slower ADSR response time that, while adding warmth, limits detail reproduction and clarity.
Thin: Lacking bass sustain capability and/or recessed mid-bass.
Threshold of Hearing: A barely audible pure tone at 1000 Hz that has a pressure amplitude of 20 μPa and an intensity of 1 pW/m², which is considered 0 dB for healthy human hearing.
Threshold of Pain: The intensity level of sound that gives pain to the ear, typically between 115 and 140 dB.
Tight: The ability of drivers to stop immediately once the signal has been removed.
Tonality: How close the overall sound is to neutral and natural. Each component has its own tonality, and some typically sound more like real instruments than others.
Transparent/Transparency: The ability for the component to not impart any coloration regardless of the track or recording style; recreation of the source signals without adding to it. A completely transparent piece of audio equipment may sound thick on one track, thin on another, and somewhere between on a third all based on how the source tracks were recorded. More transparent speakers, headphones, and earphones will take on the source chain’s coloration, if any. Less transparent speakers, headphones, and earphones will impart their own coloration to tracks and source components, shifting the sound.
U
Uncolored: Lacking coloration in any manner based on the designer’s reference. See Flat, Natural, and Balanced.
Up-Front: Presentation that brings the listener closer to the presentation. Can be the result of a smaller soundstage or tuning. Different than Mid-Forward as the entire Up-Front presentation is closer to the listener. The opposite of Laid-Back.
V
Veiled: Treble lacking detail, usually with a lower amplitude, resulting in a sound perceived as having a veil between the presentation and the listener.
W
Warm: Enhanced mid-bass.
Weighty: Sound that adds extra weight to bass notes, even if the bass isn’t accentuated, caused by a longer sustain and release (see ADSR).
Wide: Presentation with a large soundstage or headstage width, but lacking matching depth.