Hearing Aid Research Data Set for Acoustic Environment Recognition (HEAR-DS)

HEAR-DS provides binaural audio material recorded in acoustic environments typical of hearing aid users. Its goal is to help researchers train and test algorithms in environments relevant to hearing aids. A particular focus is on machine learning approaches such as deep neural networks (DNNs).

Please cite this work with DOI
10.1109/ICASSP40776.2020.9053611:
Hearing Aid Research Data Set for Acoustic Environment Recognition https://ieeexplore.ieee.org/document/9053611
(Andreas Hüwel, Dr. Kamil Adiloğlu and Dr. Jörg-Hendrik Bach), published at ICASSP2020

Download
HEAR-DS download link

Parts of HEAR-DS
HEAR-DS consists of the following parts. For the licensing of each part, see the LICENSE.txt in the respective subfolder:

HEAR-DS/RawAudioCuts
HEAR-DS/AudioSnippets
HEAR-DS/Code

For further details, see
HEAR-DS README.txt

"Your browser may have problems with correctly showing the tree structure used in the readme.txt file. For this reason, please download the readme.txt and open it in an appropriate editor."

Overview Acoustic Environments

Background environment	Speech in background environment
Cocktail party
Interfering speakers
In traffic	Speech in traffic
In vehicle	Speech in vehicle
Music	Speech in music
Quiet indoors	Speech in quiet indoors
Reverberant environment	Speech in reverberant environment
Wind turbulence	Speech in wind turbulence

Example of Speech in Background SNR Variations

Acoustic environment	SNR (dB)
Speech in vehicle	-10, -5, 0, 5, 10

As described in the paper, some audio material comes from third parties and therefore cannot be provided here. However, all the required data is accessible online, and with the scripts we provide, anyone can regenerate the entire data set themselves.

The audio material for the noise is from CHiME5, and the speech material that is mixed into the speech in background environments is from CHiME2. For access to the CHiME2 (2013) and CHiME5 (2018) data sets, please contact their organizers. The audio for music is from GTZAN.

Data and Format

An acoustic environment contains audio from different recording situations. Each recording situation has a unique ID (rec_id) and comprises one or more recording sessions. From the raw audio of each recording session, we manually cut appropriate pieces of audio (the cuts) to fill each recording situation with audio; each cut has a locally unique cut_id. To generate the actual data set for training machine learning systems, we performed another processing step that generates 10 s audio samples for each acoustic environment, as further described in the HEAR-DS Audio Samples subsection.
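The step from cuts of varying length to fixed-length 10 s samples can be pictured with a short Python sketch. It is only an illustration: the function name snippet_bounds is hypothetical, and the sketch assumes that snippets do not overlap and that a trailing remainder shorter than 10 s is dropped. The scripts in HEAR-DS/Code may handle both points differently.

```python
def snippet_bounds(num_frames, sample_rate, snippet_seconds=10.0):
    """Return (start, end) frame indices of consecutive full-length snippets.

    Assumptions (not taken from the HEAR-DS scripts): snippets do not
    overlap, and a remainder shorter than snippet_seconds is dropped.
    """
    frames_per_snippet = int(round(snippet_seconds * sample_rate))
    if frames_per_snippet <= 0:
        raise ValueError("snippet length must be positive")
    count = num_frames // frames_per_snippet
    return [
        (i * frames_per_snippet, (i + 1) * frames_per_snippet)
        for i in range(count)
    ]


# A hypothetical 35 s cut at 48 kHz contains three full 10 s snippets.
bounds = snippet_bounds(num_frames=35 * 48000, sample_rate=48000)
for start, end in bounds:
    print(start, end)
```

The end index is exclusive, so each pair can be used directly as a slice into the sample array of a cut.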

HEAR-DS Raw Audio Cuts
For each recording situation, one folder contains all the cut wav files.

Folder structure of the HEAR-DS samples:
For details, see
HEAR-DS README.txt

Due to the manual process of audio editing, the length of the cuts varies. The naming scheme is:
rec_id_<REC_ID>_cut_<CUT_ID>_<DESCRIPTION>_<TRACKNAME>_<EXPORTER>.wav

<REC_ID> is a 3-digit number and <CUT_ID> a 2-digit number. <DESCRIPTION> is a short label for the content of the cut, e.g. "startengine_driveoff" for InVehicle or "bell" in ReverberantEnvironment. <TRACKNAME> stands for one of the hearing aid microphones used [Mic_BTE_L_front, Mic_BTE_L_rear, Mic_BTE_R_front, Mic_BTE_R_rear, Mic_ITC_L, Mic_ITC_R]. <EXPORTER> is the name of the audio exporter used, currently "raw_48kHz32bit".

HEAR-DS Audio Samples

In this processing step, the raw audio cuts were further decomposed into 10 s snippets. These snippets are either used directly as background samples or mixed with random speech at different SNRs to create audio samples for the speech in background environments. The binaural speech source material comes from five different directions, one of which we select at random; the start and end time of the source speech and the start time of the background snippet are also selected at random. Finally, these 10 s samples form the HEAR-DS audio material for training machine learning systems, e.g., as input for the feature extraction step of deep neural networks.
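Mixing speech into a background snippet at a given SNR can be sketched in a few lines of Python. This is a generic illustration, not the code from HEAR-DS/Code: the names rms and mix_at_snr are hypothetical, the sketch assumes that the speech is scaled while the background stays unchanged, and it measures level as broadband RMS over the whole snippet. Sine tones stand in for real audio.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of float samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def mix_at_snr(speech, background, snr_db):
    """Scale the speech to the target SNR relative to the background, then add.

    Assumption: the background level is left untouched and only the
    speech is scaled. The actual HEAR-DS scripts may do this differently.
    """
    if len(speech) != len(background):
        raise ValueError("speech and background must have equal length")
    speech_rms = rms(speech)
    background_rms = rms(background)
    if speech_rms == 0 or background_rms == 0:
        raise ValueError("cannot set an SNR for a silent signal")
    gain = background_rms / speech_rms * 10 ** (snr_db / 20)
    return [gain * s + b for s, b in zip(speech, background)]


# Synthetic stand-ins for real audio: one second of two sine tones at 16 kHz.
rate = 16000
speech = [0.5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
background = [0.1 * math.sin(2 * math.pi * 97 * n / rate) for n in range(rate)]

for snr_db in (-10, -5, 0, 5, 10):
    mixed = mix_at_snr(speech, background, snr_db)
    scaled_speech = [m - b for m, b in zip(mixed, background)]
    achieved = 20 * math.log10(rms(scaled_speech) / rms(background))
    print(f"target {snr_db:+d} dB, achieved {achieved:+.2f} dB")
```

The loop recovers the scaled speech by subtracting the background from the mix and checks that the achieved level ratio matches the target SNR.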

Audio sample snippet file format
The naming scheme for snippets is:
<ENV_ID>_<REC_ID>_<CUT_ID>_<SNIP_ID>_<TRACKNAME>_<SAMPLERATE>.wav

<ENV_ID>: 2-digit ID of the acoustic environment. Each speech in background environment has its own ID, separate from that of the pure background environment.
<REC_ID>: 3-digit ID of the recording situation.
<CUT_ID>: 2-digit ID of the cut within the recording situation (unique across all sessions of that situation).
<SNIP_ID>: 3-digit ID of the snippet within this cut.
<TRACKNAME>: as described above.
<SAMPLERATE>: one of [48kHz, 16kHz].

For example, the 16kHz version of the first snippet of the first cut of the recording situation "Oldenburg Church" in the Reverberant Environment has the filename 06_005_00_000_BTE_L_front_16kHz.wav
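A snippet filename can be split back into its fields with a regular expression. The Python sketch below is an illustration based only on the naming scheme above; SnippetName and parse_snippet_name are hypothetical names and not part of the HEAR-DS code. Because the track name itself contains underscores, the pattern fixes the four leading numeric fields and the trailing sample rate and treats everything in between as the track name. The IDs are kept as strings so that leading zeros survive.

```python
import re
from typing import NamedTuple


class SnippetName(NamedTuple):
    """Fields of a snippet filename (hypothetical helper type)."""
    env_id: str
    rec_id: str
    cut_id: str
    snip_id: str
    trackname: str
    samplerate: str


# Four fixed-width numeric IDs, a free-form track name, then the sample rate.
_SNIPPET_PATTERN = re.compile(
    r"^(?P<env_id>\d{2})_(?P<rec_id>\d{3})_(?P<cut_id>\d{2})_(?P<snip_id>\d{3})"
    r"_(?P<trackname>.+)_(?P<samplerate>48kHz|16kHz)\.wav$"
)


def parse_snippet_name(filename):
    """Split a snippet filename into its fields, or raise ValueError."""
    match = _SNIPPET_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"not a snippet filename: {filename!r}")
    return SnippetName(**match.groupdict())


info = parse_snippet_name("06_005_00_000_BTE_L_front_16kHz.wav")
print(info.env_id, info.rec_id, info.cut_id, info.snip_id)
print(info.trackname, info.samplerate)
```

Filenames that do not follow the scheme raise a ValueError instead of returning partial results, which makes it easier to spot stray files when scanning a folder.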

For details, see
HEAR-DS.README.txt

Acknowledgements

This work was supported by the German Federal Ministry of Education and Research (BMBF), FZK 02K16C202 AUDIO-PSS.

The authors would like to thank Marei Typlt and the partners in the AUDIO-PSS project for their support in designing the acoustic environments and Audifon GmbH for providing the hearing aid dummies.
