
OSCAAR

The Online Speech/Corpora Archive and Analysis Resource

Overview

The following collections of speech recordings are available on OSCAAR and may be accessed via our online request form. To learn more about each collection, please click on the name of the collection.

Speech Communication Research Group, Northwestern University

http://groups.linguistics.northwestern.edu/speechcommgroup/allsstar/

133 total talkers of 31 native language backgrounds producing the following in English and, where applicable, in subject's L1:
  • 120 Hearing in Noise Test (HINT) sentences
  • 20 sentences pulled from the United Nations Declaration of Human Rights
  • 30 sentences pulled from Le Petit Prince
  • The North Wind and the Sun passage
  • 4 spontaneous storytelling passages
  • 1 spontaneous speech passage, approximately 5 minutes long

Speech Communication Research Group, Northwestern University
  • 5 talkers, all female native speakers of American English
  • 336 sentences per talker (21 lists, 16 sentences per list, 50 keywords per list)
  • RMS amplitude equalized (using Praat)
Speech Communication Research Group, Northwestern University
  • 2 talkers (1 male, 1 female)
  • 64 sentences (BKB lists 7, 8, 9, 10)
  • 2 speaking styles (plain, clear)
Speech Communication Research Group, Northwestern University
  • 100 talkers (63 male, 37 female) from 6 native language backgrounds (American English, Brazilian Portuguese, Hindi, Korean, Mandarin, Turkish).
  • Each talker participated in a total of two diapix conversations (trial 1 and trial 2).
  • A total of 96 conversations distributed across 3 conditions as follows:
    • Condition 1 = Trial 1 and trial 2 recorded in English with no change of talkers.
    • Condition 2 = Trial 1 recorded in the talkers' shared native language and trial 2 recorded in English with no change of talkers.
    • Condition 3 = Trial 1 recorded in English with a non-native talker from the same native language group and trial 2 recorded in English with a different non-native talker from the same native language background.
    • Native speakers of American English only recorded conversations in American English (Conditions 1 and 3 only).
  • Conversations were constrained by a spot-the-difference puzzle in which pairs of participants verbally compared two scenes, only one of which was visible to each talker (the Diapix elicitation technique).
Speech Communication Research Group, Northwestern University
  • 2 female talkers (20, 22), both native speakers of American English
  • 200 IEEE sentences per talker
  • Speaking style: plain
  • RMS amplitude equalized
Speech Communication Research Group, Northwestern University
  • 4 talkers (2 male, 2 female)
  • 480 bisyllabic words
  • Plain speaking style
Speech Communication Research Group, Northwestern University
  • 10 native English speakers (5 male, 5 female) each produced 112 English sentences (BKB lists 1, 2, 3, 4, 11, 13, 14)
  • 10 native Korean speakers (5 male, 5 female) each produced 112 Korean sentences (translations of the English BKB sentences) and 112 English sentences (BKB lists 1, 2, 3, 4, 11, 13, 14)
Speech Communication Research Group, Northwestern University
  • Multi-language snippets [ML]: 2 brief samples (~ 2 seconds each) of paragraph readings (North Wind and Sun) by 1 male talker in 17 languages (extracted from the IPA website)
  • Multi-English snippets [ME]: 1 brief sample (~ 2 seconds each) of an English paragraph as read by 1 male talker from each of the 17 languages in the multi-language snippets collection (extracted from http://accent.gmu.edu/)
Speech Communication Research Group, Northwestern University
3 Tasks
  1. NU - IEEE Dutch (ID)
    • 200 IEEE sentences, translated into Dutch.
    • 2 talkers (2 female)
  2. NU - SNST Dutch (SD)
    • 200 SNST sentences (as in SNST English 2009), translated into Dutch.
    • 2 talkers (2 female; same for ID)
  3. NU - BKB Dutch (AD)
    • 336 sentences, translated into Dutch.
    • 1 talker (1 female; different from ID and SD)
Speech Communication Research Group, Northwestern University
6 Subcollections
  1. Croatian Nonsense Sentences - CNS
    • 5 talkers (2 female, 3 male)
    • 20 semantically anomalous Croatian sentences
  2. Croatian Meaningful Sentences - CMS
    • This sub-collection is presently unavailable on OSCAAR.
  3. Croatian Paragraphs - CPA
    • 4 talkers (2 female, 2 male)
    • 3 tasks with a total of 96 unique sentences (16 sentences in task 1, 20 sentences in task 2, 32 sentences in task 3; all done twice in 2 speaking styles)
    • This sub-collection is presently unavailable on OSCAAR.
  4. English Nonsense Sentences - ENS
    • 2 lists of 20 semantically anomalous English sentences
    • 5 talkers (3 female, 2 male, + 1 male speaker in plain style only [minor word order difference])
  5. English Meaningful Sentences - EMS
    • 6 talkers (3 female, 3 male)
    • 20 meaningful sentences
  6. English Paragraphs - EPA
    • 2 paragraph passages
    • 6 talkers (3 female, 3 male)
    • The paragraphs also exist in short segments which have been used for intelligibility testing. (Contact the owner of this collection for details).
Speech Communication Research Group, Northwestern University
External website
Private Collection. Access limited to SCRG lab group and collaborators.
  • NUFAESD: The Northwestern University Foreign-Accented English Speech Database
  • 64 simple English sentences (BKB lists 7, 8, 9, 10)
  • 32 talkers from various native language backgrounds: Chinese (n=20), Korean (n=5), Bengali (n=1), Hindi (n=1), Japanese (n=1), Romanian (n=1), Slovakian (n=1), Spanish (n=1) and Thai (n=1)
  • For each talker, the database has:
    • demographic information
    • a sentence production score (i.e. average sentence intelligibility when presented to native English listeners mixed with broadband noise at +5 dB SNR in a multiple talker format)
    • a sentence-in-noise perception score (i.e. average sentence recognition accuracy in response to naturally produced English sentences by either a male [16 of the NUFAESD talkers] or female talker [the other 16 of the NUFAESD talkers]. Sentences were presented in two speaking styles [plain versus clear] and at two signal-to-noise ratios [-4 versus -8 dB]; see Clear Speech 2002 collection.)
Speech Communication Research Group, Northwestern University
External website
Private Collection. Access limited to SCRG lab group and collaborators.
  • 72 speakers:
    • 23 native English speakers (20 used in dissertation)
    • 20 native Mandarin speakers (20 used in dissertation)
    • 20 native Korean speakers (20 used in dissertation)
    • 9 bilingual speakers
  • Sentences recorded in 3 focus contexts:
    • Subject narrow focus
    • Verb phrase broad focus
    • Sentence broad focus
  • Recordings from the Prosodic Focus Marking Prominence Production and Placement Experiments are available on OSCAAR in original, unsegmented form.
Speech Communication Research Group, Northwestern University
9 speakers (5 female, 4 male) recorded, with 10 recordings each:
  • 5 paragraphs designed to have repeated mentions of words
  • 2 speech styles (clear and plain)
Speech Communication Research Group, Northwestern University
  • 2 female talkers, both native speakers of American English
  • 200 SNST (Syntactically Normal Sentence Test, semantically anomalous) sentences per talker
  • Speaking style: plain
Speech Communication Research Group, Northwestern University
  • Dialect: New Zealand English
  • Speakers: Native and non-native speakers
    • 8 Native New Zealand English (4 male, 4 female)
    • 8 Non-native New Zealand English - Native Korean (4 male, 4 female)
    • 8 Non-native New Zealand English - Native Mandarin (4 male, 4 female)
Speech Communication Research Group, Northwestern University
  • 1 talker (female, native speaker of American English)
  • 2 styles (plain, clear)
  • 2 lists of 60 high (HP) and 60 low (LP) predictability sentences:
    • List 1 = 60 HP + 60 LP sentences for native listeners (Fallon, Trehub, & Schneider, 2002)
    • List 2 = 60 HP + 60 LP sentences for nonnative adult listeners (Bradlow & Alexander, 2007)
Speech Communication Research Group, Northwestern University
  • 85 talkers (49 male, 36 female) from 13 native language backgrounds (English, Mandarin Chinese, Hindi/Marathi, Italian, Japanese, Korean, Macedonian, Persian, Russian, Spanish, Telugu, Thai and Turkish).
  • All talkers participated in an unscripted Diapix task and recorded a set of scripted materials (except one male participant, who did not participate in a Diapix task).
  • Unscripted Diapix task (a total of 42 conversations):
    • 4 conversations between native speakers of Korean (N-N) in Korean.
    • 8 conversations between native speakers of English (N-N) in English.
    • 11 conversations between non-native speakers of English from the same native language group (NN1-NN1) in English.
    • 11 conversations between non-native speakers of English from different native language groups (NN1-NN2) in English.
    • 8 conversations between a native speaker of English and a non-native speaker of English (N-NN) in English.
    • Conversations were constrained by a spot-the-difference puzzle in which pairs of participants verbally compared two scenes, only one of which was visible to each talker (the Diapix elicitation technique).
  • Scripted recordings
    • 62 words in English.
    • 60 sentences in English.
    • 3 passages in English.
    • Native speakers of Korean also recorded a set of 72 words and 2 passages in Korean.
Speech and Auditory Research Lab, Queens College
3 Tasks
  1. Basic English Lexicon (BEL) sentence reading
    • 20 lists of 25 sentences each (500 sentences total)
    • 3 talkers (2 female, 1 male; same for HINT and QSIN)
  2. HINT sentence reading
    • 25 lists of 10 sentences each (250 sentences total)
    • 3 talkers (2 female, 1 male; same for BEL and QSIN)
  3. QSIN sentence reading
    • 18 lists of 6 sentences each (108 sentences total)
    • 3 talkers (2 female, 1 male; same for BEL and HINT)
Speech Perception Lab, Indiana University
27 total talkers of 7 native language backgrounds (Spanish, Mandarin, Korean, Japanese, German, French, English) each produced a total of 1139 English recordings in the following tasks:
  • 160 Hearing in Noise Test for Children (HINT-C) sentences
  • 10 Digit words
  • 48 Multi-syllabic Lexical Neighborhood Test (MLNT) words
  • 50 Northwestern University-Children's Perception of Speech (NU-CHIPS) words
  • 100 Lexical Neighborhood Test (LNT) words
  • 50 Lexical Neighborhood Sentence Test (LNST) sentences
  • 40 Pediatric Speech Intelligibility-Sentences (PSI-Sentences) sentences
  • 20 Pediatric Speech Intelligibility-Words (PSI-Words) words
  • 339 Bamford-Kowal-Bench (BKB) sentences
  • 150 Phonetically Balanced Kindergarten (PB-K) words
  • 72 Spondee (Auditec Spondees) words
  • 100 Word Intelligibility by Picture Identification (WIPI) words
Speech Research Lab, Indiana University
  • 100 IEEE/Harvard sentences recorded by 21 native American English talkers (11 male, 10 female)
  • Associated intelligibility data from: 10 listeners per talker in the clear (no noise) single talker presentation format (randomized for each listener)
Speech Research Lab, Indiana University
  • 75 "easy" + 75 "hard" monosyllabic words
  • Recorded by 10 native speakers of American English
  • Each speaker recorded words at 3 speaking rates (slow, medium, fast)
Speech, Hearing & Phonetic Sciences, University College London

  • kidLUCID = London UCL Clear Speech in Interaction Database
  • Project title: Speaker-controlled Variability in Children's Speech in Interaction (a research project funded by the ESRC)
  • 96 talkers (all native southern British English speakers, 46 male, 50 female). Recorded in pairs.
  • Conversations were constrained by a spot-the-difference puzzle in which pairs of participants verbally compared two scenes, only one of which was visible to each talker (the Diapix elicitation technique).
  • Each talker participated in a total of six Diapix tasks across three conditions.
  • A total of 288 conversations distributed across three conditions as follows:
    • NOB (No barrier): 96 conversations while they both heard each other normally.
    • BAB (Babble): 96 conversations where one conversational partner heard the other's speech in a background of multi-talker babble at an approximate SNR of 0 dB. The talker hearing the babble was a confederate. Exceptions: 4 conversations with CBB ('child multitalker babble' as opposed to 'adult multitalker babble').
    • VOC (Vocoded): 96 conversations where one conversational partner heard the other's speech after it had been processed in real time through a noise-excited three channel vocoder.

Speech, Hearing & Phonetic Sciences, University College London
  • LUCID = London UCL Clear Speech in Interaction Database
  • 40 talkers (all native southern British English speakers, 20 male, 20 female).
  • Each talker participated in a total of twelve Diapix tasks in different conditions. Conversations were constrained by a spot-the-difference puzzle in which pairs of participants verbally compared two scenes, only one of which was visible to each talker (the Diapix elicitation technique).
  • Each talker additionally recorded two sessions of read sentences (one in a clear speaking style and another in a plain speaking style) and two sessions of picture naming (one in a clear speaking style and another in a plain speaking style).
  • A total of 300 DiapixUK paired participant conversations distributed across 4 conditions as follows:
    • DiapixUK: No barrier. 60 conversation recordings = 20 pairs of talkers recorded 3 conversations each while they both heard each other normally. TextGrids available for both speakers.
    • DiapixUK: Vocoder condition (VOC). One conversational partner heard the other's speech after it had been processed in real time through a noise-excited three channel vocoder. Each talker heard vocoded speech for three picture scenes. 120 conversation recordings = 20 talker pairs recorded 6 conversations each. TextGrids available for both speakers.
    • DiapixUK: Babble condition (BAB). One conversational partner heard the other's speech in a background of multi-talker babble at an approximate SNR of 0 dB. The talker hearing the babble was a confederate. 60 conversation recordings = 20 talkers recorded 3 conversations each with a confederate. TextGrids available for both speakers.
    • DiapixUK: L2 condition (L2). One conversational partner (a confederate) was a low-proficiency second-language speaker of English. 60 conversational recordings = 20 talkers recorded 3 conversations each with an L2 confederate. TextGrids available for the native speaker of British English.
    • All talkers completed DiapixUK: 'No barrier' and DiapixUK: 'Vocoder'. Half of the talkers completed DiapixUK: 'Babble' and the other half completed DiapixUK: 'L2'.
  • A total of 80 sentence reading session recordings (40 talkers each recorded one session of read sentences in a clear speaking style and another session in a plain speaking style).
  • A total of 80 picture naming session recordings (40 talkers each recorded one session of picture naming in a clear speaking style and another session in a plain speaking style).
SoundLab, Northwestern University

The role of linguistic experience in the processing of probabilistic information during production. 39 speakers from two groups. 17 L1 speakers (13 females) and 22 L2 speakers (17 females). Speakers described a series of events (e.g., The candle rotates). 144 trials included at least 2 events each. TextGrids include RT, determiner, noun, and verb measurements for the third description of (correct) target trials (96 per participant).

SoundLab, Northwestern University

Thirty-four native English speakers (21 women) from the Northwestern University community participated. These individuals reported no history of speech or language impairment.

Tongue twisters were composed of syllables with initial consonants contrasting in voicing (e.g., post-boast). Forty-eight pairs of syllables were selected, evenly distributed across labial (/p/, /b/), alveolar (/t/, /d/) and velar (/k/, /g/) place of articulation. For each pair, four tongue twisters were generated, crossing syllable order (switching, ABBA vs. repeating, ABAB) and which member of the pair was placed first (e.g., within ABBA, post boast boast post vs. boast post post boast).
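
The crossing described above can be illustrated with a short sketch. It assumes the four sequences are generated per syllable pair, and the function name `make_twisters` is hypothetical; this is not the collection owners' actual stimulus-generation script.

```python
from itertools import product

def make_twisters(a: str, b: str) -> list[str]:
    """Return four sequences for one syllable pair (illustrative only).

    Crosses which member comes first (a-first vs. b-first) with
    syllable order (switching ABBA vs. repeating ABAB).
    """
    # Index patterns over a (first, second) pair: 0 = first, 1 = second.
    patterns = [(0, 1, 1, 0),   # ABBA (switching)
                (0, 1, 0, 1)]   # ABAB (repeating)
    orderings = [(a, b), (b, a)]
    return [
        " ".join(pair[i] for i in pattern)
        for pair, pattern in product(orderings, patterns)
    ]

for twister in make_twisters("post", "boast"):
    print(twister)
```

For the example pair, the list includes both ABBA sequences given in the text (post boast boast post and boast post post boast) along with the two ABAB counterparts.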

Each target sequence was presented to participants on a computer screen in a sound-attenuated room. Productions were recorded using a head-mounted microphone. Participants practiced each tongue twister once slowly (1 syllable/second) and then repeated it three times quickly (2.5 syllables/second) in time to a metronome. Only tokens from the fast repetitions of each sequence were analyzed. Trial onset and the onset of fast repetitions were self-paced.

SoundLab, Northwestern University

61 participants; 52 female/9 male; each participant produces 44 nouns, 22 each in sentence context and bare naming; TextGrids are available for annotations from human coders as well as three automated systems

