Auditory Stimuli: Toronto Word Pool (twp1000)

Source: Toronto Word Pool, a standard psycholinguistic word set used across many memory experiments
Count: 1,000 words × 4 speakers = 4,000 audio files
Format: MP3
Naming: {word}_{voice}.mp3 (e.g., cabin_shimmer.mp3)
Generation: Text-to-speech via OpenAI API (GPT-4 Turbo model), generated November 2023

Speakers

Voice ID	Gender	OpenAI TTS voice
`echo`	Male	Echo
`nova`	Female	Nova
`onyx`	Male	Onyx
`shimmer`	Female	Shimmer

Word Metadata

Two CSV files provide psycholinguistic properties for each word:

twp1000.csv — the 1,000-word subset used in this study
twp_all.csv — the full Toronto Word Pool

Column	Description
`itmno`	Item number (1-based index)
`word`	The word itself
`imagery`	Imagery rating (higher = more imageable)
`concreteness`	Concreteness rating (higher = more concrete)
`letters`	Number of letters
`frequency`	Word frequency (Kucera-Francis norms)
`foa`	First-order approximation to English, based on individual letter frequencies
`soa`	Second-order approximation to English, based on bigram frequencies
`onr`	Orthographic neighbor ratio (Landauer & Streeter, 1973): word frequency relative to combined frequency of orthographic neighbors
`dictcode`	Grammatical class code: 1=noun, 2=verb, 3=adjective, 4=adverb, 5=other
`noun`	Percent noun usage (0 or 100 for unambiguous words; rated for ambiguous words)
`canadian`	Flag for Canadian spelling variant (the final 13 entries in the full TWP are Canadian alternatives with duplicate item numbers)
`tmpno`	Temporary item number — present in `twp1000.csv` only

The Toronto Word Pool (1,080 words) originated from Thorndike-Lorge (1944) norms and has been used in hundreds of memory experiments. See the WordPools R package for the original dataset and documentation.