Auditory Stimuli: Toronto Word Pool (twp1000)
- Source: Toronto Word Pool, a standard psycholinguistic word set used across many memory experiments
- Count: 1,000 words × 4 speakers = 4,000 audio files
- Format: MP3
- Naming: `{word}_{voice}.mp3` (e.g., `cabin_shimmer.mp3`)
- Generation: Text-to-speech via OpenAI API (GPT-4 Turbo model), generated November 2023
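As a rough illustration of the naming scheme, the sketch below builds the expected file name for every word/voice pair and reports any that are absent from a directory. The function names and the `audio_dir` argument are hypothetical, the four voice IDs are taken from the Speakers table, and it assumes words appear in file names exactly as they are spelled in the word list.

```python
from pathlib import Path

# Voice IDs as listed in the Speakers table.
VOICES = ("echo", "nova", "onyx", "shimmer")


def expected_filenames(words, voices=VOICES):
    """Return the {word}_{voice}.mp3 name for every word/voice pair."""
    return [f"{word}_{voice}.mp3" for word in words for voice in voices]


def missing_files(audio_dir, words, voices=VOICES):
    """List expected file names that are not present in audio_dir.

    audio_dir is a hypothetical path to wherever the MP3 files are stored.
    """
    present = {p.name for p in Path(audio_dir).glob("*.mp3")}
    return [name for name in expected_filenames(words, voices) if name not in present]
```

With the full 1,000-word list, `expected_filenames` yields 4,000 names, matching the count stated above.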
Speakers
| Voice ID | Gender | OpenAI TTS voice |
|---|---|---|
| echo | Male | Echo |
| nova | Female | Nova |
| onyx | Male | Onyx |
| shimmer | Female | Shimmer |
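Going the other way, a file name can be split back into its word and voice. The following is a minimal sketch under the assumption that the voice ID is always the part after the last underscore; `parse_stimulus_name` and `VOICE_GENDER` are illustrative names, not part of the dataset.

```python
from pathlib import Path

# Voice ID -> gender, copied from the Speakers table.
VOICE_GENDER = {
    "echo": "Male",
    "nova": "Female",
    "onyx": "Male",
    "shimmer": "Female",
}


def parse_stimulus_name(filename):
    """Split '{word}_{voice}.mp3' into (word, voice, gender).

    Splits on the last underscore, so a word that itself contained an
    underscore would still be handled. Raises ValueError for names that
    do not end in a known voice ID.
    """
    stem = Path(filename).stem
    word, sep, voice = stem.rpartition("_")
    if not sep or not word or voice not in VOICE_GENDER:
        raise ValueError(f"unexpected stimulus file name: {filename!r}")
    return word, voice, VOICE_GENDER[voice]
```

For the example file above, `parse_stimulus_name("cabin_shimmer.mp3")` returns `("cabin", "shimmer", "Female")`.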
Word Metadata
Two CSV files provide psycholinguistic properties for each word:
- `twp1000.csv`: the 1,000-word subset used in this study
- `twp_all.csv`: the full Toronto Word Pool
| Column | Description |
|---|---|
| `itmno` | Item number (1-based index) |
| `word` | The word itself |
| `imagery` | Imagery rating (higher = more imageable) |
| `concreteness` | Concreteness rating (higher = more concrete) |
| `letters` | Number of letters |
| `frequency` | Word frequency (Kucera-Francis norms) |
| `foa` | First-order approximation to English, based on individual letter frequencies |
| `soa` | Second-order approximation to English, based on bigram frequencies |
| `onr` | Orthographic neighbor ratio (Landauer & Streeter, 1973): word frequency relative to combined frequency of orthographic neighbors |
| `dictcode` | Grammatical class code: 1=noun, 2=verb, 3=adjective, 4=adverb, 5=other |
| `noun` | Percent noun usage (0 or 100 for unambiguous words; rated for ambiguous words) |
| `canadian` | Flag for Canadian spelling variant (the final 13 entries in the full TWP are Canadian alternatives with duplicate item numbers) |
| `tmpno` | Temporary item number (present in `twp1000.csv` only) |
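The columns above can be read with Python's standard `csv` module. This sketch assumes each CSV has a header row using the column names in the table and that `dictcode` and `concreteness` parse as numbers; the helper names are hypothetical.

```python
import csv

# Grammatical class codes as documented for the dictcode column.
DICTCODE_LABELS = {1: "noun", 2: "verb", 3: "adjective", 4: "adverb", 5: "other"}


def read_rows(lines):
    """Parse CSV text (an open file or any iterable of lines) into dicts
    keyed by column name."""
    return list(csv.DictReader(lines))


def grammatical_class(row):
    """Translate a row's dictcode into its label (assumes a numeric code)."""
    return DICTCODE_LABELS[int(float(row["dictcode"]))]


def filter_words(rows, word_class, min_concreteness=None):
    """Return words of one grammatical class, optionally keeping only those
    whose concreteness rating is at or above min_concreteness."""
    selected = []
    for row in rows:
        if grammatical_class(row) != word_class:
            continue
        if min_concreteness is not None and float(row["concreteness"]) < min_concreteness:
            continue
        selected.append(row["word"])
    return selected


# Example usage (assumes twp1000.csv is in the working directory):
# with open("twp1000.csv", newline="", encoding="utf-8") as f:
#     nouns = filter_words(read_rows(f), "noun")
```

Keeping `read_rows` independent of the file system means the same filter works on either CSV, or on rows built in memory for testing.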
The Toronto Word Pool (1,080 words) originated from Thorndike-Lorge (1944) norms and has been used in hundreds of memory experiments. See the WordPools R package for the original dataset and documentation.