Visual Stimuli: Shared1000
- Source: Natural Scenes Dataset (NSD) shared images (see NSD Experiments documentation)
- Count: 1,000 images (in `images/` subdirectory)
- Format: PNG
- Naming: `shared{NNNN}_nsd{NNNNN}.png`, combining the shared index (1-based, zero-padded to 4 digits) with the original NSD image ID
- Selection: These are the 1,000 images designated as "shared" in the NSD experiment (shown to all NSD participants). Selection criteria and image properties are documented by the NSD group.
- Licensing: TBD — follows NSD data sharing terms
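The naming convention above can be parsed mechanically. A minimal sketch (the helper name and regex are illustrative, not part of the dataset):

```python
import re

# Matches filenames like "shared0001_nsd02950.png": a 4-digit shared index
# followed by a 5-digit NSD image ID, per the naming convention above.
FNAME_RE = re.compile(r"shared(\d{4})_nsd(\d{5})\.png")

def parse_stimulus_filename(name: str) -> tuple[int, int]:
    """Return (shared_index, nsd_id) as ints, or raise ValueError."""
    m = FNAME_RE.fullmatch(name)
    if m is None:
        raise ValueError(f"unexpected filename: {name}")
    return int(m.group(1)), int(m.group(2))

print(parse_stimulus_filename("shared0001_nsd02950.png"))  # (1, 2950)
```

The NSD ID recovered here is the join key into `nsd_stim_info.csv` (see below).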
Image Metadata
`nsd_stim_info.csv` contains metadata for the full NSD stimulus set (73,000 images). Rows where `shared1000=True` correspond to the images used in this study.
| Column | Description |
|---|---|
| (index) | 0-based row index |
| cocoId | MS-COCO image ID |
| cocoSplit | MS-COCO dataset split (e.g., val2017) |
| cropBox | Crop coordinates used to extract the stimulus from the original image |
| loss | NSD loss metric for the image |
| nsdId | NSD image ID; matches the `_nsd{NNNNN}` portion of the PNG filename |
| flagged | Whether the image was flagged in NSD |
| BOLD5000 | Whether the image appears in the BOLD5000 dataset |
| shared1000 | True for images in this study's stimulus set |
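Filtering the full table down to the Shared1000 subset is a one-liner. A sketch with a fabricated demo frame so the example is self-contained (in practice you would `pd.read_csv("nsd_stim_info.csv")`; column names follow the table above):

```python
import pandas as pd

def select_shared1000(stim: pd.DataFrame) -> pd.DataFrame:
    """Return only the rows flagged as part of the shared 1,000-image set."""
    # shared1000 may round-trip through CSV as bool or as the string "True",
    # so normalize to a string before comparing.
    flag = stim["shared1000"].astype(str).str.lower() == "true"
    return stim[flag]

# Fabricated three-row example standing in for the 73,000-row table.
demo = pd.DataFrame({
    "nsdId": [2950, 2951, 2952],
    "shared1000": [True, False, True],
})
print(select_shared1000(demo)["nsdId"].tolist())  # [2950, 2952]
```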
COCO Annotations
All 1,000 shared images originate from the MS-COCO train2017 split. COCO
annotations (captions and object instances) have been extracted for these images
and stored in two CSV files.
coco_annotations.csv — one row per image (1,000 rows):
| Column | Description |
|---|---|
| nsdId | NSD image ID (links to filenames and nsd_stim_info.csv) |
| cocoId | MS-COCO image ID |
| caption_1 through caption_5 | Five human-written captions from COCO |
| object_categories | Semicolon-separated list of COCO object categories detected in the image |
| object_counts | Category:count pairs (e.g., person:3; dog:1) |
| supercategories | Semicolon-separated COCO supercategories (e.g., animal; person) |
| n_object_instances | Total number of annotated object instances |
| n_unique_categories | Number of distinct object categories |
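The semicolon-separated columns can be expanded back into Python structures. A minimal sketch for `object_counts` (the helper is illustrative; the example string mirrors the format shown in the table):

```python
def parse_object_counts(cell: str) -> dict[str, int]:
    """Parse an object_counts cell like 'person:3; dog:1' into a dict."""
    counts: dict[str, int] = {}
    for pair in cell.split(";"):
        pair = pair.strip()
        if not pair:  # tolerate trailing separators / empty cells
            continue
        name, _, n = pair.partition(":")
        counts[name.strip()] = int(n)
    return counts

print(parse_object_counts("person:3; dog:1"))  # {'person': 3, 'dog': 1}
```

The same `split(";")`-and-strip pattern applies to `object_categories` and `supercategories`, which have no counts attached.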
coco_captions.csv — one row per caption (5,000+ rows):
| Column | Description |
|---|---|
| nsdId | NSD image ID |
| cocoId | MS-COCO image ID |
| caption_index | Caption number (1–5) |
| caption | The caption text |
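The long-format captions file can be pivoted back to one row per image, matching the caption_1 through caption_5 layout of `coco_annotations.csv`. A sketch with a fabricated two-row frame (column names follow the tables above):

```python
import pandas as pd

# Fabricated stand-in for coco_captions.csv: one row per caption.
caps = pd.DataFrame({
    "nsdId": [2950, 2950],
    "caption_index": [1, 2],
    "caption": ["a dog on a couch", "a sleeping dog"],
})

# Wide form: one row per image, one column per caption slot.
wide = caps.pivot(index="nsdId", columns="caption_index", values="caption")
wide.columns = [f"caption_{i}" for i in wide.columns]
print(wide.loc[2950, "caption_2"])  # a sleeping dog
```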
These files were generated by scripts/extract_coco_metadata.py from the
official COCO annotations_trainval2017 release. The 80 COCO object categories
span supercategories including person, vehicle, outdoor, animal, accessory,
sports, kitchen, food, furniture, electronic, appliance, and indoor.
Computational Features (viz2psy)
All 1,000 shared images are processed with viz2psy, producing a consolidated score table plus a metadata sidecar:
- `viz2psy_scores.csv` — one row per image (~2,900 columns), indexed by `filename`
- `viz2psy_scores.meta.json` — feature definitions, model versions, and provenance
Features extracted per image include:
| Model | Columns | Description |
|---|---|---|
| resmem | 1 | Image memorability score (0–1) |
| emonet | 20 | Emotion category probabilities (e.g., Awe, Joy, Fear) |
| clip | 512 | CLIP vision-language embeddings (L2-normalized) |
| dinov2 | 768 | DINOv2 self-supervised visual features |
| gist | 512 | Gabor-based spatial envelope descriptors |
| places | 467 | Scene category probabilities (365) + SUN attributes (102) |
| llstat | 17 | Low-level statistics (luminance, color, edges, spatial frequency) |
| caption | 1 | Natural language image description (BLIP) |
| saliency | 576 | Predicted fixation density on a 24x24 spatial grid |
| aesthetics | 1 | Aesthetic quality rating (1–10) |
| yolo | 85 | Object detection counts (80 COCO classes) + summary stats |
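One model's feature block can be selected from the consolidated table by column name. A sketch with a fabricated frame; the `clip_`-prefix naming is an assumption, so check the `.meta.json` sidecar for the actual column names before relying on it:

```python
import pandas as pd

# Fabricated stand-in for viz2psy_scores.csv: indexed by filename, with a
# few columns per model (the real table has ~2,900 columns).
scores = pd.DataFrame(
    {"clip_000": [0.1], "clip_001": [0.2], "resmem": [0.8]},
    index=pd.Index(["shared0001_nsd02950.png"], name="filename"),
)

# Assumed convention: each CLIP dimension is a column prefixed "clip_".
clip_cols = [c for c in scores.columns if c.startswith("clip_")]
clip_feats = scores[clip_cols]
print(clip_feats.shape)  # (1, 2)
```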
See the viz2psy documentation for full
column definitions, or consult the .meta.json sidecar.