Full Collection Results

All 294 documents — the complete North American Slave Narratives collection

About This Run

This page presents results from running the full BERTopic pipeline on the entire collection. Comparing these results against the two 10% samples shows which themes are stable across the full corpus and which emerged only in specific subsets of documents.

To reproduce this run, use the same script and setup from the Hands-On Exercise and add --sample-frac 1.0 to the command. The script code itself does not change, because the sample size is set by a command-line flag. For example:

python -u scripts/run_bertopic_sample.py \
  --sample-frac 1.0 \
  --output-dir outputs/my_run_full \
  --embedding-backend ollama \
  --ollama-embedding-model nomic-embed-text \
  --representation-backend ctfidf \
  --clustering sensitive \
  --label-backend ollama \
  --ollama-model llama3.1:latest

Sample fraction: 100%
Documents:       294
Chunks:          16,729
Topics found:    439

All CSV files and HTML visualizations for this run are in the repository under:

outputs/bertopic_full_nomic_sensitive_lemmatized/
  topic_review_table.csv
  topic_labels_llm.csv
  topic_assignments.csv    ← local only, too large to host on the site
  topic_info.csv
  visualizations/          ← 4 BERTopic charts (open in browser)
  metadata_visualizations/  ← 5 metadata charts (open in browser)

With all 294 documents, the model finds far more topics than either 10% sample, including many narrow topics tied to specific documents or authors. Some topics are broad recurring themes; others are specific to a single narrative. The same note about similar labels applies here: multiple topics with similar names (such as several “Whipping and Plantation Punishment” clusters) are distinct groups that the model kept separate based on differences in vocabulary and context.

Some topic labels in this run appear as a list of raw words rather than a descriptive phrase, for example “Ship, Sail, Fleet, Captain” or “Camel, Lion, Beast, Tree”. This happens when the local language model fails to generate a valid label for a cluster, usually because the cluster is very small, highly specific to one document, or contains unusual vocabulary. In that case the script falls back to listing the top four topic words as the label. About 45 of the 439 topics (roughly 10%) have this fallback label. These clusters are still valid groupings: to understand what a cluster contains, read its top words and the passages assigned to it in topic_assignments.csv. A human reader can then assign a better label after inspecting the passages.
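The fallback behavior described above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the function names choose_label and is_fallback_label are hypothetical, and treating an empty string as the only "invalid" label is an assumption, since the page does not say how the script validates the model's output.

```python
def choose_label(llm_label, top_words, n_words=4):
    """Return the LLM label if it is usable, else a label built from top words.

    Assumption: a label counts as unusable only when it is missing or blank.
    """
    cleaned = (llm_label or "").strip().strip('"')
    if cleaned:
        return cleaned
    # Fallback: capitalize and join the first n_words topic words.
    return ", ".join(word.capitalize() for word in top_words[:n_words])


def is_fallback_label(label, n_words=4):
    """Heuristic check: exactly n_words single words separated by commas."""
    parts = [part.strip() for part in label.split(",")]
    return len(parts) == n_words and all(part and " " not in part for part in parts)


print(choose_label("", ["ship", "sail", "fleet", "captain", "sea"]))
print(is_fallback_label("Ship, Sail, Fleet, Captain"))
print(is_fallback_label("Whipping and Plantation Punishment"))
```

A check like is_fallback_label could be used to list the topics that still need a human-written label, though a descriptive label that happens to consist of four comma-separated words would be flagged by mistake.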

Top 20 Topics by Chunk Count

Topic  Label                                         Chunks
0      Spiritual Resilience in the Face of Death     137
1      Critique of Slaveholder’s Use of Religion     113
2      Racial Identity and Prejudice                 101
3      Faith and Redemption                          95
4      Ministerial Encounters and Book Sales         93
5      Respectability and Social Standing            92
6      Authorial Intentions and Historical Context   92
7      Convict Labor and Punishment in Prisons       87
8      Whipping and Plantation Punishment            85
9      Conversion to Ministry                        85
10     African American Preachers in Antebellum Era  83
11     Encounters with Rebel Soldiers                77
12     Education and Institutional Development       76
13     Whipping and Plantation Punishment            76
14     Sancho Letters and Petitions                  75
15     Limited Rights and Protections for Slaves     74
16     Plantation Life and Labor                     73
17     Forced Sale of Family Members                 72
18     Encounters with Slave Traders                 70
19     The Incompatibility of Liberty and Slavery    68
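To put these counts in context, the short script below adds up the chunk counts from the table above and compares the total with the 16,729 chunks reported for the run. It also checks for labels that occur more than once in the top 20. The numbers are copied from this page; the variable names are only illustrative.

```python
from collections import Counter

TOTAL_CHUNKS = 16_729  # "Chunks" value from the run summary on this page

# (topic id, label, chunk count), copied from the table above
top_topics = [
    (0, "Spiritual Resilience in the Face of Death", 137),
    (1, "Critique of Slaveholder’s Use of Religion", 113),
    (2, "Racial Identity and Prejudice", 101),
    (3, "Faith and Redemption", 95),
    (4, "Ministerial Encounters and Book Sales", 93),
    (5, "Respectability and Social Standing", 92),
    (6, "Authorial Intentions and Historical Context", 92),
    (7, "Convict Labor and Punishment in Prisons", 87),
    (8, "Whipping and Plantation Punishment", 85),
    (9, "Conversion to Ministry", 85),
    (10, "African American Preachers in Antebellum Era", 83),
    (11, "Encounters with Rebel Soldiers", 77),
    (12, "Education and Institutional Development", 76),
    (13, "Whipping and Plantation Punishment", 76),
    (14, "Sancho Letters and Petitions", 75),
    (15, "Limited Rights and Protections for Slaves", 74),
    (16, "Plantation Life and Labor", 73),
    (17, "Forced Sale of Family Members", 72),
    (18, "Encounters with Slave Traders", 70),
    (19, "The Incompatibility of Liberty and Slavery", 68),
]

# Share of all chunks that fall into the 20 largest topics
top_chunks = sum(count for _, _, count in top_topics)
share = top_chunks / TOTAL_CHUNKS
print(f"Top 20 topics: {top_chunks} chunks ({share:.1%} of all chunks)")

# Labels that appear more than once among the top 20
label_counts = Counter(label for _, label, _ in top_topics)
repeated = {label: n for label, n in label_counts.items() if n > 1}
print(repeated)
```

The 20 largest topics together hold 1,724 chunks, about 10.3% of the total, so even the biggest topics are small relative to the corpus. The only repeated label in the top 20 is “Whipping and Plantation Punishment”, which appears as both topic 8 and topic 13.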

Top Topics in This Run

The labels below are the 20 largest topics from the full collection. Compare them against the first sample and second sample to see which themes appear across multiple runs and which are unique to specific documents. Download the full CSV for all 439 topics.

  • Spiritual Resilience in the Face of Death
  • Critique of Slaveholder’s Use of Religion
  • Racial Identity and Prejudice
  • Faith and Redemption
  • Ministerial Encounters and Book Sales
  • Respectability and Social Standing
  • Authorial Intentions and Historical Context
  • Convict Labor and Punishment in Prisons
  • Whipping and Plantation Punishment
  • Conversion to Ministry
  • African American Preachers in Antebellum Era
  • Encounters with Rebel Soldiers
  • Education and Institutional Development
  • Whipping and Plantation Punishment
  • Sancho Letters and Petitions
  • Limited Rights and Protections for Slaves
  • Plantation Life and Labor
  • Forced Sale of Family Members
  • Encounters with Slave Traders
  • The Incompatibility of Liberty and Slavery
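One rough way to carry out the comparison across runs is to treat each run's labels as a set and look at the overlap. The sketch below is an assumption about how a reader might do this, not part of the project's scripts: the helper names are hypothetical, and the sample list contains a placeholder entry rather than real results from either 10% sample.

```python
def normalize(label):
    """Lowercase and collapse whitespace so trivial differences do not block a match."""
    return " ".join(label.lower().split())


def compare_labels(full_labels, sample_labels):
    """Split labels into those shared by both runs and those unique to each run."""
    full = {normalize(label) for label in full_labels}
    sample = {normalize(label) for label in sample_labels}
    return {
        "shared": sorted(full & sample),
        "full_only": sorted(full - sample),
        "sample_only": sorted(sample - full),
    }


# Three labels taken from the list above
full_labels = [
    "Faith and Redemption",
    "Plantation Life and Labor",
    "Encounters with Slave Traders",
]
# PLACEHOLDER sample labels for illustration only, not real results
sample_labels = ["faith and  redemption", "Placeholder Topic Label"]

result = compare_labels(full_labels, sample_labels)
for group, labels in result.items():
    print(group, labels)
```

Because the labels are generated by a language model, the same theme can receive differently worded labels in different runs. Exact matching therefore undercounts stable themes, and the topic words and passages remain the more reliable basis for comparison.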

Download the Results

  • Download topic review table
  • Download topic labels
  • Download topic info

topic_assignments.csv for the full collection is too large to host on the site (over 100 MB). It is available locally after running the pipeline; see the folder path above.

Visualizations

Topic Shares by Publication Decade

Topic-by-Decade Heatmap

Document-by-Topic Heatmap

Sample Documents Timeline

Topic Prevalence Over Time (Area Chart)

Topic Word Bars

Topic Hierarchy

Topic Cluster Map