A digital historical archive platform that tracks international incidents by combining factual institutional sources with oral history from social media. First case study: Iran's Woman, Life, Freedom movement.
March 2026|Research Phase|v0.1 Spike
4 Source Tiers | 10 Case Events | 15+ APIs Evaluated | 5 Pipeline Stages
What This Spike Proves
Chronicle's methodology is portable, technically feasible, and addresses a real gap in digital archiving. The Iran case study demonstrates that systematic capture and comparison of divergent narratives produces historical understanding that no single source type achieves alone. An MVP is buildable in 8-12 weeks at under $200/month infrastructure cost.
The Core Problem
Historical truth is not singular. Every international incident generates multiple, often contradictory narratives across institutional media, local reporting, government statements, and citizen testimony. Traditional archives privilege institutional sources. Social media captures what institutional media cannot: the lived experience of events, the immediate emotional register, the details that editorial processes filter out.
Chronicle treats social media and citizen journalism as first-class historical data, subject to rigorous but distinct verification protocols. A shaky mobile phone video from a Tehran protest carries evidentiary weight that no wire service summary can replicate.
01 / Methodology
Source Classification and Verification
Chronicle uses a four-tier source taxonomy that ranks sources not by inherent trustworthiness, but by institutional accountability and editorial oversight. A Tier 4 source (citizen video) may capture ground truth that Tier 1 sources miss entirely. The tier system determines verification protocols, not value.
Tier 1: Institutional Record
Wire services (Reuters, AP, AFP), official government statements, UN reports and resolutions, ICJ rulings, treaty texts.
Verification: cross-reference across 2+ independent sources
Tier 2: Established Journalism
Major international outlets (BBC, Al Jazeera, NYT), established investigative organizations (Bellingcat, OCCRP), specialist publications.
Verification: document outlet, byline, sourcing methodology
Tier 3: Regional / Independent
Local news outlets, independent journalist platforms, specialist blogs, diaspora media (Iran International, Manoto TV).
Verification: check track record, funding model, independence
Tier 4: Citizen Testimony
Social media posts (Twitter/X, Telegram, TikTok), citizen journalism, oral history interviews, community forum discussions.
Each archived claim receives a corroboration score based on source independence, diversity, temporal consistency, and methodological transparency.
Level | Criteria | Meaning
Confirmed | 3+ independent sources across 2+ tiers | High confidence, multiple corroboration
Probable | 2 independent sources, or multiple within one tier with transparency | Likely accurate, further corroboration welcome
Reported | Single-source or same-tier-only corroboration | Documented but flagged for verification
Contested | Sources actively contradict each other | Contradictions become part of the record
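The criteria above can be read as a small decision function. The following is a minimal sketch of one possible reading, not Chronicle's actual scoring code: the `Source` fields `origin` and `transparent` are hypothetical names introduced for illustration, and the rule that two independent sources must span two tiers to count as Probable is an interpretation of the table rather than something it states outright.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    CONFIRMED = "Confirmed"
    PROBABLE = "Probable"
    REPORTED = "Reported"
    CONTESTED = "Contested"


@dataclass(frozen=True)
class Source:
    name: str
    tier: int                  # 1-4, per the source taxonomy
    origin: str                # hypothetical: sources sharing an origin are not independent
    transparent: bool = False  # hypothetical: sourcing methodology is documented


def corroboration_level(sources: list[Source], contradicted: bool = False) -> Level:
    """Map the sources supporting a claim to a corroboration level."""
    if not sources:
        raise ValueError("a claim needs at least one source")
    if contradicted:
        return Level.CONTESTED
    # Keep one source per origin so syndicated copies are not counted twice.
    independent = list({s.origin: s for s in sources}.values())
    tiers = {s.tier for s in independent}
    if len(independent) >= 3 and len(tiers) >= 2:
        return Level.CONFIRMED
    if len(independent) >= 2 and len(tiers) >= 2:
        return Level.PROBABLE
    # Assumed reading of "multiple within one tier with transparency".
    if len(independent) >= 2 and all(s.transparent for s in independent):
        return Level.PROBABLE
    return Level.REPORTED
```

Under this reading, two citizen posts with no documented methodology stay at Reported, while a wire report plus a single citizen video already reaches Probable.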
Named after Akira Kurosawa's Rashomon, which explores subjective truth through conflicting testimony, the Rashomon Protocol governs how Chronicle presents contradictory accounts:
Present all accounts with full attribution and tier classification.
Document the specific points of agreement and disagreement.
Provide the corroboration matrix for each claim.
Note structural reasons for divergence (government interest in minimizing casualties, opposition interest in maximizing them).
Never collapse contradictory accounts into a single "true" narrative unless the evidence overwhelmingly supports one account.
Contradictions are not failures of the archive. They are primary data.
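The first two steps of the protocol (full attribution, then explicit points of agreement and disagreement) could be sketched as below. This is an illustrative sketch under assumptions: the `Account` structure and the claim keys are hypothetical, and the sample values in the usage note are taken from the Mahsa Amini case described in this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Account:
    attribution: str        # who made the claim
    tier: int               # source tier, 1-4
    claims: dict[str, str]  # hypothetical claim key -> asserted value


def compare_accounts(accounts: list[Account]) -> dict[str, dict]:
    """Split claim keys into agreement, disagreement, and single-account claims.

    Nothing is merged or discarded: every value keeps its attribution.
    """
    agreement, disagreement, single_account = {}, {}, {}
    for key in sorted({k for a in accounts for k in a.claims}):
        by_value: dict[str, list[str]] = {}
        for a in accounts:
            if key in a.claims:
                label = f"{a.attribution} (Tier {a.tier})"
                by_value.setdefault(a.claims[key], []).append(label)
        if len(by_value) > 1:
            disagreement[key] = by_value               # accounts contradict each other
        elif len(next(iter(by_value.values()))) > 1:
            agreement[key] = next(iter(by_value))      # two or more accounts, one value
        else:
            single_account[key] = by_value             # asserted by one account only
    return {
        "agreement": agreement,
        "disagreement": disagreement,
        "single_account": single_account,
    }
```

For example, a state TV account and an eyewitness post that both name the morality police as the detaining body but give "heart failure" and "blunt force trauma to the head" as cause of death would yield one point of agreement and one attributed disagreement, with neither value dropped.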
Platform algorithms do not neutrally distribute content. Chronicle tracks five distortion vectors:
Virality bias: emotionally provocative content amplified over nuanced testimony
Language bias: English-language content receives disproportionate algorithmic promotion
Platform-specific selection: each platform's architecture favors certain content types
Temporal compression: complex, multi-day events compressed into single viral moments
Bot and state-actor amplification: coordinated inauthentic behavior inflates certain narratives
Chronicle does not attempt to reconstruct an "undistorted" information landscape. Instead, it documents engagement metrics, flags coordinated amplification, weights source diversity over engagement volume, and preserves low-engagement content with equivalent archival priority.
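One simple way to flag possible coordinated amplification is to look for the same text posted by many distinct accounts within a short window. The sketch below is a deliberately naive heuristic under assumed thresholds (`min_accounts`, `window`); it is not a description of Chronicle's detection method, and a flag would only mark content for review. Engagement is stored as metadata and plays no part in the check.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Post:
    account: str
    text: str
    posted_at: datetime
    engagement: int = 0  # documented as metadata, never used for archival priority


def flag_coordinated(posts, min_accounts=5, window=timedelta(minutes=10)):
    """Return normalized texts posted by >= min_accounts distinct accounts within `window`."""
    by_text = defaultdict(list)
    for post in posts:
        # Normalize case and whitespace so trivial variations group together.
        by_text[" ".join(post.text.lower().split())].append(post)

    flagged = set()
    for text, group in by_text.items():
        group.sort(key=lambda p: p.posted_at)
        start = 0
        for end in range(len(group)):
            # Slide the left edge forward until the span fits inside the window.
            while group[end].posted_at - group[start].posted_at > window:
                start += 1
            accounts = {p.account for p in group[start:end + 1]}
            if len(accounts) >= min_accounts:
                flagged.add(text)
                break
    return flagged
```

Identical wording alone is weak evidence (slogans spread organically too), so a real pipeline would presumably combine this with account age, posting cadence, and human review.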
Chronicle builds on established work across digital humanities, computational journalism, and crisis informatics:
Kate Starbird (University of Washington): crisis informatics, misinformation propagation
Syrian Archive / Mnemonic: systematic social media collection for human rights documentation
Internet Archive: web-scale content preservation, timestamping, versioning
USC Shoah Foundation: systematic personal testimony collection at scale
September 11 Digital Archive (GMU): born-digital materials as primary historical sources
Stanford Internet Observatory: state-sponsored information operations, Iranian influence campaigns
The overriding ethical obligation: the archive must not become a tool of the repression it documents. Default to source protection over archival completeness. A gap in the record is recoverable; a source endangered by the archive is not.
02 / Technical Architecture
Data Pipeline and Infrastructure
Chronicle's architecture follows five stages: Ingest, Classify, Verify, Store, Visualize. Each stage is independently scalable and failure-tolerant.
01 Ingest: Kafka/Redis message queue, batch + real-time, raw data lake
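To make the stage boundaries concrete, here is a minimal in-memory sketch of the first four stages. It assumes a `deque` as a stand-in for the Kafka/Redis queue and plain lists for the raw data lake and the store; the `classify` and `verify` rules are placeholders, and the Visualize stage is omitted. The point it illustrates is failure tolerance: one bad item is routed to a dead-letter list without stopping the run, and the raw copy is kept regardless.

```python
from collections import deque


def classify(item: dict) -> dict:
    # Placeholder rule: wire copy is Tier 1, everything else is Tier 4.
    item["tier"] = 1 if item["kind"] == "wire" else 4
    return item


def verify(item: dict) -> dict:
    if not item.get("text"):
        raise ValueError("empty item")
    item["level"] = "Reported"  # a single new item starts as single-source
    return item


def run_pipeline(raw_items):
    queue = deque(raw_items)  # stand-in for the Kafka/Redis message queue
    raw_lake, store, dead_letter = [], [], []
    while queue:
        item = queue.popleft()
        raw_lake.append(dict(item))  # Ingest: keep an untouched raw copy first
        try:
            # Classify, Verify, Store operate on a copy of the raw item.
            store.append(verify(classify(dict(item))))
        except (KeyError, ValueError) as exc:
            dead_letter.append((item, str(exc)))  # isolate the failure, keep going
    return store, raw_lake, dead_letter
```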
Key design decisions: source tiers (1-4) are stored directly on each source record, encoding institutional accountability rather than inherent trustworthiness. Array columns for source_ids avoid excessive join tables. Generated tsvector columns enable weighted full-text search, with weight 'A' for titles and weight 'B' for summaries.
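The effect of weighting titles above summaries can be illustrated with a toy ranking in plain Python. This is not PostgreSQL's ranking function and does no stemming or normalization; it only borrows the default numeric values PostgreSQL's `ts_rank` assigns to weight labels A (1.0) and B (0.4). `EventRecord` is a hypothetical stand-in for a table row, with `source_ids` as a list in place of an array column.

```python
from dataclasses import dataclass, field

# Default ts_rank values for weight labels A and B in PostgreSQL.
WEIGHTS = {"A": 1.0, "B": 0.4}


@dataclass
class EventRecord:
    title: str
    summary: str
    tier: int
    source_ids: list[int] = field(default_factory=list)  # analogue of an array column


def score(record: EventRecord, query: str) -> float:
    """Toy weighted match: count query terms in the title (A) and summary (B)."""
    title_words = record.title.lower().split()
    summary_words = record.summary.lower().split()
    return sum(
        WEIGHTS["A"] * title_words.count(term) + WEIGHTS["B"] * summary_words.count(term)
        for term in query.lower().split()
    )


def search(records: list[EventRecord], query: str) -> list[EventRecord]:
    """Return matching records, best score first."""
    hits = [(score(r, query), r) for r in records]
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return [record for s, record in hits if s > 0]
```

A record with the query term in its title therefore outranks one that mentions it only in the summary, which is the behavior the 'A' and 'B' weights are meant to produce.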
Infrastructure Cost Comparison
Component | Self-Hosted | Cloud (AWS)
Compute | $50-100/mo (Hetzner) | $150-300/mo (EC2)
Database | Included (PostgreSQL) | $50-100/mo (RDS)
Object Storage | $5/mo (MinIO) | $15-23/mo per TB
Search | Included (Meilisearch) | $80+/mo (OpenSearch)
Message Queue | Included (Redis) | $50+/mo (ElastiCache)
Total | $50-150/mo | $350-600/mo
03 / UX Concepts
Information Architecture and Reader Journeys
Chronicle serves four distinct personas, each with different expertise and tolerance for complexity. The archive must serve all four without forcing any into a workflow designed for another.
Entry: Direct search for a specific incident, or timeline browsing around a known date.
Pain points: Archives without clear chronological ordering. No way to see what changed between early and late reporting.
General Public / Students
Needs: context, guided narrative, accessible language. Visual timeline exploration without needing to understand source taxonomy.
Entry: Social media link, news article embed, search engine result.
Pain points: Archives that assume domain expertise. No clear starting point for unfamiliar users.
Human Rights Investigator
Needs: evidence chains, geolocation data, chain-of-custody documentation, structured exports (JSON, CSV) for integration with investigation tools.
Entry: Direct navigation or API query from existing investigation tools.
Pain points: Archives without original metadata. No chain-of-custody. Export formats incompatible with investigative toolchains.
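For the investigator persona, a structured export with basic chain-of-custody fields might look like the sketch below. The field names (`event_id`, `source_url`, `captured_at`, `sha256`) are assumptions for illustration, not a defined Chronicle schema; the content hash lets a downstream tool check that an archived file has not changed since capture.

```python
import csv
import hashlib
import io
import json


def custody_record(event_id: str, source_url: str, content: bytes, captured_at: str) -> dict:
    """Build one export row with a content hash for integrity checks."""
    return {
        "event_id": event_id,
        "source_url": source_url,
        "captured_at": captured_at,  # ISO 8601 timestamp of capture
        "sha256": hashlib.sha256(content).hexdigest(),
    }


def export_json(records: list[dict]) -> str:
    # ensure_ascii=False keeps Persian text readable in the output.
    return json.dumps(records, ensure_ascii=False, indent=2)


def export_csv(records: list[dict]) -> str:
    """Write records sharing the same keys to CSV text with a header row."""
    if not records:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0]))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()
```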
Four Navigation Modes
Mode | Mental Model | Best For
By Timeline | "What happened when?" | Journalists, general readers
By Event | "Tell me about this specific incident" | Researchers, investigators
By Source Type | "What did [source category] report?" | Academics, media analysts
By Narrative Theme | "How did this story evolve?" | Long-form readers, students
The timeline is the primary navigation surface, following Shneiderman's mantra: "overview first, zoom and filter, then details on demand."
Zoom Levels
Level | Granularity | Best For
Year | One row per month, density only | Seeing the full arc
Month | One row per week, collapsed cards | Identifying busy periods
Week | One row per day, expanded cards | Reading event summaries
Day | Hourly axis, full detail | Reconstructing a single day
Additional patterns: parallel timelines for comparing source categories side-by-side, branching timelines for events that spawn multiple narrative threads, and density indicators (sparklines) showing source volume without requiring individual event inspection.
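The zoom levels and density indicators amount to bucketing timestamps at different granularities and counting per bucket. A minimal sketch, assuming ISO weeks for the month-level rows (the document does not say how weeks are delimited):

```python
from collections import Counter
from datetime import datetime


def bucket_key(ts: datetime, zoom: str) -> str:
    """Return the row label a timestamp falls into at a given zoom level."""
    if zoom == "year":   # one row per month
        return ts.strftime("%Y-%m")
    if zoom == "month":  # one row per week (ISO week numbering assumed)
        iso_year, iso_week, _ = ts.isocalendar()
        return f"{iso_year}-W{iso_week:02d}"
    if zoom == "week":   # one row per day
        return ts.strftime("%Y-%m-%d")
    if zoom == "day":    # hourly axis
        return ts.strftime("%Y-%m-%d %H:00")
    raise ValueError(f"unknown zoom level: {zoom}")


def density(timestamps: list[datetime], zoom: str) -> Counter:
    """Count sources per row; these counts can feed a density indicator."""
    return Counter(bucket_key(ts, zoom) for ts in timestamps)
```

Because every level is derived from the same timestamps, zooming never changes the underlying record, only how it is grouped.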
All interactive elements are keyboard accessible following W3C WAI-ARIA practices. Both modes meet WCAG 2.1 AA contrast requirements. Source tier indicators use both color and text labels, never color alone.
Mobile follows thumb-zone navigation principles (Hoober, 2013): critical controls at screen bottom, full-width card layout, progressive disclosure through three depth levels (summary, detail, full source).
Offline support via Service Worker caching for researchers in low-connectivity environments, following the Progressive Web App patterns established by the Financial Times.
04 / Visual Design
Design System and Components
The visual language is built on three references: Linear (clean surfaces, subtle borders, focused interactions), Vercel (precise typography, generous whitespace, monospace accents), and NYT interactive features (layered information reveal, scroll-driven narrative).
Source Tier Visual Language
The credibility gradient is the most important visual system. Colors follow a traffic light metaphor: green (verified) to red (scrutinize). This is a methodological transparency tool, not a value judgment on oral history.
Clean UI font paired with data-appropriate monospace (reference: Vercel docs)
4px spacing base: Tight for data density, flexible for generous layouts (reference: Linear spacing)
Green-to-red tier colors: Traffic light metaphor is universally understood (reference: Academic convention)
Vertical timeline: Natural for scrolling interfaces, chronological reading (reference: NYT interactives)
Split-panel comparison: Direct juxtaposition more effective than tabs for analysis (reference: Academic comparison tools)
Component Samples
Below are live rendered components from the Chronicle design system, demonstrating how event cards, source cards, and oral history elements appear in context.
Event Card
2022-09-16
Death of Mahsa Amini in Morality Police Custody
Tehran, Iran
47 sources
3 narratives
Oral History Element
"They are shooting at us from the rooftops. My neighbor was hit. We cannot leave."
Shared by @anonymous, Sep 21 2022 via Twitter (now deleted) | Geolocation confirmed (Sanandaj)
Corroborated by 2 independent sources | Archived Sep 21 2022 via Internet Archive
05 / Case Study
Iran: Why This Context
Iran serves as Chronicle's first case study because it presents every challenge a digital historical archive must solve, concentrated in a single context. The country generates a dense, multilingual, multi-platform information ecosystem where state media, international press, diaspora outlets, and citizen journalists produce fundamentally divergent accounts of the same events.
If the Chronicle methodology works for Iran, it works anywhere.
Government internet shutdowns erase digital evidence in real time. Platform algorithms amplify certain narratives while burying others. State-sponsored disinformation campaigns operate alongside genuine grassroots testimony.
Three Intersecting Crises (2022-2025)
Domestic Legitimacy Crisis
The Woman, Life, Freedom movement: 500+ killed (Iran Human Rights), 22,000+ arrested (Amnesty International). The most sustained anti-government protests since 1979. By mid-2023, street protests diminished under repression, but underlying grievances continued in labor strikes and civil disobedience.
Nuclear Standoff
In early 2023 the IAEA confirmed finding uranium particles enriched to 83.7% purity, just short of weapons-grade. JCPOA revival talks stalled. Iran removed IAEA monitoring cameras in June 2022. The nuclear question became inseparable from regional security.
Regional Proxy Conflicts
Iran's support for Hamas, Hezbollah, Houthis, and Iraqi militias placed Tehran at the center of multiple conflicts. This culminated in Iran's unprecedented direct missile and drone attack on Israel in April 2024, the first open military strike between the two states.
Source Density: September 2022 Peak
Sep 2022 | 312 src
Oct 2022 | 248 src
Nov 2022 | 189 src
Dec 2022 | 134 src
Jan 2023 | 98 src
Feb 2023 | 76 src
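Monthly counts like these can be rendered as a text sparkline, one block character per month. The following is a small sketch that scales each count between the series minimum and maximum; the eight-level block set is a common convention, not something the design system specifies.

```python
BLOCKS = "▁▂▃▄▅▆▇█"


def sparkline(counts: list[int]) -> str:
    """Render counts as block characters scaled between the series min and max."""
    if not counts:
        return ""
    low, high = min(counts), max(counts)
    span = (high - low) or 1  # avoid division by zero for a flat series
    top = len(BLOCKS) - 1
    return "".join(BLOCKS[round((c - low) / span * top)] for c in counts)


# Source counts from September 2022 through February 2023.
monthly_sources = [312, 248, 189, 134, 98, 76]
print(sparkline(monthly_sources))
```

Scaling to the series minimum exaggerates the decline visually; scaling from zero would be the more conservative choice if absolute volume matters.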
06 / Iran Timeline
Key Events
Ten documented events demonstrating how Chronicle maps incidents across source tiers, identifies narrative divergences, and documents engagement distortions.
2022-09-16
Death of Mahsa (Jina) Amini
22-year-old Kurdish-Iranian woman died in Kasra Hospital after detention by morality police. Her death triggered the Woman, Life, Freedom movement, the largest anti-government protests since 1979.
humanitarian | 47 sources | Confirmed
● Official Account
State TV (IRIB): "Ms. Amini suffered a heart attack at the guidance patrol station. She had pre-existing heart conditions."
Discrepancy
Cause of death: "heart failure"
● Oral History / Social
"She was beaten in the van. I saw bruises on her when they carried her out. No heart attack. This is a lie."
Corroborated by 14 independent accounts on Twitter and Telegram
Discrepancy
Cause of death: "blunt force trauma to the head"
Engagement distortion: Initial Farsi-language testimony received modest engagement (hundreds). The story reached global virality only after English-language diaspora accounts reframed it. The most-preserved testimony is English-language commentary, not the Farsi-language local testimony that documented events first.
2022-09-30
Bloody Friday in Zahedan
Security forces opened fire on protesters at Makki Mosque. At least 96 killed (Iran Human Rights), making it the single deadliest day of the 2022-2023 protests. Received dramatically less international coverage than Tehran protests despite higher death toll.
military_action | protest | Confirmed
Narrative divergence (scope contradiction): State media reported a "terrorist attack on a police station." International media reported a massacre of unarmed protesters. The actual event involved both elements: a separate police station attack was used to justify the lethal crackdown on the mosque protest.
Engagement distortion: Coverage was suppressed by geographic bias (no bureau presence), ethnic marginalization (Baloch communities underrepresented in diaspora), platform infrastructure (more complete internet shutdowns), and algorithmic suppression (graphic content removed).
2022-09-21
Internet Shutdowns Begin
Near-total internet shutdowns documented by NetBlocks, Cloudflare Radar, and OONI. Mobile data cut nationwide, fixed-line throttled. The most direct threat to Chronicle's methodology: when the state eliminates Tier 4 sources in real time.
policy | humanitarian | Confirmed
Archival significance: Internet shutdowns skew the historical record toward state-controlled Tier 1 sources that continued operating domestically. Chronicle must capture content continuously (not retrospectively), handle latency-tolerant ingest for queued messages, and document gaps as data points rather than neutral absences.
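Documenting gaps as data points could start with something as simple as scanning capture timestamps for long silences and storing each one as its own record. This is a sketch under an assumed threshold (`max_silence`) and a hypothetical record shape; a real system would also need a baseline of expected volume per source, since quiet periods are not always shutdowns.

```python
from datetime import datetime, timedelta


def find_gaps(capture_times: list[datetime], max_silence=timedelta(hours=1)) -> list[dict]:
    """Return one gap record for every silence longer than `max_silence`."""
    ordered = sorted(capture_times)
    gaps = []
    for earlier, later in zip(ordered, ordered[1:]):
        if later - earlier > max_silence:
            gaps.append({
                "kind": "capture_gap",  # archived as an event, not left as an absence
                "start": earlier,
                "end": later,
                "duration": later - earlier,
            })
    return gaps
```

Each gap record can then be cross-referenced against shutdown measurements such as those from NetBlocks, Cloudflare Radar, or OONI to indicate whether the silence was imposed.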
2022-12-08
Execution of Mohsen Shekari
First known execution of a protester. Convicted of "moharebeh" (enmity against God) in proceedings Amnesty International documented as a "sham trial": no legal counsel, forced confessions, under one hour.
legal | humanitarian | Contested
Narrative divergence: State framed Shekari as a violent criminal convicted through due process. Opposition narrative, supported by international legal observers, documented systematic fair trial violations. Chronicle classifies "due process" as Contested: UN reports, Amnesty documentation, and legal analysis contradict the claim.
2023-2024
Expansion of Morality Police Enforcement
Return of morality police patrols (July 2023), AI-powered surveillance cameras for hijab detection, more restrictive legislation (September 2023). Iranian women created a citizen counter-surveillance network via Telegram.
policy | ongoing | Confirmed
Engagement distortion (temporal compression): Platform algorithms favor discrete, dramatic events over slow-burn developments. The morality police's return was a gradual, months-long escalation. Sporadic viral moments (a woman confronting an officer) received attention; the systematic pattern was legible only through aggregation. Chronicle's multi-zoom timeline is specifically designed to make slow-burn patterns visible.
2024-03-01
Contested Parliamentary Elections
41% turnout, the lowest in the Islamic Republic's history. Guardian Council disqualified the majority of reformist candidates. Social media documented both crowded and empty polling stations, often filming the same stations at different times of day.
election | Contested
The turnout dispute encapsulates the entire case study: opposition analysts argue actual turnout was lower (citing blank and voided ballots included in the count), while state media emphasized that 41% still represented "tens of millions." Chronicle archives the official figure, the contested analyses, and social media documentation of both crowded and empty stations.
2024-04-13
Iran's Missile and Drone Attack on Israel
"Operation True Promise": 300+ drones, cruise missiles, and ballistic missiles. First open military strike between the two states. Satellite imagery showed some impact craters at Nevatim airbase. Neither "total success" nor "total failure" is accurate.
military_action | diplomatic | Contested
Engagement distortion (real-time virality outpacing verification): Early viral posts included recycled Syria footage, video game clips presented as real interceptions, and AI-generated imagery. Twitter/X Community Notes flagged some misinformation but could not keep pace. The most viral content was often the least accurate.
Confirmed: Iran communicated timing through diplomatic intermediaries, suggesting a "managed escalation" designed to demonstrate capability without triggering full-scale war.
2024-05-19
Death of President Raisi in Helicopter Crash
President Raisi's helicopter crashed near the Azerbaijan border in fog. Three competing narratives: mechanical failure (most supported by Tier 1-2), deliberate sabotage, or pilot error compounded by weather. Public reaction was genuinely split: state media showed massive funerals, social media documented celebrations.
event | diplomatic | Probable
TikTok's algorithm created divergent content feeds based on user engagement patterns: Iranian users and diaspora users saw fundamentally different representations of public reaction. Chronicle documents both without adjudicating which was "more representative."
2023-2024
Labor Protests and Teacher Strikes
Recurring strikes across Tehran, Isfahan, Ahvaz, Tabriz. ILO cited persistent violations. Teachers' union leaders imprisoned. Received a fraction of the social media engagement of the Mahsa Amini protests despite affecting more people over longer periods.
protest | humanitarian | Confirmed
Engagement distortion (narrative-framing bias): The Amini protests were framed as women's rights and freedom, which resonated with Western audiences and algorithms. Labor protests, framed around economic grievances and union rights, received far less engagement. This reflects the structural preferences of global platforms for certain narrative frames.
2023-2024
Houthi Attacks on Red Sea Shipping
Iran-linked Houthi attacks on commercial shipping. UN Panel of Experts documented Iranian weapons transfers. The Iran connection illustrates interpretive contradiction: all agree Houthis conduct attacks; sources diverge on Iran's operational role.
military_action | geopolitical | Confirmed
Four framings documented: US/UK (Iranian proxy), Iran (independent Houthi decisions), Houthi leadership (autonomous, Gaza solidarity), and academic analysis (more complex than "proxy"). Chronicle documents all four, noting the evidentiary basis for each.
07 / Analysis
Cross-Event Engagement Distortion
Comparing engagement patterns across all ten events reveals systematic distortions that any Iran-focused archive must account for.
Narrative Comparison: Mahsa Amini
This demonstrates how Chronicle renders divergent accounts of the same event, applying the Rashomon Protocol.
● Official Account (Tier 1-2)
State TV (IRIB): "Ms. Amini suffered a heart attack at the guidance patrol station. She had pre-existing heart conditions."
IRNA, Fars News confirm official statement. Medical report released by coroner's office (2022-10-07) concludes "heart failure."
Key claim
Cause of death: pre-existing heart condition. No evidence of physical abuse during custody.
Assessment: State produced no independent medical evidence. No access granted for independent investigation.
● Citizen Accounts (Tier 3-4)
Family testimony (BBC Persian): "She had no pre-existing conditions." Amini's cousin posted video from outside Kasra Hospital. Initially 342 likes; later 45,200.
Cause of death: blunt force trauma. Beaten in custody van. Delayed medical attention (2-hour gap documented).
Assessment: Probable. Leaked CT scan consistent with head trauma. Multiple independent witness accounts. Family testimony. Hospital admission timeline confirmed by Tier 1 sources.
Systematic Distortion Patterns
Events generating English-language viral content (Amini's death, Israel strike, Raisi's crash) received 10-100x more global engagement than events documented primarily in Farsi (labor strikes, Zahedan massacre, morality police enforcement). Chronicle's correction: weight source diversity and language diversity equally.
Tehran-centered events dominate because of international bureau presence, internet infrastructure, and diaspora connections. Events in peripheral provinces (Sistan-Baluchestan, Khuzestan, Kurdistan) are systematically under-documented despite often involving greater violence. Chronicle penalizes event records that lack provincial testimony.
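The two corrections described above (equal weight for source and language diversity, and a penalty for records without provincial testimony) could be combined in a score like the following. This is a sketch with assumed parameters: `language_target`, `provincial_penalty`, and the `provincial` flag are hypothetical, and the document does not specify how the penalty is applied. Engagement is deliberately not an input.

```python
def diversity_score(sources, tier_count=4, language_target=2, provincial_penalty=0.5):
    """Score an event record from 0.0 to 1.0 by who documented it, not how far it spread."""
    if not sources:
        return 0.0
    # Share of the four source tiers represented in the record.
    tier_diversity = len({s["tier"] for s in sources}) / tier_count
    # Share of the target number of languages represented, capped at 1.0.
    languages = {s["language"] for s in sources}
    language_diversity = min(len(languages), language_target) / language_target
    # Equal weight for source diversity and language diversity.
    score = 0.5 * tier_diversity + 0.5 * language_diversity
    # Assumed form of the penalty: halve the score when no provincial testimony exists.
    if not any(s.get("provincial", False) for s in sources):
        score *= provincial_penalty
    return score
```

A low score is best read as a signal that a record needs more collection effort, not that the event matters less.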
Platform algorithms compress sustained developments into discrete viral moments. The four-level timeline zoom is designed to counteract this: sustained developments are archived as linked event chains, not isolated incidents.
Content removals by platforms create gaps that mirror state censorship. Chronicle archives platform moderation actions as events. A removed TikTok video is logged with removal date, stated reason, and cached content.
Replicability: Iran Validates the Methodology
The Iran case study validates core design choices:
The four-tier taxonomy maps cleanly to Iran's media ecosystem and would map with equal clarity to Sudan, Ukraine, or any other context.
The engagement distortion framework identified Iran-specific patterns (Farsi vs. English, Tehran-centrism) with direct parallels elsewhere.
The Rashomon Protocol proved essential: every major event generated fundamentally divergent narratives.
Source protection protocols were stress-tested against Iran's specific threat model and need contextual adaptation, not methodological redesign.
08 / Feasibility
Build Assessment
Buildable Now
Core Archive: PostgreSQL + Static Site
Schema, timeline visualization, source linking. Straightforward web development with mature tools.
Estimate: 4-8 weeks, solo developer
Data Integration: GDELT + ACLED Ingest
Both offer well-documented, free APIs with structured data. Ingest pipelines buildable in days.
Estimate: 1-2 weeks
Search: PostgreSQL Full-Text
Covers 80% of search needs without additional infrastructure. Multilingual with custom dictionaries.
Estimate: 1-2 weeks
Needs Development
Highest Effort: Social Media Ingest
Twitter/X API costs are high ($5K/mo), platform reliability declining. Telegram scraping requires careful engineering. This is the highest-effort component.
Estimate: 4-6 weeks + ongoing maintenance
NLP Challenge: Persian Text Pipeline
Off-the-shelf NER has limited Persian accuracy. Fine-tuning requires labeled training data that does not exist yet.
Estimate: 6-8 weeks including data labeling
Core Value Prop: Verification Pipeline
Automated cross-referencing and confidence scoring. Requires temporal reasoning, independence checking, and disinformation handling.
Estimate: 4-6 weeks for V1, ongoing refinement
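Independence checking is the piece of the verification pipeline that is easiest to get wrong: an outlet republishing wire copy is not a second source. One way to sketch it is to record, for each source, which source it was derived from (a hypothetical `derived_from` mapping) and count only distinct root origins.

```python
def root_origin(source_id: str, derived_from: dict[str, str]) -> str:
    """Follow derived_from links back to the earliest known origin."""
    seen = set()
    while source_id in derived_from and source_id not in seen:
        seen.add(source_id)  # guard against accidental cycles in the mapping
        source_id = derived_from[source_id]
    return source_id


def independent_sources(source_ids: list[str], derived_from: dict[str, str]) -> set[str]:
    """Collapse sources that trace back to the same origin."""
    return {root_origin(s, derived_from) for s in source_ids}
```

With a wire report, an outlet that republished it, a blog quoting that outlet, and an unrelated eyewitness post, this yields two independent sources rather than four. Establishing the `derived_from` links in the first place is the hard part and would likely need human review.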
Technical Risks
API instability: Twitter/X pricing has changed repeatedly. Mitigation: treat as one input among many, archive aggressively.
Persian NLP accuracy: low-resource language processing. Mitigation: human-in-the-loop verification for Persian content.
Data volume surges: conflict events can generate millions of posts in weeks. Mitigation: Kafka/Redis buffering with raw data lake as safety net.
Legal and ethical risks: consent, source safety, data retention. Mitigation: anonymization features, consultation with digital rights organizations (EFF, Access Now).
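As one example of an anonymization feature, account handles can be replaced with keyed pseudonyms before anything is stored, so records from the same source stay linkable without the handle appearing in the archive. The sketch below uses HMAC-SHA256 from the Python standard library; the `src_` prefix and 16-character truncation are arbitrary choices for illustration. Pseudonymization alone does not protect a source whose post text or media is itself identifying, and the key must be stored separately from the archive.

```python
import hashlib
import hmac


def pseudonymize(handle: str, secret_key: bytes) -> str:
    """Map a handle to a stable pseudonym that cannot be reversed without the key."""
    # Normalize so "@Name", "name " and "NAME" map to the same pseudonym.
    normalized = handle.strip().lstrip("@").lower().encode("utf-8")
    digest = hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()
    return f"src_{digest[:16]}"
```

A keyed hash is used rather than a plain hash because handles are guessable: without the key, an adversary holding a list of candidate handles cannot simply hash them and match.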