Data Model Reference
This document covers the core Python types in src/models/__init__.py, the
EventData TypedDict that extractors use as their output contract, and the
type-safety guarantees enforced at pre-commit time.
For how events reach Cloudflare D1 see D1_DEPLOYMENT.md. For how extractors populate these fields see EXTRACTORS.md.
Table of Contents
- VegasEvent
- Event Identity: composite_key vs external_id
- Supporting Models
- EventData TypedDict
- Type Safety Guarantees
- Data Flow
Entity Relationships
erDiagram
    VegasEvent ||--o| TablePricing : "has pricing"
    VegasEvent ||--o{ ImageMetadata : "has images"
    VegasEvent ||--o| StreamingLinks : "has links"
    VegasEvent ||--o| ArtistStats : "has stats"
    VegasEvent ||--o| EnrichmentStatus : "tracks enrichment"
    VegasEvent ||--o| SocialLinks : "has socials"
    TablePricing ||--|{ TablePricingTier : "has tiers"
    VegasEvent {
        string event_date
        string performer
        string venue
        string url
        string composite_key
        string venue_id
        string external_id
    }
    TablePricing {
        list tiers
    }
    TablePricingTier {
        string name
        float min_spend
        float pay_now
        int guests
        bool available
    }
    ImageMetadata {
        string source_url
        string category
        dict sizes
        string status
    }
    EnrichmentStatus {
        bool spotify
        bool tracklists
        bool resident_advisor
        dict errors
    }
VegasEvent
File: src/models/__init__.py — class VegasEvent(BaseModel)
The canonical representation of a single nightlife event. All extractors produce
a VegasEvent; all exporters (D1, SQLite, CSV) consume one.
Fields
| Field | Type | Required | Description |
|---|---|---|---|
| event_date | str | ✅ | YYYY-MM-DD; validated by field_validator |
| performer | str | ✅ | Main headliner name (display form, e.g. "DOM DOLLA") |
| venue | str | ✅ | Venue name (e.g. "LIV Las Vegas") |
| url | str | ✅ | Source URL the event was scraped from |
| scraped_at | str | ✅ | ISO-8601 timestamp with timezone |
| title | str \| None | | Page title, e.g. "DOM DOLLA at LIV Las Vegas" |
| event_time | str \| None | | HH:MM local time |
| event_datetime | str \| None | | ISO-8601 datetime if available |
| venue_address | str \| None | | Street address |
| age_requirement | str | | Defaults to "21+ only" |
| description | str \| None | | Full event description |
| description_short | str \| None | | Short teaser copy |
| artist_bio | str \| None | | Artist biography (set by enrichment, not scraping) |
| artist_image_url | str \| None | | R2 _main preset URL after upload; None until then |
| artist_image_url_full | str \| None | | R2 _hd preset URL after upload; None until then |
| streaming_links | StreamingLinks \| None | | Spotify / Apple Music / YouTube / SoundCloud |
| social_links | SocialLinks \| None | | Instagram / TikTok / Beatport etc. |
| artist_stats | ArtistStats \| None | | Spotify stats (set by enrichment) |
| enrichment_status | EnrichmentStatus \| None | | Tracks which enrichment passes have run |
| performers | list[str] \| None | | All performers including headliner + supporting acts |
| venue_description | str \| None | | Venue blurb |
| venue_id | str \| None | | Venue code for the table-pricing API (e.g. "VEN1121561") |
| table_pricing | TablePricing \| None | | VIP table sections and min-spend tiers |
| ticket_url | str \| None | | Direct ticket purchase link |
| external_id | str \| None | | Scraper-assigned ID (e.g. EVE111500020260531); not the D1 PK |
| images | list[Any] | | ImageMetadata objects (populated by image plugins) |
| history | EventHistory \| None | | Field-level change log; excluded from serialization |
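The table says event_date is validated by a field_validator, but the validator body is not shown here. As a minimal sketch of what such a check might look like, the standalone function below enforces the YYYY-MM-DD shape and rejects impossible calendar dates. The name validate_event_date and the regex are assumptions for illustration, not the project's code.

```python
import re
from datetime import date

# Hypothetical helper; the real check lives in a field_validator on VegasEvent.
_DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")


def validate_event_date(value: str) -> str:
    """Return value unchanged if it is a real calendar date in YYYY-MM-DD form."""
    if not _DATE_RE.fullmatch(value):
        raise ValueError(f"event_date must be YYYY-MM-DD, got {value!r}")
    # fromisoformat raises ValueError for impossible dates such as 2026-02-30.
    date.fromisoformat(value)
    return value
```

The shape check comes first because date parsers are often more lenient than the YYYY-MM-DD contract; for example, strptime with "%m" also accepts single-digit months.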
Key computed property
@property
def composite_key(self) -> str:
    perf = self.performer.lower().replace(" ", "-")
    ven = self.venue.lower().replace(" ", "-")
    return f"{self.event_date}-{perf}-{ven}"
# → "2026-05-02-dom-dolla-liv-las-vegas"
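To make the property easy to try in isolation, here is a small runnable sketch. EventStub is a hypothetical dataclass stand-in for VegasEvent that carries only the three fields the key reads; the key logic itself is copied from the property above.

```python
from dataclasses import dataclass


@dataclass
class EventStub:
    """Stand-in for VegasEvent with only the fields composite_key reads."""

    event_date: str
    performer: str
    venue: str

    @property
    def composite_key(self) -> str:
        perf = self.performer.lower().replace(" ", "-")
        ven = self.venue.lower().replace(" ", "-")
        return f"{self.event_date}-{perf}-{ven}"


first_scrape = EventStub("2026-05-02", "DOM DOLLA", "LIV Las Vegas")
second_scrape = EventStub("2026-05-02", "Dom Dolla", "LIV Las Vegas")

print(first_scrape.composite_key)  # 2026-05-02-dom-dolla-liv-las-vegas
print(first_scrape.composite_key == second_scrape.composite_key)  # True
```

As written, the property only lowercases and replaces spaces, so differences in letter case collapse to the same key, while punctuation in a performer or venue name is carried into the key unchanged.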
Event Identity: composite_key vs external_id
This distinction is the most common source of confusion. Read this section carefully before touching any exporter or migration.
external_id: unstable, informational
VegasEvent.external_id stores whatever ID the scraper found on the page, typically
the EVE... segment of a Wynn Social or LIV URL:
https://www.wynnsocial.com/event/EVE111500020260531/dom-dolla/
                                 ^^^^^^^^^^^^^^^^^^
external_id = "EVE111500020260531"
This value is not stable across scrapes: the same real-world event may carry a
different EVE... string each time it is re-scraped. It is stored for debugging
and is written to the D1 events table as a non-PK column, but it is never used
as a join key.
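For illustration, pulling the EVE... segment out of a URL like the one above could look roughly as follows. This is a sketch only: extract_external_id and its regex are hypothetical, and the pattern (an "EVE" prefix followed by digits, directly after /event/) is inferred from the single example URL rather than from the scraper's code.

```python
import re
from typing import Optional

# Assumed pattern: "EVE" followed by digits, as in the example URL above.
_EXTERNAL_ID_RE = re.compile(r"/event/(EVE\d+)(?:/|$)")


def extract_external_id(url: str) -> Optional[str]:
    """Return the EVE... path segment, or None when the URL has none."""
    match = _EXTERNAL_ID_RE.search(url)
    return match.group(1) if match else None


url = "https://www.wynnsocial.com/event/EVE111500020260531/dom-dolla/"
print(extract_external_id(url))  # EVE111500020260531
```

Returning None rather than raising fits the field's `str | None` type: pages without a scraper-side ID still produce a valid event.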
composite_key: stable, the D1 primary key
composite_key = f"{event_date}-{performer_slug}-{venue_slug}"
# "2026-05-02-dom-dolla-liv-las-vegas"
This is derived entirely from human-meaningful fields that don't change between
scrapes of the same event. It is used as event_id (the PK) in every D1 table,
so all child rows (event_artists, table_tiers, images) join on it.
The D1 exporter enforces this with an assertion:
primary_event_id = event.composite_key
assert primary_event_id, f"composite_key is empty for event: {event.model_dump()}"
Why not just use external_id as the PK?
| Concern | composite_key | external_id |
|---|---|---|
| Stable across re-scrapes | ✅ Yes | ❌ No |
| Human-readable in dashboard | ✅ Yes | ⚠️ Opaque |
| Works without scraper-side IDs | ✅ Yes | ❌ Breaks on HTML-only pages |
| Derivable from site data alone | ✅ Yes | ❌ Only on JS-heavy pages |
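The stability argument can be demonstrated with an in-memory SQLite database. The sketch below assumes a heavily simplified events table (the real D1 schema has more columns) and an upsert statement chosen for illustration; it is not taken from the D1 exporter. The second external_id value is made up to simulate a re-scrape.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed minimal schema; only event_id as PK and external_id as a plain column.
conn.execute(
    "CREATE TABLE events (event_id TEXT PRIMARY KEY, external_id TEXT, performer TEXT)"
)

UPSERT = """
INSERT INTO events (event_id, external_id, performer)
VALUES (?, ?, ?)
ON CONFLICT(event_id) DO UPDATE SET
    external_id = excluded.external_id,
    performer = excluded.performer
"""

key = "2026-05-02-dom-dolla-liv-las-vegas"
conn.execute(UPSERT, (key, "EVE111500020260531", "DOM DOLLA"))
# Re-scrape of the same event with a different (hypothetical) external_id.
conn.execute(UPSERT, (key, "EVE111500020260999", "DOM DOLLA"))

rows = conn.execute("SELECT event_id, external_id FROM events").fetchall()
print(rows)  # [('2026-05-02-dom-dolla-liv-las-vegas', 'EVE111500020260999')]
```

Because the primary key is the composite key, the re-scrape updates the existing row. Had external_id been the key, the same event would have produced two rows and orphaned any child rows joined to the first.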
Supporting Models
StreamingLinks
class StreamingLinks(BaseModel):
    spotify: str | None = None
    apple_music: str | None = None
    youtube: str | None = None
    soundcloud: str | None = None
Stored as streaming_links_json TEXT in D1 (JSON-encoded). Populated by the LIV
extractor (from description HTML) and by the Spotify enrichment plugin.
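As a rough illustration of the JSON-encoded TEXT column, the sketch below round-trips a links object through json. StreamingLinksStub is a dataclass stand-in for the pydantic model, and both helper functions are hypothetical. Mapping a missing links object to None (SQL NULL) and keeping unset services as null inside the JSON are assumptions, not documented exporter behaviour.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class StreamingLinksStub:
    """Dataclass stand-in for the StreamingLinks model."""

    spotify: Optional[str] = None
    apple_music: Optional[str] = None
    youtube: Optional[str] = None
    soundcloud: Optional[str] = None


def to_streaming_links_json(links: Optional[StreamingLinksStub]) -> Optional[str]:
    """Encode links for a TEXT column; a missing object stays None."""
    return None if links is None else json.dumps(asdict(links))


def from_streaming_links_json(raw: Optional[str]) -> Optional[StreamingLinksStub]:
    """Decode the TEXT column back into a links object."""
    return None if raw is None else StreamingLinksStub(**json.loads(raw))


links = StreamingLinksStub(spotify="https://open.spotify.com/artist/example")
encoded = to_streaming_links_json(links)
print(from_streaming_links_json(encoded) == links)  # True
```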
TablePricing / TablePricingTier
class TablePricingTier(BaseModel):
    name: str                # "Dance Floor", "VIP Booth", etc.
    min_spend: float | None
    pay_now: float | None    # deposit amount
    guests: int | None       # table capacity
    available: bool = True

class TablePricing(BaseModel):
    tiers: list[TablePricingTier]
Tiers are fetched from the urvenue API (wp-admin/admin-ajax.php?action=uvpx&uvaction=uwspx_map)
using event.venue_id. See TABLE_PRICING.md.
ArtistStats / SocialLinks
Populated by enrichment plugins (Spotify, Resident Advisor), not by venue extractors. See PLUGIN_DEVELOPMENT.md.
EventData TypedDict
File: src/models/__init__.py — class EventData(_EventDataRequired, total=False)
EventData is the typed output contract for every venue extractor. Instead of
building an untyped dict and passing it to VegasEvent(**dict) with a
# type: ignore, extractors annotate their local dict as EventData and call the
factory:
# In any extractor:
event_data: EventData = {
    "url": url,
    "scraped_at": scraped_at,
    "performer": performer,
    "venue": venue,
    "event_date": event_date,
}
# ... conditionally add optional fields ...
if ticket_url:
    event_data["ticket_url"] = ticket_url
return VegasEvent.from_extractor_data(event_data)
Required keys (must always be present)
Defined on _EventDataRequired(TypedDict):
| Key | Type |
|---|---|
| url | str |
| scraped_at | str |
| performer | str |
| venue | str |
| event_date | str |
Optional keys (total=False)
Defined on EventData(_EventDataRequired, total=False):
| Key | Type |
|---|---|
| title | str |
| event_time | str |
| event_datetime | str |
| venue_address | str |
| description | str |
| description_short | str |
| artist_image_url | str |
| artist_image_url_full | str |
| streaming_links | StreamingLinks |
| ticket_url | str |
| age_requirement | str |
| performers | list[str] |
| images | list[Any] |
| external_id | str |
| venue_id | str |
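The two-class layout described above can be sketched with a subset of the keys. The class names mirror the ones in this section, but the body is abbreviated. The missing_required_keys helper is hypothetical and is included only to show that a TypedDict is a plain dict at runtime, so the required/optional split is enforced by the type checker rather than by Python itself.

```python
from typing import TypedDict


class _EventDataRequired(TypedDict):
    url: str
    scraped_at: str
    performer: str
    venue: str
    event_date: str


class EventData(_EventDataRequired, total=False):
    # Abbreviated: only three of the optional keys listed above.
    title: str
    ticket_url: str
    performers: list[str]


def missing_required_keys(data: dict) -> set[str]:
    """Hypothetical runtime check: required keys that are absent from data."""
    return set(EventData.__required_keys__) - set(data)


print(sorted(EventData.__required_keys__))
# ['event_date', 'performer', 'scraped_at', 'url', 'venue']
print(sorted(missing_required_keys({"url": "https://example.com", "venue": "LIV Las Vegas"})))
# ['event_date', 'performer', 'scraped_at']
```

Splitting the required keys into a base class with the default total=True and adding optional keys in a total=False subclass keeps one dict type while still letting the checker distinguish the two groups.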
from_extractor_data factory
VegasEvent.from_extractor_data is the single TypedDict → BaseModel boundary; the
conversion is not scattered across every extractor. Because ty can fully
validate the TypedDict at each call site, no # type: ignore is needed anywhere.
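The factory body is not reproduced in this document. A minimal sketch of the pattern, assuming the factory simply unpacks the dict into the model, is shown below. VegasEventStub is a dataclass stand-in for the pydantic model, EventData is cut down to one optional key, and the scraped_at value is made up.

```python
from dataclasses import dataclass
from typing import Optional, TypedDict


class _EventDataRequired(TypedDict):
    url: str
    scraped_at: str
    performer: str
    venue: str
    event_date: str


class EventData(_EventDataRequired, total=False):
    ticket_url: str  # one optional key, for brevity


@dataclass
class VegasEventStub:
    """Dataclass stand-in for the pydantic VegasEvent model."""

    url: str
    scraped_at: str
    performer: str
    venue: str
    event_date: str
    ticket_url: Optional[str] = None

    @classmethod
    def from_extractor_data(cls, data: EventData) -> "VegasEventStub":
        # Assumed body: the only place a TypedDict is unpacked into the model.
        return cls(**data)


event_data: EventData = {
    "url": "https://www.wynnsocial.com/event/EVE111500020260531/dom-dolla/",
    "scraped_at": "2026-03-04T12:00:00+00:00",
    "performer": "DOM DOLLA",
    "venue": "LIV Las Vegas",
    "event_date": "2026-05-02",
}
event = VegasEventStub.from_extractor_data(event_data)
print(event.performer, event.ticket_url)  # DOM DOLLA None
```

Keeping the unpacking in one classmethod means extractors only ever build a checked dict, and any change to how the dict maps onto the model happens in a single place.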
Type Safety Guarantees
Vinny uses ty (Astral's type checker) via pre-commit, so type errors are caught
before any code reaches main.
What ty catches at extractor call sites
| Mistake | ty error |
|---|---|
| Misspelled field: "perfromer" | unknown-key on EventData |
| Old field name: "event_id" (renamed to external_id) | unknown-key on EventData |
| Wrong value type: event_data["performers"] = "DOM DOLLA" | invalid-assignment; expected list[str] |
| Missing required key: omitting "event_date" | missing-key at VegasEvent.from_extractor_data(event_data) |
| Passing str \| None for a required str field | invalid-argument-type |
Pre-commit hook order
All four must pass for a commit to succeed. ty runs on the full src/ tree, not
just changed files — so a rename in models/__init__.py will immediately surface
stale references in all extractors.
Adding a new field to VegasEvent
- Add the field to VegasEvent (with a default of None or a Field(default_factory=...) so existing events don't break)
- If extractors need to supply it, add the key to EventData (in _EventDataRequired if always required, or in EventData with total=False if optional)
- Run just check; ty will surface any missed call sites
Data Flow
flowchart TD
    A["Venue Page\n(HTML / JSON-LD)"] --> B["VenueExtractor.extract()\nbuilds EventData dict"]
    B --> C["VegasEvent.from_extractor_data()\nsingle typed boundary"]
    C --> D["VegasEvent"]
    E["Enrichment Plugins\nartist_stats, streaming_links,\nsocial_links, artist_bio"] --> D
    D --> F["MasterDatabase.add_or_update_event()\nmerge + track field history"]
    F --> G["D1Exporter"]
    F --> H["SQLiteExporter"]
    G --> I["Cloudflare D1\n(remote production)"]
    H --> J["Local events.db\n(dev / archive)"]
    style A fill:#7c3aed,color:#fff
    style D fill:#059669,color:#fff
    style I fill:#d97706,color:#fff
    style J fill:#d97706,color:#fff
Key points:
- Extractors never touch VegasEvent directly — only through EventData + factory
- Enrichment plugins receive a VegasEvent and return a new one (immutable pattern)
- Exporters are read-only consumers — they never modify events
- composite_key is computed on-the-fly from VegasEvent fields; it is never
stored in the Python object, only materialised when writing to D1/SQLite
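The "receive a VegasEvent and return a new one" point can be illustrated with a frozen dataclass. This is a sketch under assumptions: EventStub and enrich_with_bio are hypothetical names, and the bio text is made up. With a pydantic v2 model, model_copy(update=...) would play a similar role, though whether the plugins use it is not stated here.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class EventStub:
    """Frozen stand-in for VegasEvent; fields cannot be reassigned."""

    performer: str
    artist_bio: Optional[str] = None


def enrich_with_bio(event: EventStub, bio: str) -> EventStub:
    """Hypothetical plugin step: return an updated copy, leave the input untouched."""
    return replace(event, artist_bio=bio)


scraped = EventStub(performer="DOM DOLLA")
enriched = enrich_with_bio(scraped, "Australian house producer.")

print(scraped.artist_bio)   # None
print(enriched.artist_bio)  # Australian house producer.
```

Returning a copy keeps the scraped event available unchanged, which makes it straightforward for a later merge step to compare old and new field values.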
Last updated: 2026-03-04