Data Structures

The statistical foundation is built on specific data structures that capture community opinion as raw distributions rather than pre-computed summaries.

Player Count Ratings

The player count rating is the most important statistical data structure in the specification. For each game and each supported player count, it records a numeric community rating on a 1-5 scale – not categorical buckets, but real numbers that support standard statistical analysis. See Player Count Model for the full design rationale.

Schema

Field	Type	Description
`game_id`	UUIDv7	The game being evaluated
`player_count`	integer	The specific player count (1, 2, 3, …)
`average_rating`	float (1.0-5.0)	Mean community rating at this player count
`rating_count`	integer	Number of votes at this player count
`rating_stddev`	float	Standard deviation (consensus vs polarization)

Example: Lost Ruins of Arnak

Player Count	Avg Rating	Votes	Std Dev	Signal
1	3.4 / 5	876	1.0	Decent solo – AI opponent works but lacks tension
2	4.5 / 5	1,234	0.6	Strong consensus – the sweet spot
3	4.2 / 5	1,089	0.7	Great, slightly more downtime than 2
4	3.1 / 5	745	1.1	Acceptable but downtime becomes noticeable

From this raw data, a consumer can derive:

Highest rated at 2 (4.5/5 with tight consensus at std dev 0.6).
Well-rated at 2-3: Both above 4.0, the “highly rated” threshold.
4 is acceptable but divisive (3.1/5 with high std dev) – the added downtime between turns divides opinion.
Solo is middling (3.4/5) – the AI opponent is functional but lacks the competitive tension of multiplayer.

Different applications set different thresholds. A hardcore strategy app might set “recommended” at 4.0+; a family app might set it at 3.0+. The raw numeric data supports any interpretation – no fixed categories constrain the analysis.
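As a minimal sketch of how a consumer might apply its own threshold, the snippet below filters the Lost Ruins of Arnak ratings shown above. The function name `recommended_counts` and its `min_votes` guard are illustrative assumptions, not part of the specification.

```python
# Ratings copied from the Lost Ruins of Arnak example.
ratings = [
    {"player_count": 1, "average_rating": 3.4, "rating_count": 876, "rating_stddev": 1.0},
    {"player_count": 2, "average_rating": 4.5, "rating_count": 1234, "rating_stddev": 0.6},
    {"player_count": 3, "average_rating": 4.2, "rating_count": 1089, "rating_stddev": 0.7},
    {"player_count": 4, "average_rating": 3.1, "rating_count": 745, "rating_stddev": 1.1},
]


def recommended_counts(ratings, threshold, min_votes=0):
    """Return player counts whose mean rating meets the app's threshold.

    `threshold` and `min_votes` are application choices (hypothetical
    parameters), not values defined by the specification.
    """
    return [
        r["player_count"]
        for r in ratings
        if r["average_rating"] >= threshold and r["rating_count"] >= min_votes
    ]


strategy_app = recommended_counts(ratings, threshold=4.0)
family_app = recommended_counts(ratings, threshold=3.0)
print(strategy_app)  # [2, 3]
print(family_app)    # [1, 2, 3, 4]
```

The same raw rows serve both applications; only the threshold argument changes.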

Accessing Rating Data

GET /games/lost-ruins-of-arnak/player-count-ratings

{
  "game_id": "01967b3c-5a00-7000-8000-000000000095",
  "ratings": [
    { "player_count": 1, "average_rating": 3.4, "rating_count": 876, "rating_stddev": 1.0 },
    { "player_count": 2, "average_rating": 4.5, "rating_count": 1234, "rating_stddev": 0.6 },
    { "player_count": 3, "average_rating": 4.2, "rating_count": 1089, "rating_stddev": 0.7 },
    { "player_count": 4, "average_rating": 3.1, "rating_count": 745, "rating_stddev": 1.1 }
  ]
}

BGG Legacy Data

For data migrated from BoardGameGeek, the PlayerCountPollLegacy schema preserves BGG’s three-tier voting model (Best / Recommended / Not Recommended). This data is available via the API but is not the native model – it is maintained for backward compatibility and transparency during migration. Legacy three-tier data can be converted to approximate numeric ratings for unified querying. See Player Count Model: BGG Legacy Data.

Weight Votes

The weight field on a Game is an average. The weight vote distribution provides the underlying data.

Schema

Field	Type	Description
`game_id`	UUIDv7	The game being evaluated
`votes`	object	Map of weight value (string) to vote count
`total_votes`	integer	Sum of all votes
`average`	float	Computed average (same as Game.weight)

Example: Great Western Trail

Great Western Trail’s weight is concentrated in the 3.5-4.0 range, reflecting strong agreement that the game sits firmly in the “heavy-medium” band. The tight cluster suggests voters – despite varying experience levels – perceive the game’s complexity similarly. The small tail of 1.0-2.0 votes may reflect voters who found the cattle-market loop more intuitive than the weight suggests.

A bimodal distribution (many 2.0 votes and many 4.0 votes) would suggest the game’s complexity is debated, which is useful information that an average hides.

Dimensional Weight Data

Implementations that support the detailed weight mode store per-dimension vote distributions – rules complexity, strategic depth, decision density, cognitive load, fiddliness, and game length – each rated independently on a 1-5 scale. These per-dimension distributions are exportable alongside the composite weight distribution, enabling analyses like “which games have high strategic depth but low fiddliness?” that a single composite number cannot answer.

Accessing Weight Distribution

GET /games/great-western-trail/weight-votes

{
  "game_id": "01967b3c-5a00-7000-8000-000000000090",
  "votes": {
    "1.0": 5,
    "1.5": 3,
    "2.0": 18,
    "2.5": 67,
    "3.0": 312,
    "3.5": 1489,
    "4.0": 2134,
    "4.5": 876,
    "5.0": 142
  },
  "total_votes": 5046,
  "average": 3.72
}
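One way a consumer might quantify the concentration described for Great Western Trail is to compute the share of votes inside a weight band. This is a sketch only; `share_in_range` is a hypothetical helper, and the band boundaries are chosen to match the prose above.

```python
# Vote map copied from the Great Western Trail response.
votes = {
    "1.0": 5, "1.5": 3, "2.0": 18, "2.5": 67, "3.0": 312,
    "3.5": 1489, "4.0": 2134, "4.5": 876, "5.0": 142,
}
total_votes = 5046


def share_in_range(votes, low, high):
    """Fraction of votes whose weight value lies in [low, high]."""
    total = sum(votes.values())
    # Keys are strings in the schema, so convert before comparing.
    in_range = sum(n for w, n in votes.items() if low <= float(w) <= high)
    return in_range / total


# Sanity check: the histogram should add up to the reported total.
assert sum(votes.values()) == total_votes

core_share = share_in_range(votes, 3.5, 4.0)
tail_votes = sum(n for w, n in votes.items() if float(w) <= 2.0)
print(f"{core_share:.1%} of votes fall in the 3.5-4.0 band")  # 71.8%
print(f"{tail_votes} votes at 2.0 or below")                  # 26
```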

Rating Distribution

The average_rating on a Game is a single number that hides the distribution shape. The rating distribution exposes the full histogram of voter opinion, a confidence score, and standard deviation. See Rating Model for the four-layer rating architecture.

Schema

Field	Type	Description
`game_id`	UUIDv7	The game being evaluated
`average_rating`	float (1-10)	Arithmetic mean of normalized ratings
`rating_count`	integer	Total number of ratings
`rating_distribution`	integer[10]	Histogram: vote count at each 1-10 bucket
`rating_stddev`	float	Standard deviation of the distribution
`confidence`	float (0.0-1.0)	Spec-defined confidence score

Example: Dune: Imperium

The distribution reveals what the average hides:

Bell curve centered at 8-9 – strong consensus that this is a top-tier game.
Low std dev (1.38) – voters agree. Compare to a brigaded game where std dev exceeds 3.5.
High confidence (0.86) – large sample, tight consensus. This number is trustworthy.

A bimodal distribution (peaks at 3 and 9) would indicate a polarizing game – some love it, some hate it. Such a game and one with a single peak at 6 could both average 6.0, but the distribution shapes tell completely different stories.
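To make the point about identical averages concrete, here is a small sketch that computes the mean and standard deviation from a 10-bucket histogram. Both histograms are invented for illustration, and the use of the population standard deviation is an assumption; the specification may define `rating_stddev` differently.

```python
import math


def mean_and_stddev(histogram):
    """Mean and population std dev of a 10-bucket histogram (index 0 = rating 1)."""
    total = sum(histogram)
    mean = sum((i + 1) * n for i, n in enumerate(histogram)) / total
    variance = sum(n * ((i + 1) - mean) ** 2 for i, n in enumerate(histogram)) / total
    return mean, math.sqrt(variance)


# Hypothetical histograms, invented for illustration only.
polarizing = [0, 0, 500, 0, 0, 0, 0, 0, 500, 0]   # peaks at 3 and 9
consensus = [0, 0, 0, 0, 250, 500, 250, 0, 0, 0]  # single peak at 6

polarizing_stats = mean_and_stddev(polarizing)
consensus_stats = mean_and_stddev(consensus)
print(polarizing_stats)  # (6.0, 3.0)
print(consensus_stats)   # mean 6.0, std dev about 0.71
```

Both histograms average 6.0, yet the standard deviation separates them immediately.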

The confidence score (0.0-1.0) synthesizes sample size, distribution shape, and deviation from the global mean into a single trust signal. See Rating Model: Confidence Score for the formula, and the pre-release brigading case study for a real-world example where confidence correctly flags a meaningless rating.

Accessing Rating Distribution

GET /games/dune-imperium/rating-distribution

{
  "game_id": "01967b3c-5a00-7000-8000-000000000091",
  "average_rating": 8.32,
  "rating_count": 42876,
  "rating_distribution": [98, 112, 245, 502, 1234, 3456, 8912, 14567, 10234, 3516],
  "rating_stddev": 1.38,
  "confidence": 0.86
}

Community Age Poll

The community age poll captures voter recommendations for the minimum appropriate age for a game. Unlike player count ratings (which use a numeric scale), age polls are simple: voters pick the minimum age they would suggest. The distribution reveals how the community’s view compares to the publisher’s box rating. See Age Recommendation Model.

Schema

Field	Type	Description
`game_id`	UUIDv7	The game being evaluated
`suggested_age`	integer	The minimum age voters selected
`vote_count`	integer	Number of voters who selected this age

The Game entity includes a derived field:

Field	Type	Description
`community_suggested_age`	integer (nullable)	Median of all age votes

Example: Viticulture

The publisher rates Viticulture at 13+. The community sees it differently:

Suggested Age	Votes
8	34
10	189
12	312
14	87
16	11

The community suggested age is 12 – one year lower than the publisher’s box rating. Despite the wine theme, the gameplay is abstract enough (place workers, collect resources, fill orders) that voters consider the mechanics accessible to a 12-year-old. The publisher’s conservative 13+ likely reflects the thematic subject matter rather than mechanical complexity. The gap between “thematically appropriate” and “mechanically capable” is exactly the kind of nuance the community poll captures.

Accessing Age Poll Data

GET /games/viticulture/age-poll

{
  "game_id": "01967b3c-5a00-7000-8000-000000000092",
  "community_suggested_age": 12,
  "votes": [
    { "suggested_age": 8, "vote_count": 34 },
    { "suggested_age": 10, "vote_count": 189 },
    { "suggested_age": 12, "vote_count": 312 },
    { "suggested_age": 14, "vote_count": 87 },
    { "suggested_age": 16, "vote_count": 11 }
  ],
  "total_votes": 633
}
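The derived `community_suggested_age` can be reproduced from the vote rows with a weighted median. The sketch below assumes the lower of the two middle votes is used when the total is even; the specification may break that tie differently.

```python
# Vote rows copied from the Viticulture response.
votes = [
    {"suggested_age": 8, "vote_count": 34},
    {"suggested_age": 10, "vote_count": 189},
    {"suggested_age": 12, "vote_count": 312},
    {"suggested_age": 14, "vote_count": 87},
    {"suggested_age": 16, "vote_count": 11},
]


def median_age(votes):
    """Median suggested age from (age, count) rows.

    For an even number of votes this returns the lower of the two middle
    values; that tie-breaking rule is an assumption of this sketch.
    """
    total = sum(v["vote_count"] for v in votes)
    if total == 0:
        return None  # mirrors the nullable community_suggested_age field
    middle = (total + 1) // 2  # 1-based rank of the (lower) median vote
    running = 0
    for row in sorted(votes, key=lambda v: v["suggested_age"]):
        running += row["vote_count"]
        if running >= middle:
            return row["suggested_age"]


community_suggested_age = median_age(votes)
print(community_suggested_age)  # 12
```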

Community Play Time Data

Community-reported play times are derived from individual play logs. The statistical foundation exposes aggregate data.

Schema

Field	Type	Description
`game_id`	UUIDv7	The game
`total_plays`	integer	Number of plays with reported duration
`min_reported`	integer	Minimum reported play time (minutes)
`max_reported`	integer	Maximum reported play time (minutes)
`median`	integer	Median play time (minutes)
`p10`	integer	10th percentile
`p25`	integer	25th percentile
`p75`	integer	75th percentile
`p90`	integer	90th percentile
`by_player_count`	object	Median play time broken down by player count

Example: Concordia

This data shows what the publisher’s single estimate cannot capture:

The publisher says 100 minutes. The community median is 105 – unusually close for a strategy game.
2-player games are fast (70 min median) – Concordia scales well downward.
5-player games take more than twice as long as 2-player games (155 vs 70 min, roughly 2.2x) – play time scales steeply with player count.
The 90th percentile is 160 minutes – one play in ten runs 2 hours 40 minutes or longer.
The per-player-count breakdown reveals that player count is the dominant factor in play time.

Accessing Play Time Data

GET /games/concordia/community-playtime

{
  "game_id": "01967b3c-5a00-7000-8000-000000000093",
  "total_plays": 6234,
  "min_reported": 40,
  "max_reported": 240,
  "median": 105,
  "p10": 65,
  "p25": 80,
  "p75": 130,
  "p90": 160,
  "by_player_count": {
    "2": { "median": 70, "plays": 2845 },
    "3": { "median": 100, "plays": 1987 },
    "4": { "median": 125, "plays": 1134 },
    "5": { "median": 155, "plays": 268 }
  }
}
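As a rough sketch of how the Concordia response might be analyzed, the snippet below expresses each per-player-count median relative to the 2-player median and computes the interquartile range. The helper name `scaling_vs_baseline` and the choice of the 2-player game as baseline are assumptions made for illustration.

```python
# Values copied from the Concordia response.
by_player_count = {
    "2": {"median": 70, "plays": 2845},
    "3": {"median": 100, "plays": 1987},
    "4": {"median": 125, "plays": 1134},
    "5": {"median": 155, "plays": 268},
}
p25, p75 = 80, 130


def scaling_vs_baseline(by_player_count, baseline="2"):
    """Ratio of each player count's median play time to the baseline median."""
    base = by_player_count[baseline]["median"]
    return {
        int(count): round(entry["median"] / base, 2)
        for count, entry in by_player_count.items()
    }


scaling = scaling_vs_baseline(by_player_count)
interquartile_range = p75 - p25  # spread of the middle half of plays, in minutes
print(scaling)              # {2: 1.0, 3: 1.43, 4: 1.79, 5: 2.21}
print(interquartile_range)  # 50
```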

Experience Playtime Poll

The experience playtime poll captures community-reported play times bucketed by player experience level. Like PlayerCountRating, it stores raw distributions rather than pre-computed summaries. See ADR-0034.

Schema

Field	Type	Description
`game_id`	UUIDv7	The game being evaluated
`experience_level`	string	`first_play`, `learning`, `experienced`, or `expert`
`median_minutes`	integer	Median reported play time for this level
`min_minutes`	integer	10th percentile play time
`max_minutes`	integer	90th percentile play time
`total_reports`	integer	Number of contributing play reports

Example: Gloomhaven: Jaws of the Lion

Level	Median	p10	p90	Reports
first_play	120 min	90 min	180 min	456
learning	90 min	65 min	130 min	712
experienced	70 min	50 min	100 min	1,534
expert	55 min	40 min	80 min	389

From this data, multipliers are derived by dividing each level's median by the experienced median: first_play = 120/70 = 1.71, expert = 55/70 = 0.79. This tells consumers that a first scenario of Gloomhaven: Jaws of the Lion takes 71% longer than an experienced play – the tutorial scenarios help, but the card combo system and enemy AI rules create a steep initial learning curve. By expert level, optimized play and familiar monster patterns cut session time by about 21%.
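The derivation can be sketched in a few lines. Using `experienced` as the baseline follows the arithmetic above; rounding to two decimals and the function name `experience_multipliers` are assumptions of this sketch.

```python
# Median minutes per level, copied from the Jaws of the Lion table.
medians = {"first_play": 120, "learning": 90, "experienced": 70, "expert": 55}


def experience_multipliers(medians, baseline="experienced"):
    """Each level's median play time divided by the baseline level's median."""
    base = medians[baseline]
    return {level: round(minutes / base, 2) for level, minutes in medians.items()}


multipliers = experience_multipliers(medians)
print(multipliers)
# {'first_play': 1.71, 'learning': 1.29, 'experienced': 1.0, 'expert': 0.79}
```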

Accessing Experience Playtime Data

GET /games/gloomhaven-jaws-of-the-lion/experience-playtime

{
  "game_id": "01967b3c-5a00-7000-8000-000000000094",
  "levels": [
    { "experience_level": "first_play", "median_minutes": 120, "min_minutes": 90, "max_minutes": 180, "total_reports": 456 },
    { "experience_level": "learning", "median_minutes": 90, "min_minutes": 65, "max_minutes": 130, "total_reports": 712 },
    { "experience_level": "experienced", "median_minutes": 70, "min_minutes": 50, "max_minutes": 100, "total_reports": 1534 },
    { "experience_level": "expert", "median_minutes": 55, "min_minutes": 40, "max_minutes": 80, "total_reports": 389 }
  ],
  "multipliers": { "first_play": 1.71, "learning": 1.29, "experienced": 1.0, "expert": 0.79 },
  "sufficient_data": true,
  "total_reports": 3091
}

Analytical Questions Enabled

Which games have the steepest learning curve? Sort by first_play multiplier descending – games where the gap between first play and experienced play is largest.
Which games are “easy to learn”? A first_play multiplier close to 1.0 means play time barely changes with experience.
Expert speedrun potential: Which games have the lowest expert multiplier, suggesting the most room for optimization?
Data sufficiency: Which games have enough experience-tagged play logs to produce reliable multipliers?
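The first and last questions above might be combined as follows. This is a sketch under stated assumptions: only the Jaws of the Lion values come from this page, the other two games and their numbers are hypothetical placeholders, and the `slug` key is an illustrative field name.

```python
# Only the first entry uses values from this page; the other two games and
# their numbers are hypothetical placeholders.
games = [
    {"slug": "gloomhaven-jaws-of-the-lion",
     "multipliers": {"first_play": 1.71, "expert": 0.79},
     "sufficient_data": True},
    {"slug": "hypothetical-light-game",
     "multipliers": {"first_play": 1.10, "expert": 0.95},
     "sufficient_data": True},
    {"slug": "hypothetical-sparse-game",
     "multipliers": {"first_play": 2.40, "expert": 0.60},
     "sufficient_data": False},
]


def steepest_learning_curves(games):
    """Games with sufficient data, sorted by first_play multiplier descending."""
    reliable = [g for g in games if g["sufficient_data"]]
    return sorted(reliable, key=lambda g: g["multipliers"]["first_play"], reverse=True)


ranked = [g["slug"] for g in steepest_learning_curves(games)]
print(ranked)
# ['gloomhaven-jaws-of-the-lion', 'hypothetical-light-game']
```

Filtering on `sufficient_data` first keeps a sparsely logged game from topping the ranking on the strength of a handful of reports.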

Expansion Deltas as Analyzable Data

Property modifications and expansion combinations are not just internal data for effective mode – they are exportable entities. See Data Export for how to bulk-download this data.

Interesting analyses enabled by this data:

Average weight increase per expansion: Do expansions tend to make games more complex?
Player count expansion patterns: How often do expansions increase the max player count? By how much?
Playtime inflation: Do expansions make games longer? By what percentage?
Best-at shift: Does adding expansions change which player counts are considered best?

These questions are unanswerable without structured, exportable expansion delta data. With that data available through OpenTabletop, they become straightforward queries.
