diff --git a/AGENTS.md b/AGENTS.md index 5e8cc55..1182e67 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -1,45 +1,62 @@ -# AGENTS.md — Pub Quiz Dashboard +# AGENTS.md — The Hope Pub Quiz Dashboard ## Overview -Single-page Flask dashboard that reads `data.csv` (one row per quiz night) and renders statistics, a player table, and Plotly charts via a Jinja2 template. No database — the CSV is the sole data store. +Single-page Flask dashboard that reads `data.csv` (one row per quiz night) and renders summary statistics, a player cost table, and five Plotly charts via a Jinja2 template. No database — the CSV is the sole data store. ## Running the App ```bash -uv sync # install dependencies (uses uv.lock / pyproject.toml) -cd src && python app.py # start Flask dev server at http://127.0.0.1:5000 +uv sync # install deps from uv.lock / pyproject.toml +PYTHONPATH=src python src/app.py # run from project root — resolves data.csv correctly ``` -> **Critical:** `data.csv` is read with a bare relative path (`"data.csv"`), so the app **must** be launched from the `src/` directory, where Flask's CWD resolves to the project root via the relative path `../data.csv` — actually, `data.csv` sits at project root and `src/` is the CWD, so Flask resolves it as `../data.csv` would fail. The app reads `data.csv` directly, so run from `src/` only after confirming the path resolves (currently works because Flask's CWD is `src/` and `data.csv` is read as a sibling — verify if moving files). +> **Path gotcha:** `app.py` reads `data.csv` with a bare relative path, so the CWD must be the **project root**. `PYTHONPATH=src` puts `src/` on the import path so local modules resolve. ## Data Shape (`data.csv`) - **One row = one quiz night** -- Columns: `Date` (DD/MM/YYYY), `Absolute Position`, `Relative Position` (float 0–1, where 0=1st, 1=last), `Number of Players`, `Number of Teams`, `Points on Scattergories`, then one binary column per player (1=attended, 0=absent). 
-- Date parsing is always `dayfirst=True`; the DataFrame is sorted by date on load. +- Columns: `Date` (DD/MM/YYYY), `Absolute Position`, `Relative Position` (float 0–1, 0=1st/best, 1=last), `Number of Players`, `Number of Teams`, `Points on Scattergories`, then one binary column per player (1=attended, 0=absent). +- Date parsing uses `dayfirst=True`; the DataFrame is sorted ascending by date on every load. ## Module Responsibilities | File | Role | |---|---| -| `src/app.py` | Flask routes, all Plotly figure generation, data loading | -| `src/stats.py` | Summary statistics dict (streaks, averages) passed to template as `stats` | -| `src/player_table.py` | List-of-lists table `[header, ...rows..., footer]`; cost fixed at **£3/quiz** per player | -| `src/constants.py` | Single source of truth for player names, regression features, and colour scheme | -| `src/templates/index.html` | Renders `stats` dict, `player_table` list, and `plots` dict of JSON-serialised Plotly figures | +| `src/app.py` | Flask route, five Plotly chart builders, data loading | +| `src/stats.py` | `generate_stats(df)` → `(stats_dict, highlights_list)` tuple | +| `src/player_table.py` | `generate_player_table(df)` → flat list-of-lists; cost hard-coded at **£3/quiz** | +| `src/constants.py` | Player names, regression features, colour scheme, `ordinal(n)` helper | +| `src/templates/index.html` | Renders `highlights` list, `stats` dict, `player_table`, and `plots` dict | ## Key Conventions ### Adding/Removing a Player -1. Update `constants.PLAYER_NAME_COLUMNS` (ordered list — controls display order). -2. Update `constants.FEATURE_COLUMNS` (set — controls regression inputs). -3. Add the new column to `data.csv` with `0`/`1` values. +1. Update `constants.PLAYER_NAME_COLUMNS` (ordered list — controls display order everywhere). +2. Update `constants.FEATURE_COLUMNS` (set — controls which columns feed the regression model). +3. Add the new binary column to `data.csv`. 
+ ### `generate_stats` Return Value +Returns a **tuple** `(stats, highlights)`: +- `highlights` — list of `{"label": str, "value": str, "detail": str}` dicts; rendered as 6 KPI cards. +- `stats` — plain `dict` of human-readable `label: value` pairs; rendered as a secondary list. + +The `index()` route unpacks both: `stats, highlights = generate_stats(df)`. ### Plots Pipeline -Figures are built with Plotly in `app.py`, serialised to JSON with `plotly.utils.PlotlyJSONEncoder`, stored in a `plots` dict keyed by a snake_case name, and rendered client-side via `Plotly.newPlot("{{ key }}", figure.data, figure.layout)` in the template. The dict key becomes both the HTML `id` and the JS variable target — keep keys unique and valid as HTML ids. +1. Build a Plotly figure in `app.py` using `Relative Position` directly (or `Relative Position * 100` for percentile display), where **lower = better**. +2. Serialise: `json.dumps(fig, cls=plotly.utils.PlotlyJSONEncoder)`. +3. Store in the `plots` dict under a **snake_case key** (e.g. `"position_trend"`). +4. The template renders every entry automatically: `Plotly.newPlot("{{ key }}", ...)` — key is both the `<div id>` and JS target. + +Current charts (in render order): `position_trend`, `player_impact`, `scattergories_vs_position`, `player_participation`, `calendar`. ### `player_table` Structure -A flat list-of-lists: index `[0]` = header row, `[1:-1]` = data rows, `[-1]` = footer row. The template iterates `player_table[1:]` for `<tbody>`, so the footer is rendered as a regular row — style it in CSS if distinction is needed. +`[0]` = header row, `[1:-1]` = data rows sorted by appearances descending, `[-1]` = totals footer. The template uses `player_table[1:-1]` for `<tbody>` and `player_table[-1]` for `<tfoot>`. -### `Relative Position` Semantics -Lower is better (0 = first place). The trendline in `generate_relative_position_over_time` extrapolates towards `target_value = 0.08` (≈top 8%). Adjust this constant to change the goal projection. +### `Relative Position` Convention +Raw data stores `Relative Position` (0=best). The dashboard keeps this convention everywhere: lower values are better in stats, tables, and chart labels. If a chart uses percentile text, it is `Relative Position * 100` (still lower = better). + +### `ordinal(n)` Helper +Lives in `constants.py`. Returns e.g. `"1st"`, `"22nd"`, `"63rd"`. Import where needed: `from constants import ordinal`. + +### Player Impact Chart +Shows average relative percentile when each player attends. Only players with **>= 3 appearances** are shown (`MIN_APPEARANCES = 3` in `generate_player_impact`). Green bar = below overall average (better); red = above (worse). ## Frontend -No build step. Tailwind CSS and Plotly are loaded from CDN in `index.html`. All styling is inline Tailwind utility classes. - +No build step. Tailwind CSS (`cdn.tailwindcss.com`) and Plotly (`cdn.plot.ly/plotly-3.0.1.min.js`) loaded from CDN. Charts use `{ responsive: true, displayModeBar: false }` config. 
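Review note on the `player_table` structure documented in the AGENTS.md hunk above: the header/body/footer slicing can be sketched roughly as follows. This is a minimal sketch, not the repo's code. `build_table` and `COST_PER_QUIZ` are hypothetical names, the input is a plain name-to-count dict rather than the DataFrame that `generate_player_table` takes, the average-percentile column is left out, and the appearance counts are made up for illustration.

```python
COST_PER_QUIZ = 3  # pounds per player per quiz, per AGENTS.md (hypothetical constant name)


def build_table(appearances):
    """Hypothetical stand-in for generate_player_table, fed a name -> count dict."""
    # Data rows are sorted by appearances, highest first.
    rows = sorted(appearances.items(), key=lambda item: item[1], reverse=True)
    total = sum(n for _, n in rows)

    header = [["Player", "Appearances", "Spent"]]
    body = [[name, n, f"£{n * COST_PER_QUIZ:.2f}"] for name, n in rows]
    footer = [["Total", total, f"£{total * COST_PER_QUIZ:.2f}"]]
    return header + body + footer


# Made-up counts, only to show the slicing convention.
table = build_table({"Sam": 8, "Jay": 21, "Drew": 3})

header_row = table[0]    # column headings
data_rows = table[1:-1]  # one row per player
footer_row = table[-1]   # totals row
```

Keeping the footer as the last element means the template can separate it with `[1:-1]` and `[-1]` instead of carrying a second variable, at the cost of every consumer needing to know the convention.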
diff --git a/data.csv b/data.csv index d6178ec..3911e84 100644 --- a/data.csv +++ b/data.csv @@ -1,22 +1,24 @@ -Date,Absolute Position,Relative Position,Number of Players,Number of Teams,Points on Scattergories,Ciaran,Jay,Sam,Drew,Theo,Tom,Ellora,Chloe,Jamie,Christine -17/03/2025,9,0.692,2,13,10,1,1,0,0,0,0,0,0,0,0 -24/03/2025,14,0.933,2,15,4,1,1,0,0,0,0,0,0,0,0 -31/03/2025,8,0.444,4,18,7,1,1,1,1,0,0,0,0,0,0 -07/04/2025,9,0.563,2,16,6,1,1,0,0,0,0,0,0,0,0 -28/04/2025,9,0.529,3,17,5,1,1,1,0,0,0,0,0,0,0 -26/05/2025,6,0.5,2,12,8,0,1,0,0,1,0,0,0,0,0 -02/06/2025,12,0.857,3,14,4,1,1,1,0,0,0,0,0,0,0 -16/06/2025,6,0.545,3,11,6,1,1,1,0,0,0,0,0,0,0 -30/06/2025,5,0.833,4,6,8,1,1,1,0,0,1,0,0,0,0 -14/07/2025,5,0.625,3,8,8,1,1,0,0,0,0,1,0,0,0 -21/07/2025,4,0.5,2,8,9,1,1,0,0,0,0,0,0,0,0 -28/07/2025,6,0.375,4,16,4,1,1,1,0,0,0,1,0,0,0 -04/08/2025,4,0.4,3,10,11,1,1,0,0,0,0,1,0,0,0 -15/09/2025,4,0.333,4,12,5,1,1,1,0,0,0,1,0,0,0 -22/09/2025,5,0.417,3,12,5,1,0,1,0,0,1,0,0,0,0 -29/09/2025,8,0.727,2,11,5,0,1,0,1,0,0,0,0,0,0 -24/11/2025,8,0.889,3,9,6,1,1,0,0,0,0,1,0,0,0 -05/01/2026,4,0.5,4,8,6,1,1,0,0,0,0,0,1,1,0 -26/01/2026,7,0.583,2,12,10,1,1,0,0,0,0,0,0,0,0 -02/02/2026,7,0.7,2,10,5,1,1,0,0,0,0,0,0,0,0 -16/02/2026,9,0.5,3,18,6,0,1,0,1,0,0,0,0,0,1 \ No newline at end of file +Date,Absolute Position,Relative Position,Number of Players,Number of Teams,Points on Scattergories,Ciaran,Jay,Sam,Drew,Theo,Tom,Ellora,Chloe,Jamie,Christine,Mide +17/03/2025,9,0.692,2,13,10,1,1,0,0,0,0,0,0,0,0,0 +24/03/2025,14,0.933,2,15,4,1,1,0,0,0,0,0,0,0,0,0 +31/03/2025,8,0.444,4,18,7,1,1,1,1,0,0,0,0,0,0,0 +07/04/2025,9,0.563,2,16,6,1,1,0,0,0,0,0,0,0,0,0 +28/04/2025,9,0.529,3,17,5,1,1,1,0,0,0,0,0,0,0,0 +26/05/2025,6,0.5,2,12,8,0,1,0,0,1,0,0,0,0,0,0 +02/06/2025,12,0.857,3,14,4,1,1,1,0,0,0,0,0,0,0,0 +16/06/2025,6,0.545,3,11,6,1,1,1,0,0,0,0,0,0,0,0 +30/06/2025,5,0.833,4,6,8,1,1,1,0,0,1,0,0,0,0,0 +14/07/2025,5,0.625,3,8,8,1,1,0,0,0,0,1,0,0,0,0 +21/07/2025,4,0.5,2,8,9,1,1,0,0,0,0,0,0,0,0,0 
+28/07/2025,6,0.375,4,16,4,1,1,1,0,0,0,1,0,0,0,0 +04/08/2025,4,0.4,3,10,11,1,1,0,0,0,0,1,0,0,0,0 +15/09/2025,4,0.333,4,12,5,1,1,1,0,0,0,1,0,0,0,0 +22/09/2025,5,0.417,3,12,5,1,0,1,0,0,1,0,0,0,0,0 +29/09/2025,8,0.727,2,11,5,0,1,0,1,0,0,0,0,0,0,0 +24/11/2025,8,0.889,3,9,6,1,1,0,0,0,0,1,0,0,0,0 +05/01/2026,4,0.5,4,8,6,1,1,0,0,0,0,0,1,1,0,0 +26/01/2026,7,0.583,2,12,10,1,1,0,0,0,0,0,0,0,0,0 +02/02/2026,7,0.7,2,10,5,1,1,0,0,0,0,0,0,0,0,0 +16/02/2026,9,0.5,3,18,6,0,1,0,1,0,0,0,0,0,1,0 +23/02/2026,6,0.6,2,10,10,1,1,0,0,0,0,0,0,0,0,0 +09/03/2026,6,1,5,6,10,1,1,1,0,0,0,0,1,0,0,1 \ No newline at end of file diff --git a/src/app.py b/src/app.py index c2b1f0d..26ae5fc 100644 --- a/src/app.py +++ b/src/app.py @@ -2,10 +2,10 @@ from flask import Flask, render_template import pandas as pd import plotly.express as px import plotly.graph_objects as go +import plotly.utils import statsmodels.api as sm import numpy as np import datetime -from sklearn.linear_model import LinearRegression import json from stats import generate_stats from player_table import generate_player_table @@ -13,6 +13,9 @@ import constants app = Flask(__name__) +# --------------------------------------------------------------------------- +# Data loading +# --------------------------------------------------------------------------- def get_data_frame(filename): df = pd.read_csv(filename) @@ -22,236 +25,383 @@ def get_data_frame(filename): def build_hovertext(df, attendance_columns): - return df[attendance_columns].apply( - lambda row: ", ".join( - [ - player - for player in attendance_columns - if row[player] == 1 - ] - ) or "No attendance", - axis=1 + present = [c for c in attendance_columns if c in df.columns] + return df[present].apply( + lambda row: ", ".join(p for p in present if row[p] == 1) or "No attendance", + axis=1, ) -def generate_weekly_attendance_calendar(df): - # Compute ISO year/week and attendance - df["Year"] = df["Date"].dt.isocalendar().year - df["Week"] = df["Date"].dt.isocalendar().week +# 
--------------------------------------------------------------------------- +# Charts +# --------------------------------------------------------------------------- - attendee_columns = [ - col for col in df.columns if col not in { - "Date", "Relative Position", "Number of Players", - "Number of Teams", "Attendees", "Year", "Week", "Year-Week" - } - ] - - df["Attended"] = df[attendee_columns].sum(axis=1) > 0 - weekly_attendance = df.groupby(["Year", "Week"])[ - "Attended"].any().astype(int).reset_index() - - # Build full year/week grid - all_years = sorted(df["Year"].unique()) - all_weeks = list(range(1, 53)) - - grid = [] - for year in all_years: - for week in all_weeks: - grid.append({"Year": year, "Week": week}) - - calendar = pd.DataFrame(grid) - calendar = calendar.merge(weekly_attendance, on=[ - "Year", "Week"], how="left").fillna(0) - calendar["Attended"] = calendar["Attended"].astype(int) - - # Plot - fig = go.Figure(data=go.Heatmap( - x=calendar["Week"], - y=calendar["Year"], - z=calendar["Attended"], - colorscale=constants.ATTENDANCE_COLORSCHEME, - zmin=0, - zmax=1, - showscale=False - )) - - fig.update_layout( - title="Pub Quiz Attendance Calendar (Weekly)", - xaxis_title="Week Number", - yaxis_title="Year", - xaxis=dict(tickmode="linear", dtick=4), - template="plotly_white", - height=180 + len(all_years) * 40 - ) - - return fig - - -def generate_relative_position_over_time(df): +def generate_position_trend(df): + """ + Line chart of relative position percentile over time (lower is better). + Overlays a 5-game rolling average and an extended OLS trendline + projected to the top-8th-percentile target. 
+ """ + df = df.copy() df["Date_ordinal"] = df["Date"].map(pd.Timestamp.toordinal) + df["Relative Percentile"] = df["Relative Position"] * 100 + df["Rolling Avg (5)"] = df["Relative Percentile"].rolling(5, min_periods=1).mean() + df["Attendees"] = build_hovertext(df, constants.PLAYER_NAME_COLUMNS) X = sm.add_constant(df["Date_ordinal"]) - y = df["Relative Position"] - - model = sm.OLS(y, X).fit() - df["BestFit"] = model.predict(X) - + model = sm.OLS(df["Relative Percentile"], X).fit() intercept = model.params["const"] slope = model.params["Date_ordinal"] - target_value = 0.08 + target_percentile = 8.0 + min_ord = df["Date_ordinal"].min() + max_ord = df["Date_ordinal"].max() - predicted_ordinal = (target_value - intercept) / slope + predicted_ordinal = None + if slope < 0: + predicted_ordinal = (target_percentile - intercept) / slope - min_ordinal = df["Date_ordinal"].min() - max_ordinal = df["Date_ordinal"].max() + end_ord = max(max_ord, predicted_ordinal) if predicted_ordinal and predicted_ordinal > max_ord else max_ord - if predicted_ordinal > max_ordinal: - extended_ordinals = np.linspace(min_ordinal, predicted_ordinal, 100) - else: - extended_ordinals = np.linspace(min_ordinal, max_ordinal, 100) + extended_ords = np.linspace(min_ord, end_ord, 200) + extended_percentile = intercept + slope * extended_ords + extended_dates = [datetime.date.fromordinal(int(x)) for x in extended_ords] - extended_bestfit = intercept + slope * extended_ordinals + fig = go.Figure() - extended_dates = [datetime.date.fromordinal( - int(x)) for x in extended_ordinals] + fig.add_scatter( + x=df["Date"], + y=df["Relative Percentile"], + mode="lines+markers", + name="Result", + line=dict(color="#1e3a8a", width=1.5), + marker=dict(size=6, color="#1e3a8a"), + customdata=df["Attendees"], + hovertemplate="%{x|%d %b %Y}<br>Relative percentile: %{y:.0f}th<br>Squad: %{customdata}", + ) - df["Attendees"] = build_hovertext(df, constants.PLAYER_NAME_COLUMNS) - - fig = px.line( - df, - x="Date", - y="Relative Position", - title="Quiz Position Over Time with Extended Trendline", - hover_data={"Attendees": True} + fig.add_scatter( + x=df["Date"], + y=df["Rolling Avg (5)"], + mode="lines", + name="5-Game Avg", + line=dict(color="#f59e0b", width=2.5), + hovertemplate="%{x|%d %b %Y}<br>5-Game Avg: %{y:.0f}%", ) fig.add_scatter( x=extended_dates, - y=extended_bestfit, + y=extended_percentile, mode="lines", - name="Extended Trendline", - line=dict(dash="dot", color="red") + name="Trend", + line=dict(dash="dot", color="#dc2626", width=1.5), + hoverinfo="skip", ) - fig.update_yaxes(range=[0, 1], tickformat=".2f") + if predicted_ordinal and predicted_ordinal > max_ord: + target_date = datetime.date.fromordinal(int(predicted_ordinal)) + fig.add_annotation( + x=target_date, + y=target_percentile, + text=f"8th percentile target: {target_date.strftime('%b %Y')}", + showarrow=True, + arrowhead=2, + font=dict(size=11, color="#dc2626"), + ) + + fig.add_hline( + y=50, + line_dash="dot", + line_color="#9ca3af", + annotation_text="50th percentile", + annotation_position="bottom right", + ) + + fig.update_layout( + title="Relative Position Over Time", + xaxis_title="Date", + yaxis=dict(title="Relative Position Percentile (lower is better)", range=[0, 100], ticksuffix="th"), + template="plotly_white", + legend=dict(orientation="h", yanchor="bottom", y=1.02, xanchor="right", x=1), + hovermode="x unified", + margin=dict(t=60, b=40), + ) return fig -def generate_visualisations(df): - feature_columns = [ - col - for col in df.columns - if col in constants.FEATURE_COLUMNS +def generate_player_impact(df): + """ + Horizontal bar chart: average relative position percentile when each player attends. + Only shows players with >= 3 appearances. + Green bar = lower than overall average (better); red = higher (worse). 
+ """ + MIN_APPEARANCES = 3 + overall_percentile = df["Relative Position"].mean() * 100 + + rows = [] + for name in constants.PLAYER_NAME_COLUMNS: + if name not in df.columns: + continue + attended = df[df[name] == 1] + n = len(attended) + if n >= MIN_APPEARANCES: + percentile = attended["Relative Position"].mean() * 100 + rows.append({"Player": name, "Relative Percentile": round(percentile, 1), "Games": n}) + + if not rows: + return go.Figure() + + impact_df = pd.DataFrame(rows).sort_values("Relative Percentile", ascending=True) + colors = [ + "#16a34a" if p <= overall_percentile else "#dc2626" + for p in impact_df["Relative Percentile"] ] - x = df[feature_columns] - y = df["Relative Position"] - model = LinearRegression() - model.fit(x, y) - - plots = {} - - plots["relative_pos_over_time"] = json.dumps( - generate_relative_position_over_time(df), - cls=plotly.utils.PlotlyJSONEncoder + fig = go.Figure( + go.Bar( + x=impact_df["Relative Percentile"], + y=impact_df["Player"], + orientation="h", + marker_color=colors, + text=[ + f"{constants.ordinal(round(p))} ({g} games)" + for p, g in zip(impact_df["Relative Percentile"], impact_df["Games"]) + ], + textposition="outside", + hovertemplate="%{y}<br>Avg relative percentile: %{x:.1f}th", + ) ) - df_line = df.melt( - id_vars="Date", - value_vars=["Absolute Position", "Number of Teams"], - var_name="Metric", - value_name="Value" - ) - fig11 = px.line( - df_line, - x='Date', - y='Value', - color='Metric', - title='Absolute Position and Total Number of Teams Over Time' - ) - plots["absolute_pos_over_time"] = json.dumps( - fig11, cls=plotly.utils.PlotlyJSONEncoder + fig.add_vline( + x=overall_percentile, + line_dash="dot", + line_color="#6b7280", + annotation_text=f"Overall avg ({constants.ordinal(round(overall_percentile))})", + annotation_position="top right", ) -# 2. Number of players vs position with regression line - fig2 = px.scatter( + fig.update_layout( + title="Who Helps Most - Avg. Relative Position When Attending", + xaxis=dict( + title="Avg. Relative Position Percentile (lower is better)", + range=[0, 100], + ticksuffix="th", + ), + yaxis=dict(title="", autorange="reversed"), + template="plotly_white", + showlegend=False, + height=max(300, len(rows) * 52), + margin=dict(t=60, b=40, r=20), + ) + return fig + + +def generate_scattergories_chart(df): + """ + Scatter of Scattergories points vs relative position percentile with OLS trendline. + Negative slope = scoring more in Scattergories correlates with better finish. 
+ """ + df = df.copy() + df["Relative Percentile"] = df["Relative Position"] * 100 + + fig = px.scatter( df, - x="Number of Players", - y="Relative Position", + x="Points on Scattergories", + y="Relative Percentile", trendline="ols", - title="Players vs Position (%)", + title="Scattergories vs Relative Position", + labels={ + "Points on Scattergories": "Scattergories Points", + "Relative Percentile": "Relative Position Percentile (lower is better)", + }, + hover_data={"Relative Percentile": ":.1f"}, ) - fig2.update_xaxes(dtick=1) - plots["players_vs_position"] = json.dumps( - fig2, cls=plotly.utils.PlotlyJSONEncoder) + fig.update_traces( + marker=dict(color="#1e3a8a", size=9, opacity=0.8), + selector=dict(mode="markers"), + ) + fig.update_traces( + line=dict(color="#dc2626", dash="dot", width=2), + selector=dict(type="scatter", mode="lines"), + ) + fig.update_layout( + template="plotly_white", + yaxis=dict(ticksuffix="th", range=[0, 100]), + xaxis=dict(dtick=1), + margin=dict(t=60, b=40), + ) + return fig -# 3. 
Player participation heatmap - df_players = df[constants.PLAYER_NAME_COLUMNS] - fig3 = px.imshow( + +def generate_player_participation(df): + """Heatmap of which player attended which game.""" + player_cols = [c for c in constants.PLAYER_NAME_COLUMNS if c in df.columns] + df_players = df[player_cols] + fig = px.imshow( df_players.T, - labels=dict(x="Games", y="Player", color="Present"), - title="Player Participation Heatmap", + labels=dict(x="Game", y="Player", color="Attended"), + title="Player Attendance by Game", color_continuous_scale=constants.ATTENDANCE_COLORSCHEME, zmin=0, zmax=1, - aspect="auto" + aspect="auto", ) - fig3.update_coloraxes( + fig.update_coloraxes( colorbar=dict( tickvals=[0, 1], ticktext=["Absent", "Present"], lenmode="pixels", - len=300, + len=200, ) ) - fig3.update_layout( - template="seaborn", - height=600, + fig.update_layout( + template="plotly_white", + height=max(300, len(player_cols) * 40 + 100), yaxis=dict( tickmode="array", - tickvals=list(range(len(df_players.columns))), - ticktext=df_players.columns + tickvals=list(range(len(player_cols))), + ticktext=player_cols, + ), + margin=dict(t=60, b=40), + ) + return fig + + +def generate_weekly_attendance_calendar(df): + """Compact weekly attendance heatmap (13-column grid blocks per year).""" + df = df.copy() + df["Year"] = df["Date"].dt.isocalendar().year + df["Week"] = df["Date"].dt.isocalendar().week + + attendee_columns = [ + col + for col in df.columns + if col + not in { + "Date", + "Relative Position", + "Number of Players", + "Number of Teams", + "Attendees", + "Year", + "Week", + "Year-Week", + "Absolute Position", + "Points on Scattergories", + } + ] + + df["Attended"] = (df[attendee_columns].sum(axis=1) > 0).astype(int) + weekly = df.groupby(["Year", "Week"])["Attended"].max().reset_index() + + # Build a compact matrix: 13 columns, 4 (or 5 for week 53) rows per year. 
+ all_years = sorted(df["Year"].unique()) + max_week = int(df["Week"].max()) + rows_per_year = 5 if max_week == 53 else 4 + + grid_rows = [] + for year in all_years: + for block in range(1, rows_per_year + 1): + for col in range(1, 14): + week = (block - 1) * 13 + col + if week > 53: + continue + grid_rows.append( + { + "Year": year, + "Block": block, + "Col": col, + "Week": week, + } + ) + + calendar = pd.DataFrame(grid_rows).merge( + weekly, + on=["Year", "Week"], + how="left", + ) + calendar["Attended"] = calendar["Attended"].fillna(0).astype(int) + + y_labels = [] + for year in all_years: + for block in range(1, rows_per_year + 1): + start = (block - 1) * 13 + 1 + end = min(block * 13, 53) + y_labels.append(f"{year} · W{start}-{end}") + + calendar["RowLabel"] = calendar.apply( + lambda r: f"{int(r['Year'])} · W{(int(r['Block']) - 1) * 13 + 1}-{min(int(r['Block']) * 13, 53)}", + axis=1, + ) + + z_matrix = [] + text_matrix = [] + for label in y_labels: + row = calendar[calendar["RowLabel"] == label].sort_values("Col") + z_matrix.append(row["Attended"].tolist()) + text_matrix.append([f"ISO week {int(w)}" for w in row["Week"]]) + + fig = go.Figure( + data=go.Heatmap( + x=list(range(1, 14)), + y=y_labels, + z=z_matrix, + text=text_matrix, + hovertemplate="%{y}<br>Column %{x}<br>%{text}<br>Attended: %{z}", + colorscale=constants.ATTENDANCE_COLORSCHEME, + zmin=0, + zmax=1, + showscale=False, + xgap=3, + ygap=3, ) ) - plots["player_participation"] = json.dumps( - fig3, cls=plotly.utils.PlotlyJSONEncoder) -# 4. Calendar view - plots["calendar"] = json.dumps( - generate_weekly_attendance_calendar(df), - cls=plotly.utils.PlotlyJSONEncoder + fig.update_layout( + title="Weekly Attendance Calendar", + xaxis=dict(title="Week Column (1-13)", tickmode="linear", dtick=1), + yaxis_title="Year / Week Range", + template="plotly_white", + height=180 + len(y_labels) * 34, + margin=dict(t=60, b=40), ) + return fig -# 5. Coefficient bar chart - coefficients = pd.Series(model.coef_, index=x.columns).sort_values() - fig5 = px.bar( - coefficients, - orientation="h", - labels={"value": "Coefficient", "index": "Feature"}, - title="Linear Regression Coefficients", - ) - plots["coefficients"] = json.dumps( - fig5, cls=plotly.utils.PlotlyJSONEncoder) - return plots +# --------------------------------------------------------------------------- +# Visualisation bundle +# --------------------------------------------------------------------------- +def generate_visualisations(df): + enc = plotly.utils.PlotlyJSONEncoder + return { + "position_trend": json.dumps(generate_position_trend(df), cls=enc), + "player_impact": json.dumps(generate_player_impact(df), cls=enc), + "scattergories_vs_position": json.dumps(generate_scattergories_chart(df), cls=enc), + "player_participation": json.dumps(generate_player_participation(df), cls=enc), + "calendar": json.dumps(generate_weekly_attendance_calendar(df), cls=enc), + } + + +# --------------------------------------------------------------------------- +# Routes +# --------------------------------------------------------------------------- @app.route("/") def index(): df = get_data_frame("data.csv") - stats = generate_stats(df) + stats, highlights = generate_stats(df) player_table = generate_player_table(df) plots = generate_visualisations(df) 
return render_template( "index.html", plots=plots, stats=stats, - player_table=player_table + highlights=highlights, + player_table=player_table, ) if __name__ == "__main__": - import plotly app.run(debug=True) diff --git a/src/constants.py b/src/constants.py index 8b86a46..a2e4568 100644 --- a/src/constants.py +++ b/src/constants.py @@ -1,3 +1,13 @@ +def ordinal(n): + """Return an ordinal string for integer n, e.g. 1 → '1st', 22 → '22nd'.""" + n = int(n) + if 11 <= (n % 100) <= 13: + suffix = "th" + else: + suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th") + return f"{n}{suffix}" + + PLAYER_NAME_COLUMNS = [ "Ciaran", "Jay", @@ -8,7 +18,8 @@ PLAYER_NAME_COLUMNS = [ "Ellora", "Chloe", "Jamie", - "Christine" + "Christine", + "Mide", ] FEATURE_COLUMNS = { @@ -24,9 +35,10 @@ FEATURE_COLUMNS = { "Ellora", "Chloe", "Jamie", - "Christine" + "Christine", + "Mide", } ATTENDANCE_COLORSCHEME = [ - [0, "#E4ECF6"], [1, "#636EFA"] + [0, "#E4ECF6"], [1, "#1e3a8a"] ] diff --git a/src/player_table.py b/src/player_table.py index edc2d28..8a7460f 100644 --- a/src/player_table.py +++ b/src/player_table.py @@ -1,20 +1,25 @@ -from constants import PLAYER_NAME_COLUMNS +from constants import PLAYER_NAME_COLUMNS, ordinal def generate_player_table(df): - header = [["Name", "Appearances", "Spent"]] + header = [["Player", "Appearances", "Avg. 
Relative Percentile", "Spent"]] + + player_stats = [] + for name in PLAYER_NAME_COLUMNS: + if name in df.columns: + attended = df[df[name] == 1] + n = len(attended) + if n > 0: + avg_rel = attended["Relative Position"].mean() + player_stats.append((name, n, avg_rel)) - player_stats = [ - (name, df[name].sum()) - for name in df.columns - if name in PLAYER_NAME_COLUMNS - ] player_stats.sort(key=lambda x: x[1], reverse=True) - total = sum(n for name, n in player_stats) - body = [ - [name, n, f"£{n * 3:.2f}"] - for name, n in player_stats - ] - footer = [["Total", total, f"£{total * 3:.2f}"]] + total = sum(n for _, n, _ in player_stats) - return (header + body + footer) + body = [ + [name, n, ordinal(round(avg_rel * 100)), f"£{n * 3:.2f}"] + for name, n, avg_rel in player_stats + ] + footer = [["Total", total, "—", f"£{total * 3:.2f}"]] + + return header + body + footer diff --git a/src/stats.py b/src/stats.py index 36a6d39..a6dc8bf 100644 --- a/src/stats.py +++ b/src/stats.py @@ -1,53 +1,92 @@ import constants +from constants import ordinal -def get_max_team_streak(dates): +def _max_team_streak(dates): max_streak = current_streak = 1 for i in range(1, len(dates)): - if (dates[i] - dates[i-1]).days == 7: + if (dates[i] - dates[i - 1]).days == 7: current_streak += 1 else: current_streak = 1 - max_streak = max(current_streak, max_streak) - - # TODO: Show dates - return f"{max_streak} weeks" + max_streak = max(max_streak, current_streak) + return max_streak -def get_max_player_streak(df): - names = [col for col in df.columns if col in constants.PLAYER_NAME_COLUMNS] - max_streak = 1 - max_name = names[0] +def _max_player_streak(df): + names = [col for col in df.columns if col in set(constants.PLAYER_NAME_COLUMNS)] + max_streak, max_name = 1, names[0] for name in names: - local_max = 1 - current = 0 - for attendance in df[name]: - if attendance: + local_max = current = 0 + for att in df[name]: + if att: current += 1 else: local_max = max(local_max, current) current = 0 - 
local_max = max(local_max, current) if local_max > max_streak: - max_streak = local_max - max_name = name - - return f"{max_streak} ({max_name})" + max_streak, max_name = local_max, name + return max_streak, max_name def generate_stats(df): - stats = {} - n = len(df["Date"]) + n = len(df) + avg_rel = df["Relative Position"].mean() + last5_avg = df.tail(5)["Relative Position"].mean() + top_half = int((df["Relative Position"] < 0.5).sum()) - stats["Number of quizes played"] = n - avg_players = df["Number of Players"].sum() / n - stats["Average number of players"] = f"{avg_players:.2f}" - avg_teams = df["Number of Teams"].sum() / n - stats["Average number of teams"] = f"{avg_teams:.2f}" - p = df["Relative Position"].sum()*100 / n - stats["Average relative position"] = f"{p:.2f}th percentile" - stats["Max consecutive week streak"] = get_max_team_streak(df["Date"]) - stats["Max consecutive player streak"] = get_max_player_streak(df) + best_rel_idx = df["Relative Position"].idxmin() + best_abs = int(df.loc[best_rel_idx, "Absolute Position"]) + best_teams = int(df.loc[best_rel_idx, "Number of Teams"]) - return stats + team_streak = _max_team_streak(list(df["Date"])) + player_streak, streak_player = _max_player_streak(df) + + improving = last5_avg < avg_rel + form_arrow = "↓" if improving else "↑" + form_label = "improving" if improving else "declining" + + highlights = [ + { + "label": "Quizzes Played", + "value": str(n), + "detail": f"Since {df['Date'].iloc[0].strftime('%b %Y')}", + }, + { + "label": "Avg. 
Relative Percentile", + "value": ordinal(round(avg_rel * 100)), + "detail": "Lower = better finishing position", + }, + { + "label": "Best Finish", + "value": f"{ordinal(best_abs)} place", + "detail": f"Out of {best_teams} teams", + }, + { + "label": "Recent Form", + "value": ordinal(round(last5_avg * 100)), + "detail": f"{form_arrow} {form_label} vs overall (lower is better)", + }, + { + "label": "Top-Half Finishes", + "value": str(top_half), + "detail": f"{top_half / n * 100:.0f}% of all games", + }, + { + "label": "Best Streak", + "value": f"{team_streak} wks", + "detail": "Consecutive weeks attended", + }, + ] + + stats = { + "Top-half finishes": f"{top_half} of {n} ({top_half / n * 100:.0f}%)", + "Average teams per night": f"{df['Number of Teams'].mean():.1f}", + "Average squad size": f"{df['Number of Players'].mean():.1f} players", + "Average Scattergories score": f"{df['Points on Scattergories'].mean():.1f} pts", + "Best Scattergories score": f"{int(df['Points on Scattergories'].max())} pts", + "Longest individual streak": f"{player_streak} weeks ({streak_player})", + } + + return stats, highlights diff --git a/src/templates/index.html b/src/templates/index.html index ce87fd0..26707bf 100644 --- a/src/templates/index.html +++ b/src/templates/index.html @@ -1,79 +1,125 @@ - + - Pub Quiz Dashboard - - + + + The Hope Pub Quiz + + - -

📊 Pub Quiz Dashboard

+ -
-

Stats

-
-
-
    - {% for key, data in stats.items() %} -
  • - {{ key }}: - - {{ data }} -
  • - {% endfor %} -
-
-
+ +
+
+

Performance Dashboard

+

🍺 The Hope Pub Quiz

+

Trying to convince ourselves we're not stupid

+
-
- - - - {% for heading in player_table[0] %} - - {% endfor %} - - - - {% for row in player_table[1:] %} - - {% for data in row %} - - {% endfor %} - - {% endfor %} - -
- {{ heading }} -
- {{ data }} -
-
+
-
-

Plots

- {% for key, plot in plots.items() %} -
-
- + +
+

At a Glance

+
+ {% for h in highlights %} +
+ {{ h.value }} + {{ h.label }} + {{ h.detail }}
{% endfor %} +
+
+ + +
+ + +
+

More Stats

+
+ {% for key, value in stats.items() %} +
+
{{ key }}
+
{{ value }}
+
+ {% endfor %} +
+
+ + +
+

Squad

+ + + + {% for heading in player_table[0] %} + + {% endfor %} + + + + {% for row in player_table[1:-1] %} + + {% for data in row %} + + {% endfor %} + + {% endfor %} + + + + {% for data in player_table[-1] %} + + {% endfor %} + + +
+ {{ heading }} +
+ {{ data }} +
+ {{ data }} +
+
+ +
+ + +
+

Charts

+
+ {% for key, plot in plots.items() %} +
+
+ +
+ {% endfor %} +
+
+ +
+ + +