*** Board ***
Longevity Board of Advisors – Agent PoC Roadmap
Goal: Offline, open-source, AI-powered “mini-Board” that turns Martin’s latest biomarkers into a 2-sentence, evidence-based next step in < 5 s.
Scope: 4-week delivery split into four incremental releases (MVP → Alpha → Beta → v1.0). Each milestone is self-contained and demo-ready.
Release 0 – “Friday-Night MVP” (Days 1-2)
Purpose: Prove the concept end-to-end with zero cloud spend.
Task | Owner | Deliverable | Done? |
---|---|---|---|
1. Local LLM runtime | Dev | llama-cpp-python + 4-bit Mistral-7B GGUF running on CPU |
☐ |
2. Universal agent template | Dev | agents/template.py (100 lines) – only SYSTEM prompt changes |
☐ |
3. 3 Founding agents | Dev | sinclair.py , kirkland.py , fahy.py |
☐ |
4. Orchestrator (Natasha) | Dev | board.py – sequential call + append to Board_Minutes.md |
☐ |
5. Input stub | Dev | data/biomarkers.md (markdown table of last quarter) |
☐ |
6. Runbook | Dev | One-liner: python board.py → console + markdown log |
☐ |
Exit criteria:
- Repo public; README shows screenshot of terminal output.
- Recommendation prints in < 5 s on M1/8 GB RAM.
Release 1 – “Alpha” (Week 1)
Purpose: Add consensus layer + single-file executable.
Task | Owner | Deliverable | Done? |
---|---|---|---|
1. Consensus agent | Dev | agents/natasha.py → 2-sentence action plan |
☐ |
2. Rich CLI | Dev | Colour-coded agents; progress spinner | ☐ |
3. Package | Dev | pyproject.toml → pip install -e . creates longevity-board cmd |
☐ |
4. GitHub release | Dev | Tag v0.1.0 + zipped binaries for macOS & Win |
☐ |
Exit criteria:
- Non-technical user downloads release, double-clicks → sees neat plan.
Release 2 – “Beta” (Weeks 2-3)
Purpose: Memory, tools, and first biomarker math.
Task | Owner | Deliverable | Done? |
---|---|---|---|
1. ChromaDB memory | Dev | Embed every past minute; retrieve 3 most similar cases before each prompt | ☐ |
2. Calculator tool | Dev | tools/calculator.py – LLM writes equation → safe-eval with asteval |
☐ |
3. Biomarker delta tool | Dev | Auto-calculate Δ% vs last quarter | ☐ |
4. YAML config | Dev | config.yaml – model path, temperature, agent list |
☐ |
5. Unit tests | Dev | pytest ≥ 80 % coverage on tools & orchestrator | ☐ |
Exit criteria:
- CI green; demo shows retrieved past case influencing current advice.
Release 3 – “v1.0” (Week 4)
Purpose: GUI, multi-model support, dockerised.
Task | Owner | Deliverable | Done? |
---|---|---|---|
1. Gradio UI | Dev | Drag-and-drop CSV → instant consensus card (markdown + download) | ☐ |
2. Ollama backend | Dev | Support any GGUF in ~/.ollama ; fallback to bundled model |
☐ |
3. Docker image | Dev | docker run -p 7860:7860 longevity-board |
☐ |
4. Security audit | Dev | bandit + safety pass; no arbitrary code exec outside calculator |
☐ |
5. Docs | Dev | MkDocs site – user guide, developer guide, roadmap | ☐ |
6. Board sign-off | Martin | Run live with latest panel; approve accuracy vs human Board Slack thread | ☐ |
Exit criteria:
- Public v1.0 tag; UI hosted locally on Martin’s NUC; docker-compose stacks for future cloud move.
Future Backlog (post-v1.0, priority order)
Tool-calling suite
- Search PubMed (
tools/pubmed.py
) – top-3 abstracts injected into context. - Wolfram Alpha for unit conversions (nmol/L ↔ ng/dL).
- Search PubMed (
Multi-agent debate loop
- Round-robin critique: each agent refutes previous advice, Natasha summarises.
- Round-robin critique: each agent refutes previous advice, Natasha summarises.
Persistent user profile
- SQLite or Supabase schema for interventions, doses, outcomes.
- SQLite or Supabase schema for interventions, doses, outcomes.
Auto-scheduler
- Read calendar → propose fastest compatible lab draw date; export .ics reminder.
- Read calendar → propose fastest compatible lab draw date; export .ics reminder.
Cloud optional mode
- Deploy as Fly.io micro-app with auth, still keeping full local fallback.
Roles & Rituals
Role | Who | Duty |
---|---|---|
Product Owner | Martin Carroll | Accepts/rejects stories, owns data/biomarkers.md |
Tech Lead | (assign) | Keeps roadmap updated, cuts releases |
QA | (assign) | Signs off exit criteria, maintains test suite |
Board Experts | Sinclair, Kirkland… | Review SYSTEM prompts every quarter for scientific drift |
Cadence: 15-min stand-up Mon & Thu; retro at each release tag.
Metrics of Success
Metric | MVP | v1.0 Target |
---|---|---|
Time-to-recommendation | < 5 s | < 2 s |
Hallucination rate* | ≤ 15 % | ≤ 5 % |
Unit-test coverage | 0 % | ≥ 80 % |
Docker image size | n/a | < 800 MB |
UI uptime (local) | n/a | 99 % over 7 d |
*Sample 20 outputs, human Board flags unreferenced or false claim.