FixFirst Edge — Offline multimodal maintenance copilot

Why this exists

Every competitor in the category assumes the cloud.

Cloud-first maintenance copilots fail when the signal drops. LLM-generated diagnostics cannot be traced back to a specific page. For pharma, defense, utilities, and air-gapped sites, shipping equipment data to a third-party model is a disqualifier. FixFirst Edge keeps the corpus, the embeddings, the index, and the retrieval on one laptop.

	Typical cloud CMMS (Augury, MaintainX, UpKeep, Fiix)	FixFirst Edge
Works offline	Degraded or read-only	Full functionality
Where queries run	Vendor cloud	Technician's laptop
Data leaves the device	Every query	Never
Diagnosis method	LLM-generated prose	Templated from retrieved rows
Answer traceability	Partial or none	Every answer cites a page
Pricing	$35–$200 / user / month	Free, MIT licensed

Get started

Three ways to try it. Pick the one that fits the time you have.

1 · No install · 2 minutes

Watch the demo

See the full offline pipeline answer a real fault code, classify a photo of a damaged part, and transcribe a voice note — all on a laptop with WiFi off.

youtu.be/eKGRRkdDurA →

2 · No install · in your browser

Try the live retrieval

Type a query against the real demo corpus right here on this page. The widget runs in your browser and reports the actual measured latency — no install, no signup.

3 · ~5 minutes · one command

Run it locally

Boot the full stack (Actian, backend, and frontend) with a single docker compose command. No Python venv, no Node install, no setup beyond Docker on your machine.

docker compose --profile full up

full quickstart on GitHub →

See it work

One query. Three panels of evidence. 843 milliseconds.

1 describe the problem

Type, upload, or speak

Error code, machine model, or symptom in plain English. Photo of a damaged part. Voice note. Any combination — all three feed the same retrieval pipeline.

typed

E04 motor overload on CX-200

2 find evidence

Two lanes in parallel

Dense ANN searches semantically across all 68 documents. A second lane applies exact-match identifier filters. Reciprocal rank fusion merges both.

dense ANN → 12 hits

id-filtered → 4 hits

→ RRF merge

3 read the answer

Three cited panels

Composed from rows, not written by an LLM. The relevant manual page, the closest past incident, and the likely replacement part — each with a traceable citation.

manual · incident · part

843ms · 0 B egress

Screenshot (localhost:3001, captured in airplane mode): the FixFirst Edge frontend showing the "OFFLINE - Ready locally" pill, a search bar containing "E04 motor overload", the image/WAV drop zone, and the diagnosis panel ready for input.

Under the hood

Expand any row for the depth.

Everything judges and skeptical CTOs need to verify the claims — collapsed by default so the page stays readable.

The three Actian features that make this workload possible

01

Named vectors

One collection, three embedding spaces: text_vec (384d), image_vec (512d), audio_text_vec (384d). A record may populate any subset of the three, and queries can retrieve across them.

vectors_config={
  "text_vec":       VectorParams(size=384),
  "image_vec":      VectorParams(size=512),
  "audio_text_vec": VectorParams(size=384),
}

02

Filtered search

Six keyword-indexed fields. Filters run inside the Actian engine rather than being post-filtered in Python.

builder = FilterBuilder()
builder.must(Field("doc_type").eq("manual"))
builder.must(Field("model_no").eq("CX-200"))

03

Hybrid RRF fusion

Dense ANN and identifier-filtered ANN run in parallel. Reciprocal rank fusion merges both.

# reciprocal rank fusion
score(d) = Σ 1 / (60 + rank_i(d))
# i ∈ {dense, id-filtered}
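
The fusion formula above can be sketched in plain Python. This is a minimal illustration, not the project's implementation: the function name rrf_merge and the sample document IDs are hypothetical, ranks are assumed to start at 1, and k = 60 is taken from the formula.

```python
from collections import defaultdict


def rrf_merge(lanes, k=60):
    """Fuse ranked ID lists with reciprocal rank fusion.

    lanes: list of ranked lists of document IDs, best hit first.
    Assumes 1-based ranks; a document missing from a lane adds nothing.
    """
    scores = defaultdict(float)
    for lane in lanes:
        for rank, doc_id in enumerate(lane, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical hit lists for the two lanes.
dense = ["manual-p14", "incident-07", "part-113", "incident-21"]
id_filtered = ["incident-07", "manual-p14", "errcode-E04"]

fused = rrf_merge([dense, id_filtered])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

Documents that appear in both lanes accumulate two terms and rise to the top, which is why an exact identifier match can outrank a hit that is only semantically close.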

Architecture: one process tree, bound to localhost

Architecture diagram (request path): all edges are localhost.

Benchmark: reproducible p50 843 ms, p95 1093 ms

End-to-end /api/diagnose latency, warmed, over 20 mixed queries (error codes, part numbers, symptom phrases, machine IDs) against the full demo corpus on a 16 GB consumer laptop, CPU-only, WSL2 + Docker.

p50 warm (3-run median): 843 ms

9 vector ops per query — query embed, dense ANN, id-filtered ANN, three filter-scoped retrievals, merge, score, render.

p95 warm (tightest run): 1093 ms

Corpus: 3 PDF manuals, 30 incidents, 25 parts, 13 error codes, 6 schematics, 5 voice notes. Tight cluster across 3 consecutive runs.

Reproduce with scripts/bench_diagnose.py.
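
For readers who want to sanity-check figures like these, here is a minimal sketch of computing p50 and p95 from a list of latency samples. It is not the contents of scripts/bench_diagnose.py: the helper percentile, the nearest-rank method, and the sample values are all assumptions made for illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Multiply before dividing so whole-number ranks stay exact.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Made-up latencies in milliseconds, standing in for 20 timed queries.
latencies_ms = [812, 798, 905, 830, 851, 779, 1010, 840, 866, 823,
                795, 947, 858, 801, 1120, 834, 872, 819, 888, 845]

p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
print(f"p50={p50} ms  p95={p95} ms")
```

With only 20 samples per run, p95 is effectively the second-slowest query, so a single outlier moves it noticeably; that is one reason to report several consecutive runs.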

Where the 843 ms goes

0 — 12 ms

Query embed

bge-small produces a 384-dim text vector on CPU.

12 — 198 ms

Two ANN lanes

Dense ANN across all docs + identifier-filtered ANN in parallel.

198 — 610 ms

RRF + slot fills

Reciprocal rank fusion, then three filter-scoped retrievals for manual, incident, part.

610 — 843 ms

Compose + render

Template populates from retrieved rows. No LLM in this step.

9 vector ops per user query, zero network calls, every step reproducible.
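
The stage boundaries above imply per-stage durations, worked out in this short sketch. The boundaries are copied from the timeline; the variable names are illustrative only.

```python
# Cumulative stage boundaries in milliseconds, copied from the timeline above.
stages = [
    ("Query embed", 0, 12),
    ("Two ANN lanes", 12, 198),
    ("RRF + slot fills", 198, 610),
    ("Compose + render", 610, 843),
]

total_ms = stages[-1][2] - stages[0][1]
durations = {name: end - start for name, start, end in stages}

# Print each stage's duration and its share of the total.
for name, ms in durations.items():
    print(f"{name:<18} {ms:>4} ms  {ms / total_ms:6.1%}")
```

On these numbers the fusion and slot-fill stage is the largest single slice at 412 ms, a little under half of the total.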

Open source posture: MIT, zero telemetry, self-hostable

MIT-licensed end to end — backend, frontend, ingestion, Docker compose. No usage reporting, no license server, no upstream the operator has to trust. The deployment is an artifact you can read in an afternoon.

Ridwannurudeen / fixfirst-edge

public

Full stack boot: docker compose up
Tests: 19 passing
License: MIT
Telemetry endpoints: 0

Run it

One command. Your laptop.

~/fixfirst-edge · start up bash

run once, from the repo root

./start.sh

Boots the Actian DB container, the backend, the corpus ingest on first run, and the frontend. Opens your browser. Subsequent runs skip anything already in place.

to stop

./stop.sh

Prerequisites

·Docker Desktop (Windows/macOS) or Docker Engine (Linux)
·Python 3.11+ with a venv at ~/ffe-venv
·Node.js 18+ on PATH
·~2 GB free disk for embedding models

Manual setup in the README →

Questions buyers ask

Can I ingest my own manuals and incident logs?

Yes. Drop PDFs into data/raw/manuals/, images into data/raw/images/, CSVs for incidents/parts/error codes into data/raw/. Re-run the bulk-ingest script.
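
As a rough illustration of that layout, the sketch below walks a data/raw/ tree and groups files by the kind of source they would feed. It is not the project's bulk-ingest script: scan_raw and the image extension list are assumptions, and only the three directory locations come from the answer above.

```python
from pathlib import Path

# Assumed image extensions; the real ingest script may accept others.
IMAGE_EXTS = {".png", ".jpg", ".jpeg"}


def scan_raw(root):
    """Group file names under root by source kind, following the layout described above."""
    root = Path(root)
    return {
        # PDF manuals live in manuals/.
        "manuals": sorted(p.name for p in (root / "manuals").glob("*.pdf")),
        # Photos and schematics live in images/.
        "images": sorted(
            p.name
            for p in (root / "images").glob("*")
            if p.suffix.lower() in IMAGE_EXTS
        ),
        # Incident, part, and error-code CSVs sit directly in the root.
        "tables": sorted(p.name for p in root.glob("*.csv")),
    }


if __name__ == "__main__":
    print(scan_raw("data/raw"))
```

Running a check like this before a re-ingest is a cheap way to confirm that new files landed in the directories the ingest step reads from.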

How large a corpus does this handle on a laptop?

The benchmarked corpus is small (3 PDF manuals, 30 incidents, 25 parts, 13 error codes, 6 schematics, 5 voice clips) because it's built for a hackathon. Actian VectorAI DB itself scales to hundreds of millions of vectors per collection; the bottleneck on a 16 GB laptop is embedding throughput, not retrieval. A ten-thousand-page manual corpus is well inside the envelope. Beyond that, move embedding to a GPU box and leave retrieval on the laptop.

Does it need a GPU?

No. bge-small, CLIP-ViT-B-32, and whisper tiny.en all run acceptably on CPU — the 843ms p50 number is from a 16 GB laptop, CPU-only, WSL2. A GPU helps during bulk ingest but is unnecessary for query-time retrieval.

Does this replace my CMMS?

No. A CMMS tracks work orders, schedules, and spares. FixFirst Edge sits next to it as the retrieval layer a technician opens during the fix. It complements a CMMS; it does not replace one.

What's the security posture for regulated environments?

Everything runs local. Zero outbound network calls in the diagnose path — verifiable in the browser network panel, or in a DMZ with egress blocked. No API keys, no license server, no telemetry. The Docker compose file is the entire supply chain.
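
One way to check the zero-egress claim from inside a Python test process is to make any non-loopback connection attempt fail loudly. The sketch below is an assumption about how such a check could look, not part of the repository: loopback_only and is_loopback are hypothetical helpers, and the guard only covers sockets opened through socket.socket.connect by Python code in the same process.

```python
import contextlib
import ipaddress
import socket


def is_loopback(host):
    """True for 'localhost' and loopback IP literals such as 127.0.0.1 or ::1."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # any other hostname is treated as external


@contextlib.contextmanager
def loopback_only():
    """Raise on any socket connect to a non-loopback address while active."""
    original_connect = socket.socket.connect

    def guarded_connect(self, address):
        # Unix-domain sockets pass a str/bytes path; only (host, port, ...) tuples are checked.
        if isinstance(address, tuple) and not is_loopback(address[0]):
            raise RuntimeError(f"outbound connection blocked: {address!r}")
        return original_connect(self, address)

    socket.socket.connect = guarded_connect
    try:
        yield
    finally:
        # Always restore the original method, even if the body raised.
        socket.socket.connect = original_connect
```

Wrapping a test of the diagnose path in `with loopback_only():` would turn a stray outbound call into an exception. The guard does not intercept connect_ex, connectionless sends, or DNS lookups, so it complements an egress-blocked network rather than replacing one.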

What does it cost?

Zero. MIT-licensed source, free Actian VectorAI DB for this embedded footprint, and the hardware is whatever laptop the technician already has. A cloud CMMS at $35–$200/user/month costs a ten-person team $4,200–$24,000 per year.

Retrieval that stays on the laptop.
