
Accessibility · AI · Native iOS

Fathom

An AI-powered indoor navigation app that gives blind and low-vision people independence in every building, from the first visit — designed by someone who lives the problem, built with agentic workflows that turned a solo designer into a shipping team.

Role: Creator · Designer · Builder
Company: Independent
Timeline: 6 weeks — concept to TestFlight (Feb–Mar 2026)
Team: Solo — design, research, engineering, product (with Claude Code as agentic pair)
23K+ lines of production Swift
6 weeks from concept to TestFlight
3 modes: Lookout · Snapshot · Go
83 source files across the full stack

The Work

253 million people worldwide live with visual impairment. GPS-powered navigation tools work outdoors, but they go silent the moment you step inside a building. Hospitals, airports, office buildings, transit stations — the indoor spaces where wayfinding matters most are exactly where existing tools fail. The alternatives are asking strangers, relying on sighted guides, or simply avoiding unfamiliar places entirely.

Fathom is a native iOS app that turns the phone camera into an AI-powered navigation companion — one that watches silently, speaks only when it matters, and guides users turn-by-turn through any indoor space without pre-loaded maps, installed beacons, or help from another person. It works anywhere, on the first visit.

This isn't a product I designed from the outside. I have Leber's Hereditary Optic Neuropathy — a condition that caused significant vision loss starting in childhood. I've navigated the world with impaired vision for most of my life. That lived experience is the foundation every design decision sits on. It's also why I refused to treat this as a hackathon project or a proof of concept. The people who need this deserve the same design rigor I'd bring to any enterprise product — more, actually, because the cost of getting it wrong isn't a bad quarterly metric. It's someone walking into a glass door.

The Problem

Indoor navigation for blind users is an unsolved problem not because the technology doesn't exist, but because previous approaches all require something that isn't there. Some apps need pre-built indoor maps that most buildings don't have. Others require Bluetooth beacons or infrastructure investment from building operators. Sighted-guide apps work, but they create social dependency — you need another human available every time you walk into a new building.

The design challenge was harder than the technical one. Any solution that adds cognitive load — constant narration, alarm-like alerts, complicated mode switching — would be abandoned within a week. Blind and low-vision users have developed sophisticated mental models for navigating the world. They don't need an app that tries to replace their existing skills. They need one that fills the specific gaps their cane and spatial memory can't cover: reading signs they can't see, detecting hazards above cane height, and providing turn-by-turn directions in spaces they've never been.

And the spectrum of visual impairment is wide. Someone with tunnel vision and someone with no light perception have fundamentally different needs from the same app. Any solution that treats blindness as binary — you can see or you can't — will fail half its users.

The most important thing a guide can do is stay quiet when everything is fine. Trust builds in the silence.

How I Worked

Before writing a line of Swift, I spent time in blind and visually impaired communities — online forums, accessibility advocacy groups, conversations with people who navigate indoor spaces daily with a white cane or guide dog. I wasn't looking for feature requests. I was trying to understand the mental models: how do experienced blind navigators build spatial awareness? What information do they actually need versus what sighted designers assume they need? Where does confidence break down?

Community-first research

Three insights from the blind and low-vision community shaped everything. First, experienced navigators don't want a running commentary — they want a system that behaves like the best human guide: present, attentive, and quiet until something matters. That became Lookout mode's core principle: silence means safety. Second, trust is directional — users need to trust what the app says, but they also need to trust what it doesn't say. An app that cries wolf destroys confidence faster than one that misses an occasional obstacle. The false-positive tolerance is near zero for this audience. Third, the full spectrum of visual impairment demands that every interaction have visual, audio, and haptic channels — no information conveyed through a single channel alone.

System prompts as design artifacts

In a product where the AI's voice is the primary interface, the system prompts aren't engineering details — they're the most important design artifact in the project. I iterated on them the way I'd iterate on a component library: with explicit rules, anti-patterns, and personality guidelines. The Lookout prompt specifies exactly what triggers speech (steps, head-height obstacles, collision-course people) and what doesn't (walls, furniture not in the path, sounds the user can hear themselves). It explicitly forbids the AI from narrating safe, clear walking. The Snapshot prompt structures descriptions spatially — space type first, then ahead, left, right, landmarks, signage — so blind users build a consistent mental map regardless of the environment.
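To make the "negative space" idea concrete, here is a hypothetical excerpt in the spirit of the Lookout prompt — an illustration of the structure described above (speech triggers, forbidden narration, style rules), not the shipped artifact:

```text
You are a quiet walking guide for a blind pedestrian.

SPEAK ONLY FOR:
- Steps up or down in the walking path
- Obstacles at head height
- People on a collision course
- Floor surface changes

NEVER SPEAK FOR:
- Walls or furniture outside the walking path
- Sounds the user can already hear
- Safe, clear walking — silence means safety

STYLE:
- 3 to 5 words, clock-face directions ("step down, 12 o'clock")
```

The "never speak" section doing more work than the "speak" section is the point: the prompt encodes restraint as a first-class rule.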

Mode architecture through user feedback

Early concepts had five modes. Community feedback compressed them to two modes (Lookout and Go) plus one embedded capability (Snapshot). The reduction made the app learnable in under a minute — critical for users who experience it entirely through audio and haptics. Snapshot lives inside both Lookout and Go as an on-demand capability — tap once or press the iPhone's Action Button to capture the scene and get a spatial description, without switching modes. The mode architecture was a design decision driven by how blind users actually navigate, not a technical decision driven by how the AI works.

FathomUI — designing beyond the visual

FathomUI is the design system I built for this project, extended from SonarUI. The visual layer uses a warm grayscale palette with no pure black or white — because pure extremes cause halation for low-vision users. Contrast ratios exceed WCAG AAA on primary surfaces. But the real design work was extending the system into sonic and haptic languages. Every mode transition has a distinct sound. Every hazard severity level maps to a haptic intensity. Silence is a defined state — no ambient sound when Lookout is active and the path is clear. The app is equally complete whether you're experiencing it through sight, hearing, touch, or any combination.

Agentic building — AI compressed the cycle, not the rigor

I built 23,000+ lines of production Swift in six weeks, working solo, using Claude Code as an agentic coding partner. The workflow: I wrote the PRD, feature specs, and system prompts — the design artifacts that defined what the product needed to be. Then I worked with Claude Code to scaffold the architecture, implement services, and iterate on Swift code in real time. When I hit technical constraints (Gemini's 10-minute session limit, ARFrame memory leaks, VoiceOver conflicting with AVSpeechSynthesizer), we debugged together — I described the user-facing symptom, Claude diagnosed the technical root cause, and we fixed it in the same session. The rigor didn't decrease — it shifted. Instead of producing static deliverables for handoff, I spent that time testing real working software with real people.

Real-device testing and architectural review

Every feature was tested on a physical iPhone 15 Pro — LiDAR step detection calibrated through actual stairwells, VoiceOver compatibility verified across every screen, haptic patterns tuned for one-handed grip with the phone facing forward at chest height. I conducted a full architectural review of the core systems (Gemini integration, ModelRouter, LiDAR depth service, system prompts) to identify state consistency bugs and memory leaks before the TestFlight pilot. The same quality bar I'd hold for any enterprise product — applied to a solo project because the stakes for this audience demand it.

What We Built

Fathom operates through two modes, Lookout and Go, plus the embedded Snapshot capability — all sharing a single design language across visual, sonic, and haptic channels. Each serves a distinct navigation need, but they're designed to feel like one continuous experience.

Lookout — silence as interface

Lookout is the heart of Fathom. The AI continuously analyzes the camera feed at optimized frame rates but speaks only for hazards: steps, head-height obstacles, collision-course people, floor surface changes. Concise alerts in 3–5 words using clock-face directions. Everything else is silence. On-device LiDAR provides a second safety layer — detecting step-downs and elevation changes with depth sensing that works independently of the cloud AI model, using 6-frame temporal filtering to eliminate false positives. Users learn to trust the quiet within minutes. If the app isn't talking, the path ahead is clear.
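The temporal-filtering idea can be sketched in a few lines of Swift. This is a simplified illustration of the approach described above, not the shipped implementation — the type name and API are assumptions:

```swift
// Hypothetical sketch of N-frame temporal filtering for LiDAR hazard
// detection: a depth anomaly must persist across `windowSize` consecutive
// frames before it is reported, suppressing single-frame sensor noise.
struct TemporalHazardFilter {
    let windowSize: Int              // e.g. 6, per the approach described above
    private var consecutiveHits = 0

    /// Feed one frame's detection result. Returns true only once the
    /// hazard has been seen in `windowSize` consecutive frames.
    mutating func update(hazardDetected: Bool) -> Bool {
        consecutiveHits = hazardDetected ? consecutiveHits + 1 : 0
        return consecutiveHits >= windowSize
    }
}
```

A transient depth glitch resets the counter; only a sustained reading — a real step-down — crosses the threshold and reaches the user.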

Snapshot — on-demand spatial context

One tap from any mode triggers Snapshot. The AI captures a photo and delivers a structured spatial description: space type, what's ahead, left and right using clock positions, landmarks, signage, and people. Distances in footsteps for close objects, feet for farther ones. Snapshot includes an adaptive verbosity system — quick, standard, or detailed — with automatic tier upgrades in unfamiliar environments and diff mode for rapid re-triggers that only describes what's changed. It's accessible via the iPhone's Action Button through a custom AppIntent, so users can trigger it without navigating the UI.
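The Action Button integration goes through Apple's App Intents framework, as noted above. A minimal sketch of what such an intent can look like — `CaptureSnapshotIntent` and `SnapshotService` are hypothetical names, not Fathom's actual code:

```swift
import AppIntents

// Hypothetical sketch: exposing Snapshot as an App Intent lets the
// iPhone's Action Button (or Shortcuts) trigger it without touching the UI.
struct CaptureSnapshotIntent: AppIntent {
    static var title: LocalizedStringResource = "Describe Surroundings"

    func perform() async throws -> some IntentResult {
        // SnapshotService is an assumed app-side service, not a real API:
        // capture the current camera frame and speak a spatial description.
        await SnapshotService.shared.captureAndDescribe()
        return .result()
    }
}
```

Because the intent is independent of the app's navigation state, it works from either mode — which is exactly what makes Snapshot an embedded capability rather than a third mode to switch into.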

Go — turn-by-turn indoor wayfinding

Speak a destination and Go guides you there. An OCR pipeline reads room numbers and signs as you pass them, turns are pre-announced 15 feet before junctions, and arrival is confirmed with door-handle position and threshold details. When Gemini is unavailable, a local fallback provides navigation cues every 8 seconds using on-device ML detections and OCR. The arrival flow requires user confirmation before declaring success — because getting arrival wrong would destroy the trust the entire product is built on.

Session bridging — invisible continuity

Gemini Live API has a 10-minute session limit. Fathom rotates sessions at the 9-minute mark, carrying accumulated context and landmark memory into the new session via structured system prompts. The user never knows it happened. Landmarks described in minute 3 are still referenced in minute 15. The modular AI abstraction layer (AIVisionProvider protocol) means the app can swap models through configuration — so as better vision models ship, Fathom improves automatically without code changes.
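The rotation logic reduces to a small amount of state plus a prompt builder. A hedged sketch of the pattern (names and structure are illustrative assumptions, not the actual `AIVisionProvider` internals):

```swift
// Hypothetical sketch of session bridging: rotate before the provider's
// hard limit and seed the next session with accumulated landmark memory.
actor SessionBridge {
    private let rotateAfter: TimeInterval = 9 * 60   // rotate at 9 min, under the 10-min cap
    private var sessionStart = Date()
    private var landmarkMemory: [String] = []        // e.g. "elevator bank at 3 o'clock"

    func noteLandmark(_ description: String) {
        landmarkMemory.append(description)
    }

    func shouldRotate() -> Bool {
        Date().timeIntervalSince(sessionStart) >= rotateAfter
    }

    /// Builds the system prompt for the next session so context
    /// described in minute 3 is still available in minute 15.
    func handoffPrompt(basePrompt: String) -> String {
        sessionStart = Date()
        guard !landmarkMemory.isEmpty else { return basePrompt }
        return basePrompt + "\n\nLandmarks already identified this walk:\n"
            + landmarkMemory.map { "- \($0)" }.joined(separator: "\n")
    }
}
```

The user-facing guarantee — "the user never knows it happened" — falls out of doing the rotation proactively at 9 minutes rather than recovering after a hard cutoff.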

What Changed

Fathom is currently in active testing on physical devices and preparing for a TestFlight pilot with blind and low-vision testers recruited through accessibility community organizations. The pilot will validate core Lookout and Go flows against defined exit criteria: 80%+ task completion without assistance, crash-free rate above 95%, and VoiceOver compatibility confirmed across multiple devices.

The project has proven its core technical thesis: real-time AI vision through a phone camera, combined with on-device depth sensing, can provide useful hazard detection and spatial awareness for blind indoor navigation — with latency under 2 seconds from camera frame to spoken alert.

Beyond the product itself, Fathom demonstrates that agentic design workflows can produce production-quality native software at a pace that was previously impossible for a solo practitioner. The same designer who wrote the PRD, designed the system prompts, built the design system, and conducted the community research also shipped 23,000 lines of Swift — not by cutting corners on design rigor, but by using AI to compress the build cycle while keeping people at the center of every decision.

What I Learned

Lived experience is research, not bias

Having LHON doesn't make me the user — the spectrum of visual impairment is vast, and my experience is one data point. But it gives me a calibration that no amount of secondary research can replicate. I know the difference between an interface that looks accessible and one that actually works when you can't see it clearly. That calibration made every community conversation more productive because I could ask better questions.

Silence is the hardest thing to design

It's counterintuitive — the most important feature of a product for blind users is knowing when not to speak. Every instinct says give them more information. The community taught me that confidence grows in the quiet. Getting the silence right required more iteration than any visual component I've ever designed.

System prompts are the new component library

In an AI-native product, the system prompt is the primary design surface. I iterated on Fathom's prompts the same way I'd iterate on a design system — with explicit rules, anti-patterns, and personality specifications. The Lookout prompt's "what does NOT trigger speech" section is more important than the "what does" section. Defining the negative space of AI behavior is a design skill that barely existed two years ago.

Agentic workflows amplify rigor, they don't replace it

The speed came from AI. The quality came from the same practices I'd use on any product: a thorough PRD, detailed feature specs, architectural reviews, real-device testing, community feedback loops. Claude Code let me ship faster — but the reason the product is good is because I did the design work first. The artifacts that drive good product outcomes became more valuable in an agentic workflow, not less, because they're the structured context the AI needs to produce useful output.

Design for the full spectrum, not the average

Fathom works for someone with no light perception and someone with 20/200 acuity — not by offering separate modes, but by ensuring every interaction has visual, audio, and haptic channels. The low-vision mode scales touch targets to 56pt and increases contrast, but it doesn't change the information architecture. The app is the same app for everyone. That's what accessibility-first means.

Build the thing that lets you swap the thing

The AI abstraction layer was more work upfront but it means Fathom isn't a Gemini app — it's a navigation app that happens to use Gemini today. When a better model ships next month, it's a configuration change. Products built on AI need structural independence from any single provider to survive the pace of change in this space.
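The shape of such an abstraction layer can be sketched as a protocol that the modes depend on instead of any vendor SDK. This is an illustrative guess at what `AIVisionProvider` might look like, not the actual interface:

```swift
import CoreVideo

// Hypothetical sketch of the provider abstraction: Lookout, Snapshot, and
// Go talk to this protocol, so swapping the underlying model (Gemini today,
// something better tomorrow) is a configuration change, not a rewrite.
protocol AIVisionProvider {
    /// Provider-specific hard limit (e.g. Gemini Live's 10-minute sessions),
    /// so session bridging can adapt per model.
    var maxSessionDuration: TimeInterval { get }

    func startSession(systemPrompt: String) async throws

    /// Returns spoken guidance for one frame — or an empty string,
    /// because silence is a defined state.
    func analyze(frame: CVPixelBuffer) async throws -> String

    func endSession() async
}
```

Each concrete provider absorbs its own quirks behind the protocol, which is what keeps the app "a navigation app that happens to use Gemini today."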