
How I Reverse Engineered Fitbod Using Claude Code

First, I love the product. I've been using Fitbod for two years; last year I was in the top 10% of users for completed workouts, according to the 2025 workout report published in the app. I used to believe that tracking workouts in an app was simply too much effort, but Fitbod's UX changed my mind: logging feels natural rather than like extra work. The team is doing some really impressive work.

Since I use the app so frequently, I wanted to make a few modifications that would only make sense for me: customisations. I've also been using Claude Code heavily lately, so I decided to build an exact replica with those customisations built in.


The Idea

Instead of guessing at the design and features, I wondered: could I use Claude Code to systematically extract everything from the app itself — every screen, asset, string, algorithm, and data model?

The answer was yes. Here’s how it went.


What Claude Code Actually Did

I didn’t write a single line of code myself. I described what I wanted — a depth-first autonomous crawl of every screen, interaction, and asset in the app — and Claude Code designed the approach, wrote the tools, ran them, observed failures, and fixed them — all in a single long session.

The first thing Claude Code built was a Swift CLI tool (FitbodAX) using macOS’s Accessibility API to programmatically drive the app. The first version was simple: traverse the AX element tree, tap buttons, take screenshots. That broke immediately — wrapped iOS apps have a degraded AX tree where most interactive elements show up as AXStaticText, invisible to any button-press logic. Claude Code diagnosed this from the error output, added a position map that records bounding boxes for every element regardless of role, and implemented a CGEvent-based fallback that clicks by coordinates instead of AX actions. That pattern — fail, diagnose, fix — repeated throughout the session.
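The position-map fix is easy to sketch. Here is a minimal Python illustration of the idea (the data shapes and function names are my own, not FitbodAX's actual code): record a bounding-box center for every element regardless of its AX role, so a coordinate-based click can reach elements that AXPress cannot.

```python
def build_position_map(ax_tree):
    """Flatten an AX tree (dicts with 'role', 'label', 'frame', 'children')
    into {label: (role, center_x, center_y)}, keeping EVERY labeled element
    regardless of role -- including the AXStaticText rows that are really
    tappable buttons in wrapped iOS apps."""
    position_map = {}
    stack = [ax_tree]
    while stack:
        node = stack.pop()
        frame = node.get("frame")  # (x, y, width, height)
        label = node.get("label")
        if frame and label:
            x, y, w, h = frame
            position_map[label] = (node.get("role"), x + w / 2, y + h / 2)
        stack.extend(node.get("children", []))
    return position_map

def click_point(position_map, label):
    """Return the coordinates a CGEvent-based fallback tap would target."""
    _role, cx, cy = position_map[label]
    return (cx, cy)
```

In the real tool, the returned coordinates would feed a synthetic mouse event; here they are just computed so the lookup logic is visible.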

Beyond the crawler, Claude Code wrote an asset extractor using Apple’s private CoreUI.framework to pull images from compiled .car files, a Core Data model parser for the database schema, shell scripts to convert binary .strings and .plist files to JSON, and a Cytoscape.js interactive viewer for the navigation graph.

I’d check back after stepping away and see it had discovered 20 new screens since I last looked. That was the whole dynamic — I set direction, Claude Code did the work.


The Feedback Loop

The crawl ran in passes. After each one I’d say:

“Run the crawl and find what we missed.”

Claude Code would run the tool, read the output, and identify gaps — a tab that never got tapped, a screen behind a long scroll that was never reached, a modal that appeared only after a specific interaction sequence. It would fix the algorithm, run again, and report new screens discovered.

After each pass I’d ask:

“What changes would you make to the algorithm to make it more exhaustive?”

This produced four written algorithm revisions, each addressing the concrete failures from the previous run. Screen count across passes: 143 → 166 → 176 → 187 → 195.


Moments That Stood Out

The tap-label command. Standard AXPress was silently failing on segmented controls and tab bar items. Claude Code added a command that finds elements by display label regardless of AX role, then clicks the center of their bounding box using coordinates. That one addition unblocked about 30 screens that were previously unreachable.

Extracting from Assets.car. I asked if image assets could be extracted. The standard tool (assetutil) only gives metadata, no pixels. Claude Code found that CoreUI.framework — a private macOS framework — contains the class the OS itself uses to read asset catalogs. It loaded the framework at runtime, found the right method via the ObjC runtime, and wrote a script that called it, extracting 669 PNG files from 20 .car files across the app bundle.

The “Swap” button. It was visible on the workout screen for 3 full crawl passes and was never tapped. No error, no warning — just a gap. The v4 algorithm revision added an interaction ledger specifically to catch this: after exploring a screen, enumerate every visible interactive element and verify each one has a recorded interaction. The Swap button was the concrete example that motivated it.

The algorithm training data. I didn’t ask Claude Code to look for this — it found it while cataloging the bundle. Inside the app alongside the exercise database were several CSVs containing quantile tables, model outputs, and initial weight recommendation data used to power workout generation. It was fascinating to see how much of the recommendation logic ships as bundled data rather than server-side computation.


What Came Out the Other End

Five hours of Claude Code running nearly autonomously produced:

  • 195 screens documented with full AX trees, screenshots, and UI element inventories.
  • 182 navigation transitions in a queryable graph across 20 user flows.
  • 471 image assets extracted and categorized — muscle group illustrations, equipment photos, icon sets, all 11 alternate app icons.
  • 4,525 localization strings — every UI label, exercise name, and step-by-step instruction.
  • 821 exercises with muscle groups, equipment requirements, categories, and aliases.
  • The full database schema — 19 Core Data entities, 204 properties, 28 migration versions showing the evolution of the data model.
  • 30 font files, audio files, haptic patterns, Lottie animations, and the onboarding video.

Total: 1,598 files, 72MB — the complete blueprint of the app.

To be clear — this was a personal learning exercise. I’m not using Fitbod’s proprietary assets, fonts, or data in anything I ship. The goal was to understand how a product I admire is built, and to see how far Claude Code could go as an autonomous reverse engineering tool. Everything I build from here uses original assets and my own design decisions.


The Crawl Algorithm

If you’re curious about the exact prompt and algorithm I used to drive the crawl, here it is — the original prompt I started with, and the final v4 it evolved into.

The Starting Prompt

Original crawl prompt (v1)

Restart the application before beginning to ensure a clean, known starting state. Bring it to the foreground, take a screenshot, and dump the full accessibility tree of the front window.

Depth-first crawl. On each screen, do three things before moving on:

Catalog elements. Query the accessibility tree for every interactive element — buttons, menus (full menu bar and all submenus), toolbar items, tabs, sidebar items, list and table rows, text fields, toggles, disclosure triangles, pop-up buttons, context menus (right-click major areas), anything with AXPress or AXOpen actions.

Understand the screen. Take a screenshot and analyze it visually. Determine what this screen’s purpose is, what entities or data it displays, what capabilities it exposes (create, edit, filter, sort, search, export, share, etc.), what modes or states it’s in, and what fields, forms, and controls are present with their labels, types, and constraints.

Interact and observe. For every element and input on the screen:

  • Navigation elements (buttons, links, tabs, sidebar items, menu items) — click, wait for UI to settle, screenshot, dump the tree, compare. If it’s a new screen, record the node and edge and immediately recurse into it before returning. If it’s a known screen, record the edge and move on. Going depth-first means you’re always one step from your previous state so backtracking is just closing the modal, hitting back, or switching the tab.

  • Input fields and forms — take a screenshot of the form or screen, visually interpret what each field expects based on labels, placeholders, grouping, and context, then generate a test matrix of sensible permutations: valid data, empty submission, invalid formats, boundary values, special characters, mismatched types, and edge cases specific to what the field represents (past dates for a birthdate field, duplicate names for a create-new form, etc.). Execute each permutation and for every attempt record validation messages, inline errors, fields highlighted, buttons that enable or disable, toasts or banners that appear, and whether submission succeeds or is blocked.

  • Toggles, checkboxes, dropdowns, segmented controls — try every option and state. After each change, screenshot and diff the full screen. Record sections that appear or disappear, fields that enable or disable, options that change in other controls, and any new capabilities or UI regions that are revealed. Track these as conditional edges labeled with the precondition.

  • Multi-step flows (wizards, onboarding, steppers) — play through completely. At each step, screenshot, interpret what choices are available, and explore every branch. If selecting one option at step 2 leads to different fields than another option, explore both. Map the full branching sub-graph.

  • Destructive actions (delete, remove, clear, reset) — execute them. Record the confirmation dialogs, warning messages, and resulting state changes. After the destructive action completes, record what was removed from the screen, what downstream screens were affected (item gone from lists, counts updated, empty states triggered), and what undo or recovery options are available if any. If the destructive action puts the app in a state where you need data to continue exploring (you deleted the only item in a list and now other features are inaccessible), recreate the data before continuing.

After every single interaction, observe and record all of the following:

  • UI changes: elements appearing, disappearing, changing state, moving, or new sections revealed.
  • Visual feedback: toasts, banners, spinners, highlights, shake animations, or color changes.
  • State changes: buttons enabling or disabling, badge counts updating, progress advancing, or modes switching.
  • Feature unlocking: new menu items, toolbar buttons, sidebar entries, or screen sections that weren't available before.
  • Flow changes: redirects, modals opening, new windows spawning, or navigation guards blocking you.
  • Downstream effects: after any significant action, navigate to related screens and check what changed there. Does the new item show in the list, did the count update, did a notification appear, does the setting actually take effect?

Guardrails: Ask me for credentials if you hit a login screen. Cap depth at 30, retry failed interactions twice then skip and note it. Track visited screens by a composite of window title, major UI element set, and modal state to avoid cycles.

Final Algorithm (v4)

View full v4 algorithm

Restart the application before beginning to ensure a clean, known starting state. Bring it to the foreground, take a screenshot, and dump the full accessibility tree of the front window.

Initialize these data structures before the first interaction:

  • Visited set: a map of screen identity hashes to screen IDs.
  • Navigation stack: a stack of {screenId, backElement} tuples for backtracking, where backElement stores {label, role, cx, cy} of the element that will return you.
  • Element position map: rebuilt on every screen, maps every AX element to its bounding box and available actions.
  • Interaction ledger: per-screen tracker of which elements have been interacted with vs. total interactive elements.
  • Deferred queue: screens that were discovered but not fully explored, to be revisited at the end.
  • Baseline snapshot: screenshot + AX tree + element map of the initial screen after restart, used to detect state drift.

State validation on start. Compare the baseline against the expected default state (if known from a previous crawl). If differences are detected (wrong location, changed settings, non-default configuration), flag them before proceeding. Offer to reset the app state or continue with the current state documented.

Screen Identity

A screen is identified by a hash of:

  • Window title.
  • Top 5 unique AX labels in the tree, sorted alphabetically.
  • Whether a sheet, modal, or alert is present (AXSheet in tree, or sheet grabber element).
  • Whether an AXWebArea is present (webview content).
  • The currently selected tab (if a tab bar exists, which AXRadioButton is selected).
  • Scroll region: top, middle, or bottom (based on whether scrolling reveals new elements).
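Concretely, a hash over those six fields might look like this Python sketch (the field choices follow the list above; the exact encoding and truncation are assumptions, not the real tool's implementation):

```python
import hashlib

def screen_identity_hash(window_title, ax_labels, has_sheet, has_webarea,
                         selected_tab, scroll_region):
    # Top 5 unique AX labels, sorted alphabetically (one reading of the
    # rule above; the real input order is an implementation detail).
    top_labels = sorted(set(ax_labels))[:5]
    parts = [
        window_title,
        "|".join(top_labels),
        f"sheet={has_sheet}",
        f"web={has_webarea}",
        f"tab={selected_tab}",
        f"scroll={scroll_region}",
    ]
    return hashlib.sha256("::".join(parts).encode()).hexdigest()[:12]
```

The point of hashing structure rather than pixels is that the same screen reached by two routes, or re-rendered with a spinner, still produces the same identity.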

Hash command. The CLI tool must have a hash subcommand that computes and prints the screen identity hash from the current AX tree state. Run this before every screenshot. Output format:

Hash: abc123 [NEW]
Hash: abc123 [DUPLICATE of workout_06_add_exercise]

Before capturing any screen, compute its identity hash. If the hash is already in the visited set, record the transition edge and any alternate-route metadata, then do not recurse. If it is new, add it to the visited set and proceed with the full per-screen routine.

Alternate route detection. When a hash matches an existing screen but the arrival path is different (different source screen or different element), record the transition edge with "alternate_entry": true. Also compare the back-navigation element — if it differs from the original capture, record both.

Element Discovery

Do not trust AX roles to determine what is interactive. After dumping the AX tree:

Build the element position map. For every element in the tree regardless of role, query AXPosition and AXSize. Store {role, label, centerX, centerY, width, height, actions} where actions comes from AXActionNames.

Classify elements by behavior, not role. Apply these rules using both the AX tree and a visual analysis of the screenshot:

Visual pattern | AX role (often wrong) | Actual behavior
Text row with chevron ">" | AXStaticText inside AXGroup | Tappable navigation link
Circle with checkmark or empty | AXImage ("radio.empty" / "selected") | Radio button
Horizontal text peers ("Results" / "Recovery") | AXStaticText | Segmented control tab
Text with toggle switch beside it | AXCheckBox | Toggle
Row with "…" or overflow icon | AXButton (unlabeled) | Context menu trigger
Sheet grabber bar | AXButton with value "Half screen" / "Full screen" | Draggable sheet handle
Element inside AXWebArea | Various DOM roles | Web content — use webview strategy
Button with numeric value (%, count) | AXButton with value attribute | State-display element, not navigation
Dropdown arrow icon next to text | AXButton labeled "droparrow" | Picker control

Classify each element’s interaction type. Before interacting, predict what will happen:

Classification | Heuristic | Interaction strategy
Navigation element | No value attribute. Label is a noun/destination. Has "Forward" image sibling. | Tap and recurse depth-first.
State element | Has numeric value (percentage, count, "0"/"1"). Is AXCheckBox. | Toggle, record change, restore original. Do not recurse.
Action element | Label contains verbs: Send, Export, Delete, Connect, Sync, Add, Save, Log Out. | Tap once, capture result. Do not recurse.
Picker element | AXPopUpButton, or AXButton labeled "droparrow" adjacent to a field label. | Open, exhaust options, close without changing.
Webview trigger | Labeled "Learn More", "Help", "?", or AXButton with help icon. | Tap, apply webview strategy, close.

Navigation element priority. Within Phase 3, process navigation elements in this order:

  1. Elements with a “Forward” image sibling (highest confidence of new screen).
  2. Elements with a chevron ">" visible in the screenshot.
  3. Remaining elements classified as navigation.
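The classification heuristics above can be sketched as a simple rule chain. This is an illustrative Python version, not the real classifier; in particular, the substring check for action verbs is naive (it would match "Address"), and the actual crawl also uses visual analysis of the screenshot:

```python
ACTION_VERBS = {"send", "export", "delete", "connect", "sync", "add", "save", "log out"}

def classify_element(role, label, value=None, has_forward_sibling=False):
    """Predict the interaction type of an element before tapping it,
    following the heuristic table above (order of checks is my choice)."""
    if role == "AXPopUpButton" or label == "droparrow":
        return "picker"
    if label in {"Learn More", "Help", "?"}:
        return "webview_trigger"
    if has_forward_sibling:            # strong signal of a new screen
        return "navigation"
    if role == "AXCheckBox" or value is not None:
        return "state"                 # toggle and restore, do not recurse
    if any(verb in label.lower() for verb in ACTION_VERBS):
        return "action"                # tap once, capture result
    return "navigation"                # default: treat as a destination
```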

Tapping Strategy

Never guess coordinates as raw percentages. Use this priority order:

  1. AXPress via role+label: Use the press command to find and press an AXButton by its label. Most reliable when the target is unambiguously an AXButton.
  2. Position-map tap with constraints: Use the element position map to get exact center coordinates. When the label is ambiguous, disambiguate with:
    • Role filter: tap-label <label> --role AXRadioButton
    • Region filter: tap-label <label> --region bottom
    • Nearest filter: tap-label <label> --nearest cx,cy
  3. Stored-coordinate tap: Use exact {cx, cy} from a previously computed element map entry. Use for backtracking and webview-adjacent controls.
  4. Visual coordinate tap: Last resort when the element is not in the AX tree.

For tightly packed rows: use the position map’s bounding boxes to verify the target element’s exact Y range before tapping. If the target row is only 40px tall, a 20px miss will hit the wrong row.

Label collisions in webviews. When an AXWebArea is present, never use bare tap-label for UI controls. Always use --role or --region constraints, or stored coordinates from the pre-webview element map.
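The disambiguation filters behave roughly like this Python sketch (the function name and element shape are hypothetical stand-ins for the actual CLI's tap-label command):

```python
def find_target(elements, label, role=None, region=None, nearest=None,
                screen_height=800):
    """Pick one element among same-label candidates, mirroring the
    --role, --region, and --nearest filters described above."""
    candidates = [e for e in elements if e["label"] == label]
    if role:
        candidates = [e for e in candidates if e["role"] == role]
    if region == "bottom":
        candidates = [e for e in candidates if e["cy"] > screen_height * 2 / 3]
    elif region == "top":
        candidates = [e for e in candidates if e["cy"] < screen_height / 3]
    if nearest:
        nx, ny = nearest  # sort by squared distance to the anchor point
        candidates.sort(key=lambda e: (e["cx"] - nx) ** 2 + (e["cy"] - ny) ** 2)
    return candidates[0] if candidates else None
```

For example, a "Done" label that exists both in the navigation bar and inside a webview can be pinned down with a role or region constraint instead of hoping the first match is right.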

Per-Screen Routine

On each screen, execute these four phases in order. Do not move on until all four are complete.

Phase 1: Catalog and Understand

  1. Compute screen identity hash. If hash matches a visited screen, record transition edge and return without recursing.
  2. Take a screenshot.
  3. Dump the full AX tree as JSON.
  4. Build the element position map.
  5. Initialize the interaction ledger. Count every element with AXPress in its actions and a non-empty label. Set interacted to 0.
  6. Detect back-navigation element. Priority order: element labeled “Back” or “Left” with role AXButton; element in top-left quadrant (x < 25%, y < 15%); element whose label matches the previous screen’s title; the tab bar button for the originating tab.
  7. Analyze the screenshot visually: purpose, entities displayed, capabilities exposed, modes/states, forms and controls.
  8. Exhaust scroll content. Scroll down in 80% viewport increments, dump the AX tree and diff after each scroll. Stop when unchanged after 2 consecutive scrolls. Scroll back to top before proceeding.
  9. Detect animations. If animation assets are found or two screenshots taken 1 second apart differ visually, mark the screen as "animated": true.
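Step 8's scroll-exhaustion loop, sketched in Python with the platform calls injected as callables (so the termination logic can be shown without macOS APIs; names are illustrative):

```python
def exhaust_scroll(dump_tree, scroll_down, max_scrolls=50):
    """Scroll until the AX tree stops changing.

    dump_tree() returns the current list of visible AX labels;
    scroll_down() performs one 80%-viewport scroll increment.
    Stops after 2 consecutive dumps reveal nothing new."""
    seen = set(dump_tree())
    unchanged = 0
    for _ in range(max_scrolls):
        scroll_down()
        labels = set(dump_tree())
        if labels - seen:        # new content revealed by this scroll
            seen |= labels
            unchanged = 0
        else:
            unchanged += 1
            if unchanged == 2:   # two unchanged dumps in a row: done
                break
    return seen
```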

Phase 2: Test Controls and Forms (non-navigating)

Work through every control on the current screen that does not navigate away.

  • Text fields — test valid data, empty submission, boundary values, special characters, domain-specific edge cases. Record validation messages, inline errors, toasts, whether submission succeeded.
  • Toggles, checkboxes, radio buttons — toggle each option. Screenshot and diff. Restore the original value before continuing.
  • Segmented controls — tap each segment. Screenshot and diff. Return to original.
  • Dropdown and picker controls — mandatory. Tap to open. Exhaust the picker range (scroll to limits, record first and last values). Close without changing selection. Record in the screen JSON under a pickers field.

Phase 3: Navigate and Recurse (depth-first)

Work through every navigation element in priority order. Before each interaction: record the current screen identity and push the detected back-navigation element onto the navigation stack.

Classify the result after each interaction:

Classification | Detection | Recording
Full navigation | Entire content changed. Back button appeared. | Compute hash. If new: record as node + edge, recurse. If duplicate: record edge only.
Modal / bottom sheet | Sheet grabber present, or parent elements still in tree behind sheet. | Record as node + edge type: sheet. Try expanding to full screen. Recurse.
Dialog / alert | AXSheet with description: "alert", or system-style dialog. | Record buttons. Choose non-destructive option to dismiss.
System dialog | AX tree unchanged but screenshot shows overlay. | Record as type: system_dialog. Dismiss with Escape.
External link | AX tree unchanged. New browser window detected. | Record as type: external_link. Do not recurse.
Inline expansion | Same screen, new elements appeared. | Record as state variant of current screen. Update ledger.
No change | AX tree unchanged, no new windows, no screenshot difference. | Log as type: no_op. Skip. Do not retry.
Web content | AXWebArea appeared in tree. | Switch to webview strategy.

Every new screen gets the full per-screen routine. Taking a screenshot and closing is not valid completion.

Backtracking. After recursing and completing a screen: pop the navigation stack, use the stored backElement coordinates to tap back (do not re-search by label — avoids collisions in webviews), verify return by comparing the current screen’s identity hash.
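The backtracking rule can be sketched as follows, assuming each stack frame also carries the source screen's identity hash (an assumption of mine; the spec above stores {screenId, backElement}). The tap and hash calls are injected stand-ins for the CLI:

```python
def backtrack(nav_stack, tap_at, current_hash):
    """Pop the stored backElement, tap its recorded coordinates
    (never re-search by label), then verify the return by re-hashing."""
    frame = nav_stack.pop()
    tap_at(frame["backElement"]["cx"], frame["backElement"]["cy"])
    landed = current_hash()
    if landed != frame["screenHash"]:
        raise RuntimeError(
            f"backtrack landed on {landed}, expected {frame['screenHash']}")
    return landed
```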

Phase 4: Completeness Audit

After Phases 1–3, before leaving the screen:

  1. Compare the interaction ledger: interacted vs total_interactive.
  2. Compute coverage = interacted / total_interactive * 100.
  3. If coverage is below 90%, list all uninteracted elements.
  4. For each uninteracted element: interact with it now (if it has AXPress), mark as skipped: duplicate, or mark as skipped: non-interactive.
  5. Record the final coverage in the screen JSON.
"coverage": {
  "total_interactive": 15,
  "interacted": 14,
  "coverage_percent": 93.3,
  "skipped": [
    {"label": "delete (x7)", "reason": "duplicate — same behavior as first instance"}
  ],
  "untapped": []
}

Only after the audit passes may the crawl leave this screen.
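The audit itself is a small computation over the interaction ledger. A Python sketch producing the coverage block shown above (the ledger representation is my assumption: a map from element label to a status string):

```python
def audit_screen(ledger):
    """Compute the Phase 4 coverage block from an interaction ledger.
    Statuses: 'interacted', 'untapped', or 'skipped: <reason>'."""
    total = len(ledger)
    interacted = sum(1 for s in ledger.values() if s == "interacted")
    skipped = [{"label": label, "reason": status.split(": ", 1)[1]}
               for label, status in ledger.items()
               if status.startswith("skipped")]
    untapped = [label for label, status in ledger.items()
                if status == "untapped"]
    return {
        "total_interactive": total,
        "interacted": interacted,
        "coverage_percent": round(interacted / total * 100, 1) if total else 100.0,
        "skipped": skipped,
        "untapped": untapped,
    }
```

A non-empty untapped list (the Swap button case) is exactly what this audit exists to surface before the crawl is allowed to leave the screen.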

Webview Content Strategy

When an AXWebArea is detected:

  1. Record the exact {cx, cy} of every control element outside the webview (close button, back button, tab bar). These are the webview exit controls — the only safe way to interact with non-web UI while the webview is active.
  2. Dump the full element map. AXStaticText and AXHeading elements inside the AXWebArea give you the article’s structured text content.
  3. Take sequential screenshots while scrolling in 80% viewport increments.
  4. Record outbound links (AXLink elements) as potential edges. Follow to one level of depth maximum.
  5. To exit: use stored exit control coordinates. Verify AXWebArea is no longer in the AX tree.

State-Changing and Destructive Actions

State-changing actions — record current value, perform action, undo, verify restoration.

Destructive actions — record what exists, execute, record what was removed and what downstream screens changed, recreate data if necessary to continue exploring.

Action elements (Send, Export, Connect, Log Out) — tap, capture result, dismiss with non-destructive option. Record outcome without actually completing the action.

Multi-step flows — play through entirely, exploring every branch. After completing, note any state transitions and continue crawling from the new state.

State drift check at end. After the crawl completes, return to the initial screen and compare against the baseline snapshot.

Deferred Queue Processing

After the main depth-first crawl completes:

  1. For each deferred screen, navigate to it using the stored path.
  2. Run the full per-screen routine (Phases 1–4).
  3. If a deferred screen leads to further new screens, explore those immediately.
  4. Continue until the deferred queue is empty.

Guardrails

  • Ask the user for credentials if you hit a login screen.
  • Cap recursion depth at 30 levels.
  • Retry failed interactions twice, then skip and log the failure.
  • If a single screen’s full routine takes longer than 10 minutes, log a warning and add the screen to the deferred queue.
  • If the app crashes, relaunch, navigate back using the navigation stack, and continue.
  • Save a checkpoint every 20 screens: visited set, navigation stack, interaction ledgers, deferred queue, captured screens.

Coverage Report

After the crawl completes, generate data/coverage_report.json:

{
  "crawl_timestamp": "ISO8601",
  "total_screens": 187,
  "total_transitions": 175,
  "total_interactive_elements": 1250,
  "total_interacted": 1180,
  "overall_coverage_percent": 94.4,
  "screens_by_coverage": {
    "100%": 155,
    "90-99%": 22,
    "80-89%": 7,
    "below_80%": 3
  },
  "low_coverage_screens": [
    {
      "id": "body_01_main",
      "coverage_percent": 62,
      "untapped": [
        {"label": "Body Fat Percentage", "role": "AXButton", "classification": "navigation"}
      ]
    }
  ],
  "deferred_not_completed": [],
  "duplicate_screens_found": [
    {"hash": "abc123", "ids": ["workout_06_add_exercise", "workout_create_from_scratch"]}
  ],
  "alternate_routes": [
    {"screen": "log_11_workout_detail", "routes": ["calendar date tap", "past workouts list tap"]}
  ]
}

The crawl is not complete until: every screen has coverage ≥ 90% (or all untapped elements are explicitly skipped with reason), the deferred queue is empty, and state drift is zero or documented.