A computer-use agent for software that has no API.
Overview
A UK dental group runs its entire operation on EXACT Multi, a Windows desktop application hosted on a remote server with no API, no webhooks, and no integration layer. Pulse works around this by deploying Claude as a computer-use agent that operates the software directly over RDP. When a new appointment lands in Airtable, it hits a FastAPI webhook, is queued in Redis, and gets picked up by one of three parallel agent instances. Each instance runs its own XFCE desktop, connects via xfreerdp, logs into EXACT, navigates to the appointments calendar, and books the slot. Every session is recorded with ffmpeg. First live campaign: 7 of 7 appointments booked (100% success rate) for $14.05 total spend.
How it works
A new row triggers a POST to the FastAPI /webhook endpoint with the booking payload.
FastAPI validates the payload and pushes it to Redis; three agent instances block on the queue with BLPOP.
One Docker instance atomically claims the job via BLPOP and marks status in_progress.
xfreerdp connects to the remote Windows server through the gateway; XFCE desktop loads.
Claude screenshots the screen and decides if EXACT is already logged in, skipping login if so.
Claude navigates the calendar, selects the slot, finds the patient by Alt. Ref., and saves the appointment.
ffmpeg captures the XFCE display to MPEG-TS throughout; remuxed to MP4 and uploaded to Supabase.
Airtable row updated to Booked; Slack message sent with booking details and per-session cost.
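The first steps above (validate, enqueue, claim) can be sketched in plain Python. This is a minimal illustration and not Pulse's actual code: the field names `booking_id`, `alt_ref`, and `slot_start` are assumptions, and a Python list stands in for the Redis list so the example runs without a server.

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical field names; the real Airtable payload is not shown in the source.
REQUIRED_FIELDS = ("booking_id", "alt_ref", "slot_start")

@dataclass
class BookingJob:
    booking_id: str
    alt_ref: str             # patient lookup key ("Alt. Ref." in EXACT)
    slot_start: str          # ISO 8601 start time of the appointment slot
    status: str = "pending"

def validate_payload(payload: dict) -> BookingJob:
    """Reject payloads with missing or empty required fields."""
    missing = [f for f in REQUIRED_FIELDS if not payload.get(f)]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    return BookingJob(*(str(payload[f]) for f in REQUIRED_FIELDS))

def enqueue(queue: list, job: BookingJob) -> None:
    """Append the serialized job to the tail of the queue."""
    queue.append(json.dumps(asdict(job)))

def claim(queue: list) -> BookingJob:
    """Pop the oldest job and mark it in_progress (in-memory stand-in for BLPOP)."""
    job = BookingJob(**json.loads(queue.pop(0)))
    job.status = "in_progress"
    return job
```

In a real deployment the pop would be a Redis BLPOP, which hands each queued job to exactly one waiting client; the list here only mirrors the ordering.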
Results
Session Log
What was built
Claude driving xfreerdp — Claude (claude-sonnet-4-6) runs as a computer-use agent against an XFCE desktop. It takes screenshots, reads the screen state, and emits click, type, and keyboard actions to operate EXACT Multi directly. There is no screen-scraping library; the LLM interprets the UI and decides the next action at each step.
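The screenshot, decide, act cycle described here reduces to a small loop. The sketch below is an assumption about its shape, not the project's code: `take_screenshot`, `decide`, and `perform` are hypothetical callables (in Pulse the decision would come from the LLM), and the action dictionaries are illustrative.

```python
def run_agent(take_screenshot, decide, perform, max_steps: int = 40) -> int:
    """Loop screenshot -> decide -> act until the decision step reports 'done'.

    Returns the number of decisions taken, or raises if max_steps is exhausted.
    """
    for step in range(1, max_steps + 1):
        action = decide(take_screenshot())   # hypothetical: one model call per step
        if action["type"] == "done":
            return step
        perform(action)                      # e.g. click, type, or key press
    raise TimeoutError(f"no 'done' action within {max_steps} steps")
```

A step cap like `max_steps` is one simple way to bound spend if the agent gets stuck; the value 40 is arbitrary.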
3-instance parallel architecture — Docker Compose runs three independent agent+XFCE pairs, each blocking on a shared Redis queue via BLPOP. A crash in one instance is isolated; the others continue processing. Each instance has its own VNC server, noVNC web viewer, and separate RDP credentials for EXACT.
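The queue semantics can be modelled in a single process. In this sketch an `asyncio.Queue` stands in for the Redis list and three tasks stand in for the three containers; the job name `"crash"` is a made-up trigger that kills one worker to show that the other two keep draining the queue.

```python
import asyncio

async def agent(name: str, queue: asyncio.Queue, results: dict) -> None:
    """One agent instance: block on the shared queue and handle one job at a time."""
    while True:
        job = await queue.get()            # each job is delivered to exactly one agent
        try:
            if job == "crash":
                raise RuntimeError(f"{name} crashed")   # simulated instance failure
            await asyncio.sleep(0)         # placeholder for the real booking session
            results[job] = name
        finally:
            queue.task_done()              # mark the job handled even on failure

async def run(jobs: list) -> dict:
    queue: asyncio.Queue = asyncio.Queue()
    results: dict = {}
    for job in jobs:
        queue.put_nowait(job)
    workers = [asyncio.create_task(agent(f"agent-{i}", queue, results)) for i in (1, 2, 3)]
    await queue.join()                     # returns once every job has been handled
    for w in workers:
        w.cancel()
    # Collect worker outcomes, including the crashed worker's exception.
    await asyncio.gather(*workers, return_exceptions=True)
    return results
```

Calling `asyncio.run(run(["a", "crash", "b", "c"]))` books `a`, `b`, and `c` even though one worker dies. The model only covers a single crash; a real deployment would also need to requeue the failed job.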
Preflight barrier pattern — every workflow JSON begins with a preflight sequence: the agent screenshots the screen and decides whether the precondition is already met (e.g. is EXACT already logged in?). If it is, a barrier in the prompt template ends that workflow early, preventing redundant login or double-navigation steps.
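The source places the barrier in the prompt template, where the model makes the call. The same control flow can be expressed in plain Python as a rough illustration; the workflow JSON shape and the `precondition_met` and `execute` callables below are hypothetical.

```python
import json

# Hypothetical workflow shape; the project's real JSON schema is not shown.
WORKFLOW = json.loads("""
{
  "name": "login",
  "preflight": "Is EXACT already logged in?",
  "steps": ["open EXACT", "enter credentials", "click Log in"]
}
""")

def run_workflow(workflow: dict, precondition_met, execute) -> list:
    """Run the steps only when the preflight says the precondition is not yet met."""
    if precondition_met(workflow["preflight"]):
        return []                      # barrier: precondition holds, skip every step
    done = []
    for step in workflow["steps"]:
        execute(step)
        done.append(step)
    return done
```

With a preflight that answers "yes", `run_workflow` returns an empty list and never calls `execute`, which is the behaviour that avoids a second login.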
ffmpeg session recording — x11grab captures the XFCE display (:1) to an MPEG-TS file from the moment the booking starts. An MPEG-TS stream stays playable if ffmpeg is killed mid-recording, whereas an MP4 normally needs a clean shutdown to write its index. On completion the file is remuxed to MP4 (no re-encode), uploaded to Supabase Storage, and a 30-day signed URL is saved to the booking row.
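As a sketch of what the two ffmpeg invocations might look like, the helpers below build argument lists for capture and remux. The resolution, frame rate, and encoder settings are assumptions; only the x11grab input, the `:1` display, the MPEG-TS container, and the copy-only remux come from the description above.

```python
def record_cmd(ts_path: str, display: str = ":1",
               size: str = "1280x720", fps: int = 15) -> list:
    """Arguments to capture an X11 display into an MPEG-TS file."""
    return [
        "ffmpeg", "-y",
        "-f", "x11grab",            # X11 screen-capture input device
        "-video_size", size,        # assumed resolution
        "-framerate", str(fps),     # assumed frame rate
        "-i", display,              # X display to capture
        "-c:v", "libx264", "-preset", "ultrafast",
        "-f", "mpegts", ts_path,    # container that tolerates abrupt termination
    ]

def remux_cmd(ts_path: str, mp4_path: str) -> list:
    """Arguments to rewrap the recording as MP4 without re-encoding."""
    return ["ffmpeg", "-y", "-i", ts_path, "-c", "copy",
            "-movflags", "+faststart", mp4_path]
```

Each list can be passed to `subprocess.Popen` (recording) or `subprocess.run` (remux). Because `-c copy` only rewraps the existing streams, the remux step is fast relative to the recording itself.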
Token-level cost tracking — every LLM response returns a usage block. Pulse separates billable input, cache-write, cache-read, and output tokens, applying Anthropic's exact rate card per category. Costs are summed per booking and exposed in the dashboard. First campaign: $14.05 across 7 bookings, ~$2/booking.
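A per-category cost calculation could look like the sketch below. The four keys match the fields of the usage block in Anthropic API responses; the dollar rates are placeholders for illustration only and should be replaced with the current rate card for the model in use.

```python
from decimal import Decimal

# Placeholder USD rates per million tokens; NOT an authoritative rate card.
RATES_PER_MTOK = {
    "input_tokens": Decimal("3.00"),
    "cache_creation_input_tokens": Decimal("3.75"),   # cache writes
    "cache_read_input_tokens": Decimal("0.30"),       # cache reads
    "output_tokens": Decimal("15.00"),
}
MILLION = Decimal(1_000_000)

def response_cost(usage: dict) -> Decimal:
    """Cost of one response: each token category times its own rate."""
    return sum(
        (Decimal(usage.get(key) or 0) * rate / MILLION
         for key, rate in RATES_PER_MTOK.items()),
        Decimal(0),
    )

def booking_cost(usages: list) -> Decimal:
    """Total cost of a booking: the sum over all responses in the session."""
    return sum((response_cost(u) for u in usages), Decimal(0))
```

`Decimal` avoids float rounding drift when many small per-response amounts are summed. As a check on the campaign figure, $14.05 across 7 bookings works out to about $2.01 each.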
Resilience layer — three retry attempts per booking; a Redis kill flag watched by a background asyncio task, which cancels the running booking mid-run via Task.cancel(); stale in-progress jobs reset to pending on agent startup; and an orphan detector that requeues bookings stuck in pending for over 120 seconds.
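Two of these mechanisms are easy to show in isolation. The following is a minimal sketch, not the project's implementation: `is_killed` is a hypothetical async callable standing in for the Redis flag lookup, and `find_orphans` takes a plain dict of pending timestamps in place of the real job store.

```python
import asyncio

async def watch_kill_flag(is_killed, target: asyncio.Task, interval: float = 0.01) -> None:
    """Poll the kill flag and cancel the booking task once the flag is set."""
    while not target.done():
        if await is_killed():
            target.cancel()            # raises CancelledError inside the booking
            return
        await asyncio.sleep(interval)

async def run_booking(booking, is_killed) -> str:
    """Run a booking coroutine alongside a watcher that can cancel it mid-run."""
    task = asyncio.create_task(booking)
    watcher = asyncio.create_task(watch_kill_flag(is_killed, task))
    try:
        return await task
    except asyncio.CancelledError:
        return "killed"                # sketch: treats any cancellation as a kill
    finally:
        watcher.cancel()

def find_orphans(pending_since: dict, now: float, max_age: float = 120.0) -> list:
    """Return ids of bookings that have sat in pending longer than max_age seconds."""
    return [bid for bid, since in pending_since.items() if now - since > max_age]
```

Cancellation in asyncio is requested on a `Task` object, so the watcher needs a reference to the booking task. The booking itself can catch `asyncio.CancelledError` to clean up, for example by stopping the recording.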
Role
Sole engineer · end-to-end build.
Stack