Case Study — AI Assistant

A low-latency AI assistant purpose-built for product design workflows.

<500ms average first-token latency
12 domain-specific tool definitions
2,048 recent messages in windowed context
100% SSE streaming with zero buffering

The challenge.

Product designers working inside the concept-board editor needed on-demand, context-aware assistance — resizing a workarea, adjusting object dimensions, applying alignment rules, or transforming images — without leaving the canvas. Every switch to an external AI tool broke their flow and lost spatial context.

Existing chat-based AI tools were stateless per-turn, had no awareness of the canvas state, and were impossible to integrate into the editor's undo/transaction system. Designers needed an assistant that could see what they saw, act on specific canvas objects, and produce results that the standard undo stack could roll back.

The specific problems
  • No existing AI integration could reason about Fabric.js canvas state — object positions, dimensions, z-order, visibility
  • General-purpose LLMs had no awareness of the application's tool set, data models, or domain-specific constraints
  • Streaming responses had to be reconciled with real-time canvas mutations — no partial or broken state allowed
  • Multi-turn conversations required maintaining context across undo, redo, and object mutations that happened outside the chat

What was built.

A streaming, tool-calling AI assistant embedded directly inside the Fabric.js concept-board editor, wired into canvas state and the undo/transaction system.

Tool-Calling Architecture
Designed a modular tool-calling layer with 12 domain-specific tool definitions, each mapped to a Fabric.js handler operation such as resizing the workarea, aligning objects, cropping an image, toggling visibility, changing z-order, grouping or ungrouping, measuring distance, or applying grid snap. Tools accepted structured arguments and returned deterministic results that the front-end could execute via the existing Transaction handler, ensuring every AI-driven mutation was undoable and conflict-safe.
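To make the shape of such a tool concrete, here is a minimal sketch of one definition and a dispatcher. It is an illustration under stated assumptions, not the project's code: the names `ToolDefinition`, `resizeWorkarea`, `registry`, and `dispatchToolCall` are hypothetical, and the real handlers would call into Fabric.js and the Transaction handler rather than return a string.

```typescript
// Hypothetical shapes for illustration; none of these names come from the editor's code.
type ToolResult =
  | { ok: true; summary: string }
  | { ok: false; error: string };

interface ToolDefinition<A> {
  name: string;
  description: string;
  // Validates untrusted model output; returns null when the arguments are unusable.
  parse(raw: unknown): A | null;
  // Describes the mutation. A real handler would run it inside a transaction.
  run(args: A): ToolResult;
}

interface ResizeArgs {
  width: number;
  height: number;
}

const resizeWorkarea: ToolDefinition<ResizeArgs> = {
  name: "resize_workarea",
  description: "Resize the workarea to the given pixel dimensions.",
  parse(raw) {
    if (typeof raw !== "object" || raw === null) return null;
    const { width, height } = raw as Record<string, unknown>;
    if (typeof width !== "number" || typeof height !== "number") return null;
    if (!isFinite(width) || !isFinite(height) || width <= 0 || height <= 0) return null;
    return { width, height };
  },
  run(args) {
    return { ok: true, summary: `workarea resized to ${args.width}x${args.height}` };
  },
};

// Registry keyed by tool name; the other tools would be registered the same way.
const registry: Record<string, ToolDefinition<any>> = {};
registry[resizeWorkarea.name] = resizeWorkarea;

// Looks up the tool, validates the model-supplied arguments, then runs it.
function dispatchToolCall(name: string, rawArgs: unknown): ToolResult {
  const tool = Object.prototype.hasOwnProperty.call(registry, name)
    ? registry[name]
    : undefined;
  if (!tool) return { ok: false, error: `unknown tool: ${name}` };
  const args = tool.parse(rawArgs);
  if (args === null) return { ok: false, error: `invalid arguments for ${name}` };
  return tool.run(args);
}
```

Validating arguments before running the tool is what keeps results deterministic: a malformed call from the model is rejected with an error result instead of reaching the canvas.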
Streaming SSE Pipeline
Built a Laravel SSE endpoint that streamed assistant tokens over Server-Sent Events with zero buffering. The pipeline streamed both text tokens and structured tool-call JSON through the same connection, with the front-end parsing and dispatching tool calls to Fabric.js handlers as they arrived — no waiting for a complete response before seeing canvas changes.
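The single-connection design can be sketched on the client side as an incremental parser that turns raw stream chunks into typed events. This is a sketch under assumptions: the event names `token` and `tool_call`, the JSON payload fields `name` and `arguments`, and the `SseParser` class are all hypothetical, and the stream is assumed to use LF or CRLF line endings.

```typescript
// Hypothetical event model: the event names and payload fields are assumptions.
type AssistantEvent =
  | { kind: "token"; text: string }
  | { kind: "tool_call"; name: string; args: unknown };

// Parses one SSE frame (the lines between two blank lines) into an event.
function parseFrame(frame: string): AssistantEvent | null {
  let eventName = "message";
  const dataLines: string[] = [];
  for (const line of frame.split("\n")) {
    if (line === "" || line.charAt(0) === ":") continue; // comment or keep-alive
    const colon = line.indexOf(":");
    const field = colon === -1 ? line : line.slice(0, colon);
    let value = colon === -1 ? "" : line.slice(colon + 1);
    if (value.charAt(0) === " ") value = value.slice(1); // strip one leading space
    if (field === "event") eventName = value;
    else if (field === "data") dataLines.push(value);
  }
  if (dataLines.length === 0) return null;
  const data = dataLines.join("\n");
  if (eventName === "token") return { kind: "token", text: data };
  if (eventName === "tool_call") {
    try {
      const parsed = JSON.parse(data);
      if (parsed && typeof parsed.name === "string") {
        return { kind: "tool_call", name: parsed.name, args: parsed.arguments };
      }
    } catch {
      return null; // malformed JSON is dropped rather than dispatched
    }
  }
  return null;
}

class SseParser {
  private buffer = "";

  // Feed a raw network chunk; returns every event completed by this chunk.
  push(chunk: string): AssistantEvent[] {
    // Normalise the whole buffer so a CRLF split across chunks is still handled.
    this.buffer = (this.buffer + chunk).replace(/\r\n/g, "\n");
    const events: AssistantEvent[] = [];
    let end = this.buffer.indexOf("\n\n");
    while (end !== -1) {
      const event = parseFrame(this.buffer.slice(0, end));
      this.buffer = this.buffer.slice(end + 2);
      if (event) events.push(event);
      end = this.buffer.indexOf("\n\n");
    }
    return events;
  }
}
```

Because a frame is only emitted once its terminating blank line arrives, a tool call is never dispatched from half-received JSON, which matches the requirement that the canvas never sees partial state.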
Windowed Context Management
Implemented a sliding-window context system that tracked the last 2,048 messages alongside a condensed session summary. Canvas state snapshots were injected as system context on each turn — object count, workarea dimensions, selected object IDs, zoom level — so the model always knew the editor's exact state without embedding the entire canvas JSON into every prompt.
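A per-turn prompt built this way might look like the following sketch. The 2,048-message window and the snapshot fields come from the description above; everything else (`ChatMessage`, `CanvasSnapshot`, `buildPrompt`, and the assumption that the condensed summary is produced elsewhere and passed in as a string) is hypothetical.

```typescript
// Hypothetical types; field names mirror the snapshot contents listed above.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface CanvasSnapshot {
  objectCount: number;
  workarea: { width: number; height: number };
  selectedIds: string[];
  zoom: number;
}

const WINDOW_SIZE = 2048; // window length stated in the case study

// Builds the message list for one turn: summary, canvas snapshot, recent history.
function buildPrompt(
  history: ChatMessage[],
  summary: string,
  snapshot: CanvasSnapshot,
  windowSize: number = WINDOW_SIZE
): ChatMessage[] {
  const size = Math.max(0, windowSize);
  // Keep only the most recent `size` messages; older ones live in the summary.
  const recent = history.slice(Math.max(0, history.length - size));
  const context: ChatMessage[] = [];
  if (summary !== "") {
    context.push({ role: "system", content: `Session summary: ${summary}` });
  }
  // A compact snapshot stands in for the full canvas JSON.
  context.push({ role: "system", content: `Canvas state: ${JSON.stringify(snapshot)}` });
  return context.concat(recent);
}
```

Rebuilding the snapshot on every turn, instead of storing it in the history, means edits made outside the chat (manual moves, undo, redo) are reflected the next time the model is called.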
Undo-Integrated Response Layer
Every AI-generated mutation was recorded as a single compound transaction in the undo stack. If a designer invoked undo after an assistant action, the entire multi-step operation reversed cleanly — no orphaned objects, no misaligned state. The assistant also exposed a "what changed" summary that surfaced in the Transaction handler's UI, so designers could see exactly what the AI had done.
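The compound-transaction idea can be illustrated with a small undo stack that stores each assistant action as one group of steps. This is a sketch only: `Mutation` and `TransactionStack` are hypothetical names, and the editor's actual Transaction handler is not shown in the case study.

```typescript
// Hypothetical model of a reversible step; not the editor's real Transaction API.
interface Mutation {
  description: string;
  apply(): void;
  revert(): void;
}

class TransactionStack {
  private undoStack: Mutation[][] = [];

  // Applies all steps as one unit and records them as a single undo entry.
  // Returns the step descriptions, usable as a "what changed" summary.
  commit(steps: Mutation[]): string[] {
    const applied: Mutation[] = [];
    try {
      for (const step of steps) {
        step.apply();
        applied.push(step);
      }
    } catch (err) {
      // Roll back the steps that did run, newest first, so no partial state remains.
      applied.slice().reverse().forEach((step) => step.revert());
      throw err;
    }
    this.undoStack.push(applied);
    return applied.map((step) => step.description);
  }

  // Reverts the most recent compound entry; returns false if nothing is left to undo.
  undo(): boolean {
    const last = this.undoStack.pop();
    if (!last) return false;
    last.slice().reverse().forEach((step) => step.revert());
    return true;
  }
}
```

Reverting in reverse order matters when later steps depend on earlier ones, for example aligning an object that a previous step in the same action created.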

What shipped.

<500ms average first-token latency, from user query to first visible canvas mutation or text token
100% of AI-generated canvas operations are undoable, with zero orphaned state scenarios
12 domain-specific tool definitions wired into the Transaction handler, every action conflict-safe
2,048 messages in the sliding-window context plus a condensed session summary injected per turn
Single SSE connection handles both text tokens and tool-call JSON, with no dual-stream complexity
Compound transaction recording means a single undo reverses the entire AI multi-step operation
OpenAI · Laravel 10 · React 18 · Redux Toolkit · Fabric.js v7 · Server-Sent Events · PHP · TypeScript

The developer.

Alexander Dudnik
AI & Full-Stack Engineer

10+ years of commercial experience in backend development, system architecture, and building scalable applications. Specialises in PHP, Node.js, React, PostgreSQL, and message queue architectures. Experienced in technical leadership: task decomposition, estimation, database design, and core system architecture across high-load environments.

Need an AI assistant embedded into your product?

Fixed-price sprints. PM included. First sprint free if we miss scope. Start with Sprint Zero at $2,500 — 2-week diagnostic, money-back guaranteed.