RobotApp

A real-time ops console + FastAPI runtime for ROS2 mobile manipulators. Open-vocabulary perception, live RGB-D streaming, hot-swappable skills — driven from a browser.


23 production skills · 30+ REST + WS endpoints · 6 device transports · 3 execution modes · ~20 fps depth streaming

What it does

RobotApp is the UI tier of a three-mode robot stack: the UI, the CLI, and the Python API all share one runtime (robot_agent). The dashboard turns any robot_agent-based robot (reference: kcare_robot, an assistive 6-DOF mobile manipulator) into a network-addressable agent that can be driven from a browser on the other side of the world. The dashboard itself is deployed globally on Cloudflare Pages.

Live RGB-D streaming

WebSocket bridges to RealSense D405 and Femto Bolt RGB-D cameras. zlib-compressed uint16 depth is decoded in the browser and rendered with a JET colormap. Pixel-hover gives a live millimetre readout.
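The depth path described above can be sketched as a round trip. This is a Python reference sketch, not the dashboard's browser decoder: it assumes a headerless payload of little-endian uint16 millimetre values in row-major order, with the frame dimensions known out of band, and it uses a common piecewise-linear approximation of the JET colormap. All function names are hypothetical.

```python
import struct
import zlib


def encode_depth(depth_mm, width, height):
    """Pack row-major depth values (mm) as little-endian uint16, then zlib-compress."""
    if len(depth_mm) != width * height:
        raise ValueError("depth buffer does not match width * height")
    return zlib.compress(struct.pack(f"<{width * height}H", *depth_mm))


def decode_depth(payload, width, height):
    """Inverse of encode_depth: returns a flat tuple of millimetre values."""
    raw = zlib.decompress(payload)
    if len(raw) != width * height * 2:
        raise ValueError("payload size does not match the frame dimensions")
    return struct.unpack(f"<{width * height}H", raw)


def jet(t):
    """Piecewise-linear JET approximation for t in [0, 1]; returns (r, g, b) bytes."""
    t = min(max(t, 0.0), 1.0)

    def channel(centre):
        return int(255 * min(max(1.5 - abs(4.0 * t - centre), 0.0), 1.0))

    return channel(3.0), channel(2.0), channel(1.0)


def depth_at(depth, width, x, y):
    """Millimetre value under the cursor, as used for a pixel-hover readout."""
    return depth[y * width + x]


# Usage: a 3 x 2 frame, hovered at column 1, row 1.
frame = [500, 750, 1000, 0, 1250, 2000]
decoded = decode_depth(encode_depth(frame, 3, 2), 3, 2)
hover_mm = depth_at(decoded, 3, 1, 1)
hover_rgb = jet(hover_mm / 2000.0)  # normalised against an assumed 2 m display range
```

Colorising on the client keeps the wire format lossless, so the hover readout can report the exact sensor value rather than one recovered from a colour.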

Open-vocabulary perception

Type "apple" or "the red mug on the table"; the backend routes to a TCP VLM service (GroundingDINO + GroundedSAM + mask2grasps) and lifts detections to 6-DOF grasp poses.
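One geometric ingredient of lifting a 2D detection into 3D is pinhole back-projection of a pixel and its depth into the camera frame. The sketch below shows only that step, under the assumption of an undistorted pinhole model; the intrinsics (fx, fy, cx, cy) are hypothetical parameters, and this is not a description of how mask2grasps computes grasp orientation.

```python
def deproject(u, v, depth_mm, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth in millimetres to camera-frame metres."""
    z = depth_mm / 1000.0          # millimetres to metres
    x = (u - cx) * z / fx          # horizontal offset from the principal point
    y = (v - cy) * z / fy          # vertical offset from the principal point
    return x, y, z


# Usage: a pixel at the principal point, 1 m away, lies on the optical axis.
centre_point = deproject(320, 240, 1000, 600.0, 600.0, 320.0, 240.0)
```

A full 6-DOF grasp pose additionally needs an orientation, which a position from a single pixel cannot provide.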

Streaming task plans

A single WebSocket carries the full lifecycle: start → plan → step_start → step_log → step_done → done. Inline frames render next to the step that produced them.
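A consumer of this stream can fold the events into a step timeline. The sketch below is a minimal illustration: the event names come from the lifecycle above, but the message fields ("type", "steps", "index", "message", "ok", "result") and the class names are assumptions, not the documented wire format.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    name: str
    status: str = "pending"  # pending | running | done | failed
    logs: list = field(default_factory=list)
    result: object = None


@dataclass
class PlanRun:
    steps: list = field(default_factory=list)
    finished: bool = False

    def apply(self, event):
        """Fold one decoded WebSocket message into the timeline."""
        kind = event["type"]
        if kind == "start":
            self.steps, self.finished = [], False
        elif kind == "plan":
            self.steps = [Step(name) for name in event["steps"]]
        elif kind == "step_start":
            self.steps[event["index"]].status = "running"
        elif kind == "step_log":
            self.steps[event["index"]].logs.append(event["message"])
        elif kind == "step_done":
            step = self.steps[event["index"]]
            step.status = "done" if event.get("ok", True) else "failed"
            step.result = event.get("result")
        elif kind == "done":
            self.finished = True
        else:
            raise ValueError(f"unknown event type: {kind}")


# Usage: replay a short, made-up session.
run = PlanRun()
for message in [
    {"type": "start"},
    {"type": "plan", "steps": ["detect", "pick"]},
    {"type": "step_start", "index": 0},
    {"type": "step_log", "index": 0, "message": "found 1 object"},
    {"type": "step_done", "index": 0, "result": {"count": 1}},
    {"type": "step_start", "index": 1},
    {"type": "step_done", "index": 1, "ok": False},
    {"type": "done"},
]:
    run.apply(message)
```

Keeping every event on one ordered channel means the consumer never has to reconcile logs and results that arrive on separate connections.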

Hot-swappable skills

Register new ROS service / topic / action / WebRTC / TCP / LLM clients from the UI. Reload skills with one POST — no robot restart.

Three execution modes

Same skill, three entry points: POST /skill/<name> over REST, kcare_robot pick::apple from the CLI, or from kcare_robot.skills.pick import pick in Python.
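The idea of one skill behind several entry points can be illustrated with a toy registry. This is a sketch under assumptions: the real SkillRegistry API is not shown on this page, so the class below is a stand-in, and the pick signature and return value are invented for illustration.

```python
class SkillRegistry:
    """Toy stand-in: maps skill names to plain callables."""

    def __init__(self):
        self._skills = {}

    def register(self, name):
        def decorator(fn):
            self._skills[name] = fn
            return fn  # the function stays importable and directly callable
        return decorator

    def call(self, name, **params):
        """Name-based dispatch, as a REST or CLI front end would use."""
        if name not in self._skills:
            raise KeyError(f"unknown skill: {name}")
        return self._skills[name](**params)


registry = SkillRegistry()


@registry.register("pick")
def pick(target, speed=1.0):
    # Hypothetical skill body: a real skill would command the robot here.
    return {"skill": "pick", "target": target, "speed": speed}


# Usage: dispatch by name and direct call reach the same function.
by_name = registry.call("pick", target="apple")
direct = pick("apple")
```

Because the decorator returns the function unchanged, a front end that dispatches by name and a script that imports the function run identical code.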

Edge deployment

Static export, served from Cloudflare Pages with a custom domain (robot.aistations.org) — sub-100 ms TTFB anywhere.

A guided tour of the dashboard

The interface is split into functional zones, each described below.

Live RGB / depth camera tabs with FPS counter and capture controls

Multi-camera tabs (head_rgb, head_depth, arm_rgb, arm_depth, log_image) persist their order in localStorage. Drag to reorder. RGB streams as JPEG; depth streams as raw uint16 + zlib, decoded and colorised client-side. Rectangle-draw overlay is wired to detector inputs.

Structured / unstructured agent prompt with language toggle

Two prompt modes — Structured for direct skill calls (moveh::down + key=value params) and Unstructured for natural-language tasks routed through an LLM planner. Language picker (EN / KO / VI). Ctrl+Enter to dispatch.
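A structured prompt of this shape is simple to parse. The sketch below assumes a grammar inferred from the examples on this page (a skill::argument token followed by whitespace-separated key=value pairs); the actual parser may differ, and values are left as strings rather than type-converted.

```python
def parse_structured(prompt):
    """Split 'skill::argument key=value ...' into (skill, argument, params)."""
    tokens = prompt.split()
    if not tokens:
        raise ValueError("empty prompt")
    skill, separator, argument = tokens[0].partition("::")
    if not skill:
        raise ValueError("missing skill name")
    params = {}
    for token in tokens[1:]:
        key, equals, value = token.partition("=")
        if not key or not equals:
            raise ValueError(f"expected key=value, got {token!r}")
        params[key] = value
    return skill, (argument if separator else None), params


# Usage: the parameter names here are made up for illustration.
parsed = parse_structured("moveh::down speed=0.2 wait=true")
```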

Plan + execution panel with step results

The LLM plan and step-by-step execution timeline. Each step shows status (✓ done / ✗ failed / ⟳ running), a JSON result, and any log image emitted mid-skill — all streamed over one WebSocket.

Live list of every registered skill, grouped by category (recognition, pick, place, mobile, arm, head, lift…). CRUD on each skill, JSON-edit per-skill configs, hot-reload via POST /skills/reload.
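The hot-reload call can be issued from any HTTP client. A minimal sketch using only the standard library follows; the host name robot.local is hypothetical, port 8001 is taken from the architecture diagram on this page, and sending an empty request body is an assumption.

```python
import urllib.request


def reload_request(base_url):
    """Build (but do not send) the POST /skills/reload request."""
    url = f"{base_url.rstrip('/')}/skills/reload"
    return urllib.request.Request(url, data=b"", method="POST")


def reload_skills(base_url, timeout=5.0):
    """Send the reload request and return the HTTP status code."""
    with urllib.request.urlopen(reload_request(base_url), timeout=timeout) as response:
        return response.status


# Usage: build the request against a hypothetical robot host.
request = reload_request("http://robot.local:8001/")
```

Calling reload_skills("http://robot.local:8001") would send it; the response body format is not specified here, so the sketch returns only the status code.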

Device connections panel with ping all and ROS scan

Every ROS service / topic / action / WebRTC / TCP / LLM client registered with the robot. Ping All health-checks the bus; Scan ROS discovers nodes via get_ros2_node_names_and_types().
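A "ping all" health check is naturally concurrent: every client is probed at once, and a slow or failing one must not block the rest. The sketch below shows that pattern with asyncio; the probe callables and device names are hypothetical, and it does not reflect how DeviceManager is actually implemented.

```python
import asyncio


async def ping_all(probes, timeout=1.0):
    """Run every probe concurrently; an error or timeout counts as unhealthy."""

    async def guarded(probe):
        try:
            return bool(await asyncio.wait_for(probe(), timeout))
        except Exception:
            return False

    names = list(probes)
    results = await asyncio.gather(*(guarded(probes[name]) for name in names))
    return dict(zip(names, results))


# Usage with stand-in probes: one healthy, one too slow, one raising.
async def healthy():
    return True


async def too_slow():
    await asyncio.sleep(1.0)
    return True


async def broken():
    raise ConnectionError("device unreachable")


status = asyncio.run(
    ping_all({"tcp_vlm": healthy, "nav2": too_slow, "arm": broken}, timeout=0.05)
)
```

With a per-probe timeout, the whole check takes about as long as the timeout rather than the sum of all probe latencies.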

The stack

robotapp · Next.js 14 dashboard

TypeScript, Tailwind, WebSocket clients for camera + plan streaming. Static export → Cloudflare Pages.

robot_agent · FastAPI runtime

~4.2K LOC. SkillRegistry · DeviceManager · UnifiedAgent. 30+ endpoints. Six device transports.

kcare_robot · 6-DOF assistive manipulator

23 production skills. RealSense D405 wrist cam · Femto Bolt head RGB-D camera · Nav2 base · KAAIR cobot arm.

robot_template · cookiecutter scaffold

Bootstrap a dashboard-ready robot package in 30 seconds. Same contract as kcare_robot.

Architecture at a glance

                ┌──────────────────────────────────────┐
                │  robotapp   ·  Next.js 14 + Tailwind │   browser / Cloudflare Pages
                │  WebSocket + REST client (TypeScript)│
                └────────────────┬─────────────────────┘
                                 │  HTTP + WS
                                 ▼
       ┌──────────────────────────────────────────────────┐
       │  robot_agent   ·   FastAPI                       │   robot host · port 8001
       │  · SkillRegistry · DeviceManager · UnifiedAgent  │
       │  · /ws/camera/{id}   /ws/agent   30+ REST        │
       └────────────┬──────────────────────┬──────────────┘
                    │                      │
                    ▼                      ▼
        ROS2 Humble · rclpy · Nav2     Devices · RealSense D405
        joint_states · /navigate_to_   Femto Bolt · KAAIR cobot
        pose · /kaair_worker/*         TCP VLM service (GPU)

Want the full diagram with sequence flow? Read the architecture deep dive →

Demo

Record a screen capture of the dashboard and drop it at docs/public/demo.mp4 (or demo.gif) — it will appear here automatically.

Contact

Trung Bui · Robotics + Vision Engineer

Built RobotApp, robot_agent, kcare_robot, and robot_template end-to-end. Open to roles in robotics, computer vision, and full-stack AI infrastructure.

bmtrungvp@gmail.com · github.com/mtbui2010 · More about this project →