RobotApp dashboard showing live RGB-D camera feed, task plan execution, and skill registry.

RobotApp

A real-time ops console + FastAPI runtime for ROS2 mobile manipulators. Open-vocabulary perception, live RGB-D streaming, hot-swappable skills — driven from a browser.
23 production skills
30+ REST + WS endpoints
6 device transports
3 execution modes
~20 fps depth streaming

What it does

RobotApp is the UI tier of a three-mode robot stack: the UI, the CLI, and the Python API all share one runtime (robot_agent). The dashboard turns any robot_agent-based robot (reference: kcare_robot, an assistive 6-DOF mobile manipulator) into a network-addressable agent that can be driven from a browser on the other side of the world. The dashboard itself is a static site deployed on Cloudflare Pages.

Live RGB-D streaming

WebSocket bridges to RealSense D405 and Femto Bolt RGB-D cameras. zlib-compressed uint16 depth is decoded in the browser and rendered with a JET colormap. Pixel-hover gives a live millimetre readout.
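The decode path described above can be sketched in Python (the dashboard itself does this in the browser). This is a minimal illustration, not the RobotApp implementation: it assumes the payload is nothing but a zlib stream of row-major little-endian uint16 millimetre values, that a depth of 0 means "no reading", and that the near/far range is 200 to 2000 mm. The names decode_depth, depth_at, jet, and colorize are hypothetical, and jet is a common piecewise-linear approximation of the JET colormap rather than an exact reproduction.

```python
import struct
import sys
import zlib
from array import array
from typing import List, Tuple


def decode_depth(payload: bytes, width: int, height: int) -> array:
    """Inflate a zlib payload into a flat, row-major array of uint16 depths (mm)."""
    raw = zlib.decompress(payload)
    if len(raw) != width * height * 2:
        raise ValueError(f"expected {width * height * 2} bytes, got {len(raw)}")
    depth = array("H")  # unsigned 16-bit integers
    depth.frombytes(raw)
    if sys.byteorder == "big":
        depth.byteswap()  # assumption: the wire format is little-endian
    return depth


def depth_at(depth: array, width: int, x: int, y: int) -> int:
    """Millimetre readout for the pixel under the cursor."""
    return depth[y * width + x]


def jet(t: float) -> Tuple[int, int, int]:
    """Piecewise-linear approximation of JET for t in [0, 1], as (r, g, b)."""
    t = min(max(t, 0.0), 1.0)

    def channel(center: float) -> float:
        return min(max(1.5 - abs(4.0 * t - center), 0.0), 1.0)

    return (int(255 * channel(3.0)), int(255 * channel(2.0)), int(255 * channel(1.0)))


def colorize(depth: array, near_mm: int = 200, far_mm: int = 2000) -> List[Tuple[int, int, int]]:
    """Map each depth to an RGB triple; 0 (assumed invalid) is drawn black."""
    span = far_mm - near_mm
    return [(0, 0, 0) if d == 0 else jet((d - near_mm) / span) for d in depth]


# Usage: build a fake 4x2 frame the way a sender might, then decode it.
values = [0, 250, 500, 750, 1000, 1250, 1500, 2000]
payload = zlib.compress(struct.pack("<8H", *values))
frame = decode_depth(payload, width=4, height=2)
print(depth_at(frame, 4, 1, 1))  # 1250
```

Keeping the depth as raw uint16 until the last step is what makes the hover readout exact: the colormap is applied only for display, so the millimetre value is read from the undistorted array.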

Open-vocabulary perception

Type "apple" or "the red mug on the table"; the backend routes to a TCP VLM service (GroundingDINO + GroundedSAM + mask2grasps) and lifts detections to 6-DOF grasp poses.

Streaming task plans

A single WebSocket carries the full lifecycle: start → plan → step_start → step_log → step_done → done. Inline frames render next to the step that produced them.
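A client consuming this stream can check that events arrive in the stated order. The sketch below is one way to do that, under assumptions the page does not confirm: each message is a JSON object with a "type" field, step events carry a "step" index, log events carry a "message" string, logs may repeat within a step, and steps may repeat within a plan. PlanTracker and TRANSITIONS are hypothetical names.

```python
import json

# Legal next events for each state, following the lifecycle
# start → plan → step_start → step_log → step_done → done.
# Assumption: step_log may repeat, and a plan may contain several steps.
TRANSITIONS = {
    None: {"start"},
    "start": {"plan"},
    "plan": {"step_start"},
    "step_start": {"step_log", "step_done"},
    "step_log": {"step_log", "step_done"},
    "step_done": {"step_start", "done"},
    "done": set(),
}


class PlanTracker:
    """Tracks one task plan and groups log lines by the step that produced them."""

    def __init__(self) -> None:
        self.state = None
        self.logs = {}  # step index -> list of log lines

    def feed(self, raw: str) -> None:
        msg = json.loads(raw)
        kind = msg["type"]
        if kind not in TRANSITIONS[self.state]:
            raise ValueError(f"unexpected {kind!r} after {self.state!r}")
        if kind == "step_start":
            self.logs[msg["step"]] = []
        elif kind == "step_log":
            self.logs[msg["step"]].append(msg["message"])
        self.state = kind


# Usage: replay a plausible message sequence through the tracker.
tracker = PlanTracker()
events = [
    {"type": "start"},
    {"type": "plan", "steps": ["detect", "pick"]},
    {"type": "step_start", "step": 0},
    {"type": "step_log", "step": 0, "message": "found apple"},
    {"type": "step_done", "step": 0},
    {"type": "done"},
]
for event in events:
    tracker.feed(json.dumps(event))
print(tracker.state, tracker.logs)
```

Grouping logs under their step index is what lets a UI render output next to the step that produced it, rather than in one undifferentiated console.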

Hot-swappable skills

Register new ROS service / topic / action / WebRTC / TCP / LLM clients from the UI. Reload skills with one POST — no robot restart.

Three execution modes

Same skill, three entry points: POST /skill/<name> over HTTP, kcare_robot pick::apple from the CLI, or from kcare_robot.skills.pick import pick in Python.
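The idea of one function behind three entry points can be illustrated with a toy registry. This is a stand-in, not robot_agent's SkillRegistry: the decorator, the SKILLS dict, the run_cli and run_http helpers, the request body shape, and the reading of pick::apple as skill::argument are all assumptions made for the example, and the pick body is a placeholder that moves no robot.

```python
from typing import Callable, Dict

# Hypothetical in-process registry: skill name -> plain Python function.
SKILLS: Dict[str, Callable[..., dict]] = {}


def skill(fn: Callable[..., dict]) -> Callable[..., dict]:
    """Register a function under its own name and return it unchanged."""
    SKILLS[fn.__name__] = fn
    return fn


@skill
def pick(target: str) -> dict:
    # Placeholder body; a real skill would call perception and motion here.
    return {"skill": "pick", "target": target, "status": "ok"}


def run_cli(command: str) -> dict:
    """CLI-style entry: 'pick::apple' is read as <skill>::<argument> (assumed)."""
    name, _, arg = command.partition("::")
    return SKILLS[name](arg) if arg else SKILLS[name]()


def run_http(name: str, body: dict) -> dict:
    """HTTP-style entry: what a POST /skill/<name> handler might do with a JSON body."""
    return SKILLS[name](**body)


# All three paths end in the same function call.
print(pick("apple"))
print(run_cli("pick::apple"))
print(run_http("pick", {"target": "apple"}))
```

Because the CLI and HTTP layers only look up and call the registered function, a skill written once as ordinary Python is reachable from every mode without per-mode glue.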

Edge deployment

Static export, served from Cloudflare Pages with a custom domain (robot.aistations.org) — sub-100 ms TTFB anywhere.

A guided tour of the dashboard

Full dashboard screenshot

The interface is split into four functional zones.

Live RGB / depth camera tabs with FPS counter and capture controls

Multi-camera tabs (head_rgb, head_depth, arm_rgb, arm_depth, log_image) can be dragged to reorder, and the order persists in localStorage. RGB streams as JPEG; depth streams as zlib-compressed raw uint16, decoded and colorised client-side. A rectangle-draw overlay is wired to the detector inputs.

The stack

Architecture at a glance

┌──────────────────────────────────────┐
│ robotapp · Next.js 14 + Tailwind     │  browser / Cloudflare Pages
│ WebSocket + REST client (TypeScript) │
└────────────────┬─────────────────────┘
                 │ HTTP + WS
┌──────────────────────────────────────────────────┐
│ robot_agent · FastAPI                            │  robot host · port 8001
│ · SkillRegistry · DeviceManager · UnifiedAgent   │
│ · /ws/camera/{id}  /ws/agent  30+ REST           │
└────────────┬──────────────────────┬──────────────┘
             │                      │
             ▼                      ▼
ROS2 Humble · rclpy · Nav2          Devices · RealSense D405
joint_states · /navigate_to_pose    Femto Bolt · KAAIR cobot
/kaair_worker/*                     TCP VLM service (GPU)

Want the full diagram with sequence flow? Read the architecture deep dive →

Demo

Record a screen capture of the dashboard and drop it at docs/public/demo.mp4 (or demo.gif) — it will appear here automatically.

Contact

Trung Bui · Robotics + Vision Engineer

Built RobotApp, robot_agent, kcare_robot, and robot_template end-to-end. Open to roles in robotics, computer vision, and full-stack AI infrastructure.

bmtrungvp@gmail.com · github.com/mtbui2010 · More about this project →