Your hands
become words.
HandScript converts live American Sign Language into text in real time — powered by a 250-class TFLite model running entirely in your browser.
Performance at a glance
How it works
When you open Together and allow camera access, MediaPipe Holistic maps 21 landmarks per hand, 33 body keypoints, and 468 facial landmarks, all locally and in real time. These landmarks feed a TFLite model that classifies the current sign into one of 250 ASL vocabulary items. Predictions stream to the text area once a rolling window of confidence scores settles above the threshold.
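The thresholding step can be illustrated with a short sketch. This is not HandScript's actual code: the function name `createConfidenceWindow`, the default window size of 15 frames, and the 0.8 threshold are all assumptions chosen for illustration. It emits a class index only once that class's mean probability across the window reaches the threshold, and it does not repeat the same sign until the confidence has dropped again.

```typescript
// Hypothetical sketch: the names, window size, and threshold are assumptions,
// not values taken from HandScript.
function createConfidenceWindow(numClasses: number, size = 15, threshold = 0.8) {
  const frames: number[][] = [];
  let lastEmitted = -1; // class index emitted most recently, -1 if none

  return {
    // Push one frame of class probabilities. Returns a class index when a
    // new sign settles above the threshold, otherwise null.
    push(probs: readonly number[]): number | null {
      if (probs.length !== numClasses) {
        throw new Error(`expected ${numClasses} probabilities, got ${probs.length}`);
      }
      frames.push([...probs]);
      if (frames.length > size) frames.shift();
      if (frames.length < size) return null; // window not yet full

      // Find the class with the highest mean probability over the window.
      let best = -1;
      let bestMean = 0;
      for (let c = 0; c < numClasses; c++) {
        let sum = 0;
        for (const f of frames) sum += f[c];
        const mean = sum / size;
        if (mean > bestMean) {
          bestMean = mean;
          best = c;
        }
      }

      if (bestMean < threshold) {
        lastEmitted = -1; // confidence dropped, so the same sign may repeat later
        return null;
      }
      if (best === lastEmitted) return null; // already emitted this sign
      lastEmitted = best;
      return best;
    },
  };
}
```

Averaging over a window trades a little latency for stability: a single noisy frame cannot trigger a word, and a held sign is emitted once rather than on every frame.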
The model was trained on a large multi-signer dataset augmented with variations in lighting, hand angle, and signing speed. After training, it was quantized with int8 dynamic-range quantization for WebAssembly deployment without accuracy loss.
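To make the quantization step concrete, here is a minimal sketch of symmetric int8 weight quantization, the general idea behind dynamic-range quantization. It is an illustration only, not the converter's implementation: the function names `quantizeInt8` and `dequantize` are hypothetical, and a single per-tensor scale is assumed. Each weight is stored as an integer in [-127, 127] plus one shared floating-point scale, so the round-trip error per weight is at most half the scale.

```typescript
// Hypothetical sketch of symmetric per-tensor int8 quantization.
// Function names and the per-tensor scale are assumptions for illustration.
function quantizeInt8(weights: readonly number[]): { q: Int8Array; scale: number } {
  // The largest magnitude maps to 127 so the range is symmetric around zero.
  let maxAbs = 0;
  for (const w of weights) maxAbs = Math.max(maxAbs, Math.abs(w));
  const scale = maxAbs === 0 ? 1 : maxAbs / 127;

  const q = new Int8Array(weights.length);
  weights.forEach((w, i) => {
    // Round to the nearest integer step and clamp to the int8 range used.
    q[i] = Math.max(-127, Math.min(127, Math.round(w / scale)));
  });
  return { q, scale };
}

// Recover approximate float weights from the int8 values and the scale.
function dequantize(q: Int8Array, scale: number): number[] {
  return Array.from(q, (v) => v * scale);
}

// Usage: quantize four example weights and inspect the round trip.
const example = quantizeInt8([-2, 0, 0.5, 2]);
const restored = dequantize(example.q, example.scale);
console.log(Array.from(example.q), restored);
```

Storing one byte per weight instead of four is where the roughly 4x reduction in model size comes from; whether accuracy is preserved depends on the model and has to be measured on held-out data.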
Use cases
Deaf professionals sign while colleagues read live captions. No interpreter scheduling required for impromptu conversations.
Students practice vocabulary with instant feedback. Teachers verify correctness without pausing class flow.
Patients communicate with intake staff. HandScript transcripts attach directly to consultation notes.
Signers produce auto-captioned social media content without post-production transcription tools.
Technical architecture
HandScript runs a three-stage pipeline: landmark extraction via MediaPipe Holistic (WebGL backend), temporal smoothing with a 15-frame rolling window, and TFLite inference using the WebAssembly runtime. The model is a compact CNN-LSTM hybrid: a MobileNetV3-Small backbone extracts spatial features per frame, and a 2-layer LSTM models temporal dynamics across the window.
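The hand-off between landmark extraction and inference can be sketched as a fixed-size frame buffer. This is a minimal sketch under stated assumptions, not HandScript's code: the name `createFrameWindow`, the use of three coordinates (x, y, z) per landmark, and the flattened `[window, features]` layout are all hypothetical. With two hands, the landmark counts above add up to 2 × 21 + 33 + 468 = 543 landmarks per frame.

```typescript
// Landmark counts from the text; COORDS and the tensor layout are assumptions.
const HAND = 21;
const POSE = 33;
const FACE = 468;
const LANDMARKS_PER_FRAME = 2 * HAND + POSE + FACE; // 543 with both hands
const COORDS = 3; // assumed (x, y, z) per landmark
const FEATURES = LANDMARKS_PER_FRAME * COORDS; // 1629 values per frame
const WINDOW = 15; // rolling window length from the text

// Hypothetical ring buffer that keeps the most recent `windowSize` frames.
function createFrameWindow(windowSize = WINDOW, features = FEATURES) {
  const buffer = new Float32Array(windowSize * features);
  let count = 0; // total frames pushed so far

  return {
    // Store one flattened frame, overwriting the oldest once the buffer is full.
    push(frame: Float32Array): void {
      if (frame.length !== features) {
        throw new Error(`expected ${features} values, got ${frame.length}`);
      }
      buffer.set(frame, (count % windowSize) * features);
      count++;
    },

    isFull(): boolean {
      return count >= windowSize;
    },

    // Copy the frames out in oldest-to-newest order as a flat
    // [windowSize, features] array ready to hand to a model.
    toTensor(): Float32Array {
      if (count < windowSize) throw new Error("window is not full yet");
      const out = new Float32Array(windowSize * features);
      const oldest = count % windowSize; // slot that would be overwritten next
      for (let i = 0; i < windowSize; i++) {
        const src = (oldest + i) % windowSize;
        out.set(buffer.subarray(src * features, (src + 1) * features), i * features);
      }
      return out;
    },
  };
}
```

A ring buffer avoids reallocating on every frame, which matters when a new frame arrives every 33 ms or so at 30 fps.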
The entire model weighs 4.2 MB (int8 quantized) and loads in under 800 ms on a mid-range laptop. No GPU is required: CPU inference is fast enough for 30 fps input streams.
Try HandScript now
Open the English dashboard and select Sign to Text. No account required for a demo.