Resources

Research, datasets & discoveries.

Everything that powers Together — open datasets, model architectures, case studies, and the research that inspired us.

Datasets used

WLASL

Word-Level American Sign Language — 2000+ classes, 21K+ videos, 119 signers. Used as the primary ASL pre-training corpus. Li et al., 2020.

MS-ASL

Microsoft American Sign Language Dataset — 1000 classes, 25K clips. Used for vocabulary expansion and transfer learning. Joze & Koller, 2019.

ArSL-2018

Public Egyptian Arabic Sign Language dataset — 32 classes, 54,049 images, 40 signers. Foundation for ArSL fingerspelling recognition. Latif et al., 2018.

Together Internal

3,000+ proprietary clips across 20 ArSL signs, 12 ASL signers. Collected under an IRB-equivalent protocol with full consent. Available to research partners.

Model parameters

ASL TFLite model (int8): 4.2 MB
ASL architecture: CNN-LSTM
ArSL ONNX model: 2.8 MB
ArSL architecture: CNN-GRU
ASL temporal window: 15 frames
ArSL temporal window: 20 frames
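
To illustrate what a temporal window means in practice, here is a minimal sketch of a fixed-length frame buffer that only hands a sequence to the model once 15 (ASL) or 20 (ArSL) frames have arrived. The window lengths come from the figures above; the `FrameWindow` class, its `push` method, and the one-value feature vectors are hypothetical and are not taken from Together's codebase.

```python
from collections import deque
from typing import Deque, List, Optional

# Window lengths from the model parameters above; all names are illustrative.
ASL_WINDOW = 15
ARSL_WINDOW = 20


class FrameWindow:
    """Fixed-length sliding buffer of per-frame feature vectors."""

    def __init__(self, size: int) -> None:
        self.size = size
        # deque(maxlen=...) silently drops the oldest frame when full.
        self._frames: Deque[List[float]] = deque(maxlen=size)

    def push(self, features: List[float]) -> Optional[List[List[float]]]:
        """Add one frame; return the full window once enough frames exist."""
        self._frames.append(features)
        if len(self._frames) < self.size:
            return None
        return list(self._frames)


# Feed 20 dummy frames through an ASL-sized window.
window = FrameWindow(ASL_WINDOW)
ready = [window.push([float(i)]) for i in range(20)]
first_full = next(i for i, w in enumerate(ready) if w is not None)
```

With this design the first complete window appears on the 15th frame, and every later frame yields a new window shifted by one, which is one common way to get a prediction per frame in a real-time setting.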

Key research papers

MediaPipe Holistic

Lugaresi et al., "MediaPipe: A Framework for Building Perception Pipelines." CVPR Workshop, 2020. The backbone landmark extractor used in Together's real-time pipeline.

Sign Language Recognition via Skeletal Keypoints

Jiang et al., "Skeleton Aware Multi-modal Sign Language Recognition." CVPR Workshops, 2021. Inspired Together's landmark-first approach, which feeds skeletal keypoints to the model instead of raw RGB frames.
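
As a rough illustration of a landmark-first input, the sketch below turns one hand's 21 MediaPipe landmarks into a 63-value feature vector that is independent of where the hand sits in the frame and how large it appears. The landmark count and the wrist and middle-knuckle indices follow MediaPipe's hand model; the normalization scheme and the `hand_to_features` function are assumptions made for this example, not Together's actual preprocessing.

```python
import math
from typing import List, Tuple

Landmark = Tuple[float, float, float]

WRIST = 0        # MediaPipe hand landmark index for the wrist
MIDDLE_MCP = 9   # MediaPipe hand landmark index for the middle-finger knuckle
NUM_HAND_LANDMARKS = 21


def hand_to_features(hand: List[Landmark]) -> List[float]:
    """Center landmarks on the wrist and scale by the wrist-to-knuckle distance."""
    if len(hand) != NUM_HAND_LANDMARKS:
        raise ValueError(f"expected {NUM_HAND_LANDMARKS} landmarks, got {len(hand)}")
    wx, wy, wz = hand[WRIST]
    centered = [(x - wx, y - wy, z - wz) for x, y, z in hand]
    # Fall back to 1.0 so a degenerate hand never causes a division by zero.
    scale = math.dist(centered[MIDDLE_MCP], (0.0, 0.0, 0.0)) or 1.0
    return [value / scale for point in centered for value in point]


# Hypothetical hand: landmarks spread along the x axis in image coordinates.
demo_hand = [(0.5 + 0.01 * i, 0.5, 0.0) for i in range(NUM_HAND_LANDMARKS)]
features = hand_to_features(demo_hand)
```

Because the vector is expressed relative to the wrist, the same handshape produces nearly the same features anywhere in the image, which is part of the appeal of keypoints over raw pixels.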

Efficient On-Device Inference

David et al., "TensorFlow Lite Micro: Embedded Machine Learning on TinyML Systems." MLSys 2021. Basis for the WASM-TFLite deployment strategy.
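
For readers unfamiliar with int8 models such as the ASL TFLite model listed under Model parameters, the sketch below shows the affine mapping that TensorFlow Lite's quantization scheme uses, real_value = (int8_value - zero_point) * scale. Storing one byte per weight instead of four is where most of the size saving over float32 comes from. The `scale` and `zero_point` values here are made up for illustration and are not the parameters of Together's models.

```python
INT8_MIN = -128
INT8_MAX = 127


def quantize(x: float, scale: float, zero_point: int) -> int:
    """Map a real value to int8, clamping to the representable range."""
    q = round(x / scale) + zero_point
    return max(INT8_MIN, min(INT8_MAX, q))


def dequantize(q: int, scale: float, zero_point: int) -> float:
    """Recover the approximate real value from its int8 code."""
    return (q - zero_point) * scale


# Hypothetical quantization parameters for one tensor.
scale = 0.05
zero_point = 0

q = quantize(1.234, scale, zero_point)
approx = dequantize(q, scale, zero_point)
round_trip_error = abs(approx - 1.234)
```

For values inside the representable range, the round-trip error stays within half a quantization step (`scale / 2`); values outside it are clamped to the nearest int8 limit.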

Arabic SLR Survey

Adaloglou et al., "A Comprehensive Study on Deep Learning-Based Methods for Sign Language Recognition." IEEE Transactions on Multimedia, 2022. Informed the CNN-GRU choice for ArSL.

Case studies

Cairo University — Pilot 2025

Deployed in 3 classrooms with Deaf students. ArSL recognition reduced reliance on human interpreters by 40% for structured question-answer sessions.

San Francisco Health Clinic — Pilot 2025

HandScript used at reception for patient intake. 87% of Deaf patients reported feeling more comfortable without waiting for a Video Remote Interpreting session.

Together Open Beta — 2025

1,200 users across 34 countries in the first 60 days. Average session length: 18 minutes. Most-used feature: Sign to Text (61%), followed by Live Meeting (24%).

External research & news

The field of sign language recognition is advancing rapidly. Notable recent developments include large-scale multilingual sign language datasets (OpenASL, How2Sign) and transformer-based architectures (SignBERT, SPOTER). There is also a growing push for standardized sign language datasets across Arabic-speaking countries, coordinated by the Arab League's accessibility working group.

Together contributes to this ecosystem by open-sourcing our internal annotation tooling and publishing benchmark results on standard test splits. We collaborate with the Gallaudet University Technology Access Program and the Arab Federation of the Deaf.

Partner with us on research

We welcome collaborations on dataset collection, model improvement, and field trials.
