Research, datasets & discoveries.
Everything that powers Together — open datasets, model architectures, case studies, and the research that inspired us.
Datasets used
Word-Level American Sign Language (WLASL): 2,000+ classes, 21K+ videos, 119 signers. Used as the primary ASL pre-training corpus. Li et al., 2020.
Microsoft American Sign Language Dataset (MS-ASL): 1,000 classes, 25K clips. Used for vocabulary expansion and transfer learning. Joze & Koller, 2019.
Public Egyptian Arabic Sign Language dataset: 32 classes, 54,049 images, 40 signers. Foundation for ArSL fingerspelling recognition. Latif et al., 2018.
In-house collection: 3,000+ proprietary clips across 20 ArSL signs, 12 ASL signers. Collected under an IRB-equivalent protocol with full consent. Available to research partners.
Model parameters
Key research papers
Lugaresi et al., "MediaPipe: A Framework for Building Perception Pipelines." CVPR Workshop, 2020. The backbone landmark extractor used in Together's real-time pipeline.
Jiang et al., "Skeleton Aware Multi-modal Sign Language Recognition." CVPR Workshops, 2021. Inspired the landmark-first approach rather than raw RGB input.
David et al., "TensorFlow Lite Micro: Embedded Machine Learning on TinyML Systems." MLSys 2021. Basis for the WASM-TFLite deployment strategy.
Adaloglou et al., "A Comprehensive Study on Deep Learning-Based Methods for Sign Language Recognition." IEEE Transactions on Multimedia, 2022. Informed the CNN-GRU choice for ArSL.
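To make the landmark-first idea concrete, here is a minimal Python sketch of how per-frame hand landmarks might be turned into a fixed-length input for a sequence model such as a CNN-GRU. It is an illustration under stated assumptions, not Together's actual pipeline: the function names, the wrist-centred normalisation, and the 30-frame window are all hypothetical choices. The only borrowed facts are that MediaPipe Hands reports 21 three-dimensional landmarks per hand, with the wrist at index 0 and the middle-finger knuckle at index 9.

```python
import numpy as np

NUM_LANDMARKS = 21  # MediaPipe Hands reports 21 landmarks per hand
WRIST = 0           # wrist index in MediaPipe's hand model
MIDDLE_MCP = 9      # middle-finger knuckle index in the same model


def normalize_hand(landmarks):
    """Return a 63-dim feature vector that ignores hand position and size.

    Assumed normalisation (hypothetical): move the wrist to the origin,
    then divide by the wrist-to-middle-knuckle distance.
    """
    pts = np.asarray(landmarks, dtype=np.float64)
    if pts.shape != (NUM_LANDMARKS, 3):
        raise ValueError(f"expected shape (21, 3), got {pts.shape}")
    centered = pts - pts[WRIST]
    scale = np.linalg.norm(centered[MIDDLE_MCP])
    if scale < 1e-9:
        # Degenerate detection: fall back to an all-zero feature vector.
        return np.zeros(NUM_LANDMARKS * 3, dtype=np.float32)
    return (centered / scale).reshape(-1).astype(np.float32)


def frames_to_sequence(frames, seq_len=30):
    """Zero-pad or truncate a clip to seq_len frames of 63 features each."""
    feats = np.zeros((seq_len, NUM_LANDMARKS * 3), dtype=np.float32)
    for i, frame in enumerate(frames[:seq_len]):
        feats[i] = normalize_hand(frame)
    return feats
```

A clip of any length then becomes a `(seq_len, 63)` array that a recurrent classifier can consume. Working on landmarks rather than raw RGB keeps the input small, which is one reason a landmark-first design suits in-browser inference.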
Case studies
Deployed in 3 classrooms with Deaf students. ArSL recognition reduced reliance on human interpreters by 40% for structured question-answer sessions.
HandScript used at reception for patient intake. 87% of Deaf patients reported feeling more comfortable without waiting for a Video Remote Interpreting session.
1,200 users across 34 countries in the first 60 days. Average session length: 18 minutes. Most-used feature: Sign to Text (61%), followed by Live Meeting (24%).
External research & news
The field of sign language recognition is advancing rapidly. Notable recent developments include large-scale multilingual sign language datasets (OpenASL, How2Sign), transformer-based architectures (SignBERT, SPOTER), and a growing push for standardized sign language datasets across Arabic-speaking countries, coordinated by the Arab League's accessibility working group.
Together contributes to this ecosystem by open-sourcing our internal annotation tooling and publishing benchmark results on standard test splits. We collaborate with the Gallaudet University Technology Access Program and the Arab Federation of the Deaf.
Partner with us on research
We welcome collaborations on dataset collection, model improvement, and field trials.
Partnership Enquiry | Contact Research Team