Webcam + voice triggers, OCR pipeline, and text-to-speech
Designed and built a mannequin-style robot that captures images from a webcam when it hears a voice-command trigger, extracts the text in each image through an API-based OCR pipeline, and reads the extracted text aloud as human-like speech.
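The capture, OCR, and speech flow described above can be sketched roughly as follows. This is a minimal illustration and not the robot's actual code: the trigger phrases, the `ReaderPipeline` class, and the three injected callables (`capture_frame`, `run_ocr`, `speak`) are all hypothetical stand-ins for whatever webcam library, OCR API, and text-to-speech service the project really used.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

# Hypothetical trigger phrases; the project's real voice commands are not documented here.
TRIGGER_PHRASES = ("read this", "what does it say")


def is_trigger(transcript: str, phrases: Sequence[str] = TRIGGER_PHRASES) -> bool:
    """Return True if the spoken transcript contains one of the trigger phrases."""
    # Lowercase and collapse runs of whitespace so matching is forgiving.
    normalized = " ".join(transcript.lower().split())
    return any(phrase in normalized for phrase in phrases)


@dataclass
class ReaderPipeline:
    """Wires the three stages together; each stage is an injected callable (assumed)."""

    capture_frame: Callable[[], bytes]   # grabs one encoded image from the webcam
    run_ocr: Callable[[bytes], str]      # sends the image to an OCR API, returns text
    speak: Callable[[str], None]         # turns text into audible speech

    def handle_transcript(self, transcript: str) -> Optional[str]:
        """Run capture -> OCR -> speech if the transcript is a trigger.

        Returns the extracted text, or None when no trigger phrase was heard.
        """
        if not is_trigger(transcript):
            return None
        image = self.capture_frame()
        text = self.run_ocr(image).strip()
        if text:  # stay silent when the OCR step found nothing to read
            self.speak(text)
        return text
```

Passing the stages in as callables keeps the sketch independent of any particular vendor: a stub such as `lambda image: "Hello world"` can stand in for the OCR call during testing, and a real API client can be swapped in later without changing the pipeline logic.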
Additional resources
Additional demo clips are attached on LinkedIn (Prepped-up Demo / Prepped-up Demo 2).