👋 Hi, I’m Akash, an applied researcher/engineer with experience in speech, audio (at Microsoft), and most recently multi-modal document understanding and retrieval (at Contextual AI). Turns out this completes the multimodal trio of audio, vision & text. :)

I’m currently on a brief sabbatical, exploring ideas & tinkering as I work out what’s next. Right now that means real-time, on-device neural audio in the context of music and voice.

Work

Contextual AI

Wrangled millions of pages to land the first $ millions in enterprise contracts :)

Microsoft

Fun fact: ~6M hours of monthly traffic works out to roughly 1 year of conversations transcribed every hour!

Misc

  • [2023] 🐥🗣️ Open source contribution to whisper.cpp (38k stars). tinydiarize is a lightweight prototype extending OpenAI’s Whisper model for speaker diarization, runnable on MacBooks/iPhones.
  • [2020] 🐋 Co-founded OrcaHello, a system for 24/7 monitoring of Southern Resident Killer Whales across many underwater hydrophones in the Pacific Northwest. It was awarded a $30,000 AI for Earth Innovation Grant in 2020 and has been operating live for >4 years - listen here.