Research Intern – Audio-Visual VoiceAI (Open Source)
| Hours | Full-time, Part-time |
|---|---|
| Location | Alameda, California |
About this job
We’re looking for a Research Intern to join WhissleAI and help advance our open-source work at the intersection of speech, vision, and structured understanding, inspired by projects like:
- advanced speech recognition (asr.whissle.ai)
- and recent multi-modal alignment research (example: )
You’ll work on developing audio-visual foundation models that connect voice, context, and environment — enabling systems that can listen, see, and act coherently in real time. Most of this work is open-source and contributes directly to the broader research community.
Ideal candidate
- Undergrad, Master’s, or PhD student in CS, AI, or related field
- Prior research experience (conference/workshop publications a plus)
- Strong background in one or more of: multimodal learning, audio-visual representation learning, speech modeling, or self-supervised methods
- Experience with PyTorch, Hugging Face, or similar frameworks
What you’ll do
- Prototype and evaluate audio-visual alignment models
- Extend our open-source ASR and meta-speech pipelines
- Collaborate on papers, demos, and real-time VoiceAI applications
- Location: Remote
- Type: Paid internship / research collaboration
Posting ID: 1184860281 Posted: 2025-11-21 Job Title: Research Intern