Responsibilities
- Use proprietary software to provide labels, annotations, recordings, and inputs on projects involving multilingual audio clips, voice recordings, speech samples, and auditory elements in various languages.
- Support the delivery of high-quality curated audio data that ensures clear, natural spoken output, accurate representation of linguistic and prosodic details (such as intonation, rhythm, and accent), and professional audio standards.
- Collaborate with technical staff to develop tasks that improve AI's ability to handle speech modulation, accent variation, noise in real-world recordings, and multilingual audio processing.
- Work with technical staff to improve annotation tools for efficient audio workflows.
Basic Qualifications
- Native proficiency in Hungarian with exposure to diverse accents, dialects, or regional variations.
- Proficiency in English (minimum B2 level) with clear, natural vocal delivery and pronunciation suitable for audio recording purposes.
- Strong auditory perception to identify nuances in speech, accents, pronunciation, intonation, and audio quality across languages.
- Demonstrated ability to handle multilingual audio content, including evaluating speech accuracy, cultural vocal expressions, and contextual interpretation in spoken form.
- Demonstrated ability to transcribe audio with high accuracy across accents and varying audio quality.
- Comfort providing high-quality voice recordings and feedback on audio samples in multiple languages.
- Strong comprehension skills and the ability to make independent judgments on ambiguous or varied audio material, including noisy or accented speech.
- Strong communication, interpersonal, analytical, and organizational skills, with close attention to detail and the ability to articulate audio-related feedback effectively.
- Commitment to developing AI that masters sophisticated multilingual audio capabilities.
Preferred Skills and Experience
- Demonstrated exceptional attention to linguistic nuance, auditory detail, and data quality beyond standard transcription work.
- Deep understanding of, and good taste for, what makes audio data good and useful.
- Strong command of advanced transcription and annotation practices, including handling disfluencies, accents, and prosodic features (intonation, stress, rhythm, emotion, etc.) with high consistency and accuracy.
- Background in linguistics (e.g., phonetics, phonology, sociolinguistics), speech sciences, cognitive science, or a related field, or equivalent practical experience, with demonstrated ability to analyze accent variation, pronunciation differences, and multilingual speech patterns.
- Experience working with speech/audio datasets, annotation workflows, or AI training data, including knowledge of or experience with training voice models, and an understanding of how data quality impacts model performance.
- Professional experience in voice work, including voice acting, voice recording, podcasting with a measurable audience (e.g., X following), or similar audio production demonstrating attention to clarity and recording quality.
- Demonstrated ability to exercise independent judgment in ambiguous audio scenarios and make consistent, defensible annotation decisions.
- Portfolio (strongly preferred for advanced candidates): Voice samples, annotated transcripts, or audio-related work demonstrating quality, methodology, and attention to detail.
Compensation and Benefits
US-based candidates: $35/hour - $45/hour depending on factors including relevant experience, skills, education, geographic location, and qualifications.
International candidates: Information will be provided to you during the recruitment process.
Benefits vary based on employment type, location, and jurisdiction. Benefits for eligible U.S.-based positions include health insurance, 401(k) plan, and paid sick leave.