Nile Bits

Computer Software

Senior Applied ML Engineer (Speech & Audio)

Cairo, Cairo Governorate, Egypt · Full Time · Remote Friendly

Market Sentiment

HIGH DEMAND

Neural analysis suggests this role is optimal for mid-level candidates.

The Brief

“Senior Applied ML Engineer (Speech & Audio) at Nile Bits. Skills: Speech & Audio ML, Arabic Speech Technologies, Text-to-Speech (TTS), Automatic Speech Recognition (ASR), PyTorch, Hugging Face. You will design, fine-tune, and optimize advanced machine learning models for Arabic voice applications, working across the full development lifecycle, from data pipeline construction and model experimentation to inference optimization and production deployment.”

What You'll Achieve.

* Scalable, low-latency systems that support natural and accurate Arabic speech interactions
* Production-ready solutions
* Robust and scalable speech systems

Industry & Context.

Computer Software

What They're Looking For.

Must Have

* 5+ years of experience in Machine Learning, Applied AI, or AI Research
* Programming skills in Python
* Extensive hands-on experience with PyTorch and the Hugging Face ecosystem
* Proven experience training and fine-tuning neural models for:
  * Text-to-Speech (TTS)
  * Automatic Speech Recognition (ASR)
  * Audio codecs
* Deep understanding of modern speech architectures such as:
  * Whisper
  * Conformer
  * HiFi-GAN
  * Diffusion-based models
* Experience with audio processing techniques including:
  * Voice Activity Detection (VAD)
  * Speaker Diarization
  * Neural Vocoders
* Demonstrated ability to implement and adapt research papers into practical production experiments
* Understanding of Arabic language challenges, including:
  * Diacritization (Tashkil)
  * Dialectal variations
  * Code-switching
* Experience with inference optimization techniques such as:
  * Quantization
  * Streaming inference

Nice to Have

* Experience developing custom NVIDIA CUDA kernels for high-performance model inference
* Familiarity with speculative decoding and other advanced acceleration techniques
* Experience deploying models at scale in cloud or GPU-based production environments
* Contributions to open-source speech or machine learning projects

What You'll Do.

* Design, fine-tune, and optimize advanced machine learning models for Arabic voice applications
* Work across the full development lifecycle, from data pipeline construction and model experimentation to inference optimization and production deployment
* Benchmark and evaluate TTS and ASR models using Arabic-specific test sets
* Fine-tune generative models for voice cloning, zero-shot speaker adaptation, and speech synthesis
* Build and maintain Arabic-focused data pipelines, including:
  * Audio collection and preprocessing
  * Diacritization (Tashkil)
  * Data cleaning and augmentation
* Optimize model inference for production environments
* Integrate and evaluate complete speech-to-speech conversational pipelines
* Conduct experiments based on recent research papers and convert findings into production-ready solutions
* Collaborate with engineering and product teams to deploy robust and scalable speech systems

How You'll Work.

Team & Collaboration

Collaborate with engineering and product teams

Full Job Description

Project Overview

Join a cutting-edge initiative focused on building advanced AI voice infrastructure for Arabic-speaking markets. The project involves developing state-of-the-art Arabic speech technologies, including:

* Natural Text-to-Speech (TTS)
* Real-Time Automatic Speech Recognition (ASR)
* End-to-End Speech-to-Speech Conversational Systems

The solutions are tailored to regional Arabic dialects, including Egyptian, Gulf, Levantine, and others.

Job Description

We are seeking a highly skilled Senior Applied Machine Learning Engineer with deep expertise in speech and audio technologies. In this role, you will design, fine-tune, and optimize advanced machine learning models for Arabic voice applications. You will work across the full development lifecycle, from data pipeline construction and model experimentation to inference optimization and production deployment.

This position is ideal for engineers who are passionate about transforming cutting-edge research into scalable, low-latency systems that support natural and accurate Arabic speech interactions.

Key Responsibilities

* Benchmark and evaluate TTS and ASR models using Arabic-specific test sets, measuring metrics such as Word Error Rate (WER), naturalness, and dialect coverage.
* Fine-tune generative models for voice cloning, zero-shot speaker adaptation, and speech synthesis.
* Build and maintain Arabic-focused data pipelines, including:
  * Audio collection and preprocessing
  * Diacritization (Tashkil)
  * Data cleaning and augmentation
* Optimize model inference for production environments using:
  * Quantization
  * KV-cache tuning
  * Streaming inference techniques
* Integrate and evaluate complete speech-to-speech conversational pipelines.
* Conduct experiments based on recent research papers and convert findings into production-ready solutions.
* Collaborate with engineering and product teams to deploy robust and scalable speech systems.

## Qualifications

Required Qualifications

* 5+ years of experience in Machi


SKILL SIGNAL 45 detected · ranked by frequency

Text-to-Speech (TTS) ×5

Automatic Speech Recognition (ASR) ×5

PyTorch ×4

Arabic Speech Technologies ×3

Machine Learning ×3

Applied AI ×3

AI Research ×3

Audio codecs ×3

Speech-to-Speech Conversational Systems ×3

Audio processing techniques ×3

model inference optimization ×3

generative models for voice cloning ×3

zero-shot speaker adaptation ×3

speech synthesis ×3

data pipeline construction ×3

model experimentation ×3

inference optimization ×3

production deployment ×3

speculative decoding ×3

advanced acceleration techniques ×3

Speech & Audio ML ×2

Hugging Face ×2

Hugging Face ecosystem ×2

NVIDIA TensorRT ×2

Python

Whisper

Conformer

HiFi-GAN

Diffusion-based models

Voice Activity Detection (VAD)

Speaker Diarization

Neural Vocoders

BEHAVIOURAL

passionate about transforming cutting-edge research into scalable, low-latency systems

collaboration with engineering and product teams

values innovation and efficiency

Feedback encouragement

Fun, smart and creative people

Role Details

Experience 5–10 yrs

Level mid

Type FULL TIME

Category information-technology

AI-Extracted Insights

Domain Areas

arabic-language-challenges

diacritization-tashkil

dialectal-variations

code-switching

arabic-speech-technologies

regional-arabic-dialects

How to Apply on SmartRecruiters

SmartRecruiters often includes a video screening step — check camera and mic permissions.
Link your GitHub or portfolio directly in the profile section for technical roles.
Applications may be reviewed by AI scoring before reaching a recruiter — use keywords from the job description.
