Abaka AI
AI
Data Operations Engineer
“Data Operations Engineer at Abaka AI. Skills: dataset library understanding, data quality validation, internal data support. Develop and maintain a comprehensive understanding of Abaka AI’s dataset library, including data structure, quality, and applicable use cases across modalities (text, image, video, audio, 3D). Serve as the internal point of contact for dataset-related inquiries, providing clear and timely responses to questions from engineering, product, and business teams”
What You'll Achieve.
Ensure fast, accurate, and scalable access to data, and play a critical role in improving how datasets are organized, accessed, and utilized across the company.
Industry & Context.
Problem-solving skills
Professional proficiency in Mandarin Chinese and English is required, as this role involves frequent collaboration with China-based vendors and external partners
What They're Looking For.
Must Have
- Bachelor's degree in Computer Science, Data Engineering, or a related field, or equivalent practical experience
- 1–4 years of experience in data operations, data engineering, or a related role involving direct interaction with datasets
- Professional proficiency in Mandarin Chinese and English
- Problem-solving skills
- Proficiency in SQL and/or Python for data inspection, validation, and basic analysis
- Experience working with real-world datasets, including handling data quality issues, inconsistencies, and edge cases
- Communication skills
- High level of ownership and accountability
- Ability to manage multiple requests and priorities simultaneously
Nice to Have
- Experience with multimodal datasets (text, image, video, audio, or 3D)
- Familiarity with data annotation, labeling workflows, or dataset preparation for machine learning
- Experience working with international teams, particularly in cross-border environments
- Exposure to AI/ML workflows, including training, fine-tuning, or evaluation datasets
What You'll Do.
- Develop and maintain a comprehensive understanding of Abaka AI’s dataset library, including data structure, quality, and applicable use cases across modalities (text, image, video, audio, 3D)
- Serve as the internal point of contact for dataset-related inquiries, providing clear and timely responses to questions from engineering, product, and business teams
- Translate ambiguous or high-level requests into concrete dataset solutions, identifying appropriate data sources or gaps
- Inspect and validate datasets for quality and consistency using SQL or other tools as needed
- Coordinate with global data teams, including teams in China, to resolve data issues and ensure timely delivery without unnecessary escalation
- Maintain and improve internal documentation and accessibility of datasets
- Identify inefficiencies in current workflows and propose improvements to systems and processes that support dataset management and usage
- Support cross-functional initiatives by providing dataset insights and operational guidance
How You'll Work.
Team & Collaboration
Work closely with engineering, product, and business teams; coordinate across global teams; support cross-functional initiatives.
Communication Scope
Communication skills; ability to work across technical and non-technical teams.
Process & Methodology
Manage multiple requests and priorities simultaneously.
How to Apply on Greenhouse
- Create a Greenhouse profile before applying — it saves time across multiple applications.
- Upload your resume as a PDF; the parser handles it better than Word.
- Answer all knockout questions carefully — wrong answers auto-reject before a human sees you.
- Enable email notifications to track application status in real time.