Amazon.com Services LLC
Technology
Data Scientist II
Neural analysis suggests this role is optimal for mid-level candidates.
“Data Scientist II at Amazon.com Services LLC. Skills: Data Science, Machine Learning, Generative AI, Data Evaluation. Design and develop evaluation datasets.”
What You'll Achieve.
Ensure Generative AI is enterprise-ready, safe, and effective
Industry & Context.
Data-driven decisions; Reasoning; Problem-solving
What They're Looking For.
Must Have
2+ years data scientist experience, 3+ years SQL experience, 3+ years Python experience, 3+ years R experience, 3+ years SAS experience, 3+ years Matlab experience, 3+ years machine learning modeling experience, 1+ years evaluating AI systems, 1+ years creating educational content, Master's degree in STEM, Experience applying theoretical models
Nice to Have
Ph.D. in STEM, Knowledge of ML concepts, Experience in ML role, Experience defining GenAI benchmarks, Experience on cross-disciplinary projects, Quantitative analysis for business problems, Data-driven business decisions
What You'll Do.
Design evaluation datasets
Develop evaluation datasets
Design benchmarking datasets
Develop benchmarking datasets
Leverage LLMs for data evaluation
Assess synthetic data quality
Create ground truth datasets
Create question-answer pairs
Lead human annotation initiatives
Lead model evaluation audits
Develop annotation guidelines
Refine annotation guidelines
Develop quality frameworks
Refine quality frameworks
Conduct statistical analysis
Measure model performance
Identify failure patterns
Guide improvement strategies
Translate evaluation insights
Build scalable data pipelines
Build tools for evaluation
Build tools for benchmarking
Contribute to Responsible AI
Develop safety evaluation datasets
Develop fairness evaluation datasets
How You'll Work.
Team & Collaboration
Collaborate with science, engineering, and product teams; Collaborate with ML scientists and ML engineers
Communication Scope
Communicate complex concepts
Process & Methodology
Cross-disciplinary projects
Full Job Description
Amazon Quick Suite is an enterprise AI platform that transforms how organizations work with their data and knowledge. Combining generative AI-powered search, deep research capabilities, intelligent agents and automations, and comprehensive business intelligence, Quick Suite serves tens of thousands of users. Our platform processes thousands of queries monthly, helping teams make faster, data-driven decisions while maintaining enterprise-grade security and governance. From natural language interactions with complex datasets to automated workflows and custom AI agents, Quick Suite is redefining workplace productivity at unprecedented scale.
We are seeking a Data Scientist II to join our Quick Data team, focusing on evaluation and benchmarking data development for Quick Suite features. Our mission is to engineer high-quality datasets that are essential to the success of Amazon Quick Suite. From human evaluations and Responsible AI safeguards to Retrieval-Augmented Generation and beyond, our work ensures that Generative AI is enterprise-ready, safe, and effective for users at scale. As part of our diverse team (including data scientists, engineers, language engineers, linguists, and program managers), you will collaborate closely with science, engineering, and product teams. We are driven by customer obsession and a commitment to excellence.
Key job responsibilities
In this role, you will leverage data-centric AI principles to assess the impact of data on model performance and the broader machine learning pipeline. You will apply Generative AI techniques to evaluate how well our data represents human language and conduct experiments to measure downstream interactions. Specific responsibilities include:
* Design and develop comprehensive evaluation and benchmarking datasets for Quick Suite AI-powered features
* Leverage LLMs for synthetic data corpora generation; data evaluation and quality assessment using LLM-as-a-judge settings
* Create ground truth datasets with high-qu