Weekday AI

Scientific AI Evaluation & Computational Problem Designer

Mexico; India PART TIME Remote Friendly
Market Sentiment
HIGH DEMAND

Neural analysis suggests this role is optimal for Mid+ candidates.

The Brief

“Scientific AI Evaluation & Computational Problem Designer at Weekday AI. Skills: Scientific AI Evaluation, Computational Problem Design, Domain-specific scientific software expertise, Python programming, Reasoning complexity assessment. Design advanced computational problems requiring the use of domain-specific scientific software. Create tasks that test both precise execution (multi-step workflows, simulations) and strategic reasoning (experiment design, inference from partial data)”

What You'll Achieve.

Building a large-scale evaluation benchmark to test advanced AI reasoning across scientific and engineering domains; Ensuring the right balance of difficulty, depth, and reasoning complexity in problems

Industry & Context.

Problems you'll solve

Designing rigorous, research-grade computational problems; Assessing AI reasoning effectiveness; Leveraging scientific software tools for complex challenges; Developing problem setups, solution pathways, and validation mechanisms; Calibrating and refining tasks based on model performance; Ensuring problems emphasize reasoning strategy over brute-force computation

Eligibility Requirements

Work must not involve sharing confidential or proprietary information from any current or past employer or institution; This opportunity does not currently support certain work authorization categories

What They're Looking For.

Must Have

Graduate-level expertise (MS or PhD) in a relevant STEM field; Hands-on experience using scientific software libraries for real research problems; Python programming skills, including building computational workflows and validators; Ability to design challenging problems that require deep reasoning rather than surface-level solutions; Familiarity with edge cases, limitations, and practical challenges of scientific tools; Demonstrated proficiency with at least one relevant scientific library (via research, open-source work, or industry experience); Ability to work independently and iterate based on feedback; Comfort working in Linux/terminal environments and remote compute setups; Availability of at least 15–20 hours per week

Nice to Have

MS or PhD preferred; Experience across multiple domains or tools; Background in evaluation frameworks or benchmarking; Experience in teaching, pedagogy, or problem-set design; Familiarity with reproducible research practices and containerized environments

What You'll Do.

Design advanced computational problems requiring the use of domain-specific scientific software

Create tasks that test both precise execution (multi-step workflows, simulations) and strategic reasoning (experiment design, inference from partial data)

Develop problem setups, solution pathways, and validation mechanisms

Calibrate and refine tasks based on model performance to achieve target difficulty levels

Ensure problems emphasize reasoning strategy over brute-force computation

Iteratively refine problems through calibration against state-of-the-art AI models

Full Job Description

**This role is for one of our clients**

**Compensation: $45-$100 per hour**

We are building a large-scale evaluation benchmark to test advanced AI reasoning across scientific and engineering domains. This role focuses on designing rigorous, research-grade computational problems that assess how effectively AI systems can leverage real scientific software tools to solve complex challenges.

Unlike traditional annotation roles, this position requires creating original, graduate-level problems rooted in real-world scientific workflows. You will iteratively refine these problems through calibration against state-of-the-art AI models, ensuring the right balance of difficulty, depth, and reasoning complexity.

**Requirements**

**What You’ll Do**

* Design advanced computational problems requiring the use of domain-specific scientific software
* Create tasks that test both precise execution (multi-step workflows, simulations) and strategic reasoning (experiment design, inference from partial data)
* Develop problem setups, solution pathways, and validation mechanisms
* Calibrate and refine tasks based on model performance to achieve target difficulty levels
* Ensure problems emphasize reasoning strategy over brute-force computation

**Domains & Tools of Interest**

We are particularly seeking candidates with hands-on experience in:

* **Bioinformatics & Single-Cell Genomics:** scanpy, scvelo, squidpy, gudhi (RNA-seq, trajectory inference, spatial transcriptomics)
* **Computational Chemistry:** PySCF (HF, DFT, TDDFT, CASSCF, post-HF methods)
* **Particle & Nuclear Physics:** scikit-hep, Monte Carlo simulations, collider data analysis
* **Electrical Engineering:** scikit-rf, ngspice (RF systems, circuit simulation)
* **Astrophysics & Cosmology:** astropy (cosmological modeling, survey analysis)
* **Structural & Mechanical Engineering:** scikit-fem (finite element analysis, elasticity, beam theory)
* **Seismology & Geophysics:** ObsPy, SPECFEM (waveform analysis, inversion, tomogra
