Fluency

SaaS

AI Engineer

$180–250k · San Francisco, California, United States; New York City, New York, United States · FULL TIME

Market Sentiment

HIGH DEMAND

Neural analysis suggests this role is optimal for Mid candidates.

The Brief

“AI Engineer at Fluency. Skills: AI Engineering, Prompt Engineering, Agent Design, Evaluation Systems, LLM-powered features, Production Engineering. Own the prompts, agents, evals, and pipelines behind user-facing features. Turn product requirements into working prompts, agents, and pipelines.”

What You'll Achieve.

Ship AI features to users; Improve AI output quality; Ensure system reliability for production

Industry & Context.

SaaS

Problems you'll solve

Work through ambiguous AI problems

Eligibility Requirements

E-3 visa sponsorship for Australians, including a stipend to relocate

What They're Looking For.

Must Have

Hands-on experience building LLM-powered features that shipped to real users; Production engineering chops in TypeScript/Node (primary, especially in AWS Lambda) and/or Python; Experience with multiple LLM providers such as Anthropic, OpenAI, Google Vertex, AWS Bedrock, or similar; Practical judgment in prompt engineering, retrieval, and agent design, backed by evaluation results; Track record of building evaluation systems that actually catch regressions; Solid software engineering fundamentals: you can write production code, not just notebooks

Nice to Have

Experience with provider-abstraction libraries for multi-LLM workflows; Familiarity with pgvector or other vector retrieval systems; Experience with post-training or fine-tuning; Experience deploying AI features on AWS Lambda, ECS Fargate, or similar; Background in ML, NLP, or applied research; Experience with structured output, function calling, and tool use at scale; Experience with Anyscale Ray or similar distributed compute frameworks for batch inference, eval pipelines, or scaling agent workloads; Open source contributions in the LLM or agent tooling space

What You'll Do.

Own the prompts, agents, evals, and pipelines behind user-facing features

Turn product requirements into working prompts, agents, and pipelines

Evaluate them rigorously, iterate until they're production-ready, and keep improving them once they ship

Build new AI features end to end, from prototype to production

Improve AI output quality through prompt engineering, model selection, retrieval, and evaluation

Design and run evals that measure real output quality, not just first impressions

Iterate fast on prompts, agent designs, and orchestration patterns

Evaluate new models, tools, and techniques when they improve quality, latency, cost, or reliability

How You'll Work.

Team & Collaboration

Partner with the Product Engineer to translate requirements into AI features that actually work; Partner with the AI Platform team to land features on solid infrastructure

Full Job Description

We're hiring a full-time AI Engineer to own the prompts, agents, evals, and pipelines behind user-facing features that ship to users. You'll take product requirements and turn them into working prompts, agents, and pipelines. You'll evaluate them rigorously, iterate until they're production-ready, and keep improving them once they ship. This role sits at the intersection of product and platform: you decide what the AI should do, prove it works, and get it in front of users.

Because we're an early-stage company moving fast, we're looking for someone who can work quickly through ambiguous AI problems, measure output quality, and ship only when the system is reliable enough for production. This is an in-person role, 5 days a week in our office. The ability to tell the difference between "looks good in the demo" and "works in production" is essential.

KEY RESPONSIBILITIES

- Build new AI features end to end, from prototype to production.
- Improve AI output quality through prompt engineering, model selection, retrieval, and evaluation.
- Design and run evals that measure real output quality, not just first impressions.
- Iterate fast on prompts, agent designs, and orchestration patterns.
- Partner with the Product Engineer to translate requirements into AI features that actually work.
- Partner with the AI Platform team to land features on solid infrastructure.
- Evaluate new models, tools, and techniques when they improve quality, latency, cost, or reliability.

WHAT WE ARE LOOKING FOR

- Hands-on experience building LLM-powered features that shipped to real users
- Production engineering chops in TypeScript/Node (primary, especially in AWS Lambda) and/or Python
- Experience with multiple LLM providers such as Anthropic, OpenAI, Google Vertex, AWS Bedrock, or similar
- Practical judgment in prompt engineering, retrieval, and agent design, backed by evaluation results
- Track record of building evaluation systems that actually catch regressions
- Solid software eng

SKILL SIGNAL 39 detected · ranked by frequency

Distributed compute ×4

Prompt Engineering ×3

Agent Design ×3

Production Engineering ×3

Building LLM-powered features ×3

Production code ×3

Building evaluation systems ×3

Provider-abstraction libraries ×3

Vector retrieval systems ×3

Post-training ×3

Fine-tuning ×3

Deploying AI features ×3

AI Engineering ×2

Evaluation Systems ×2

LLM-powered features ×2

AWS Lambda ×2

Anthropic ×2

OpenAI ×2

Google Vertex ×2

AWS Bedrock ×2

pgvector ×2

AWS ×2

ECS Fargate ×2

Anyscale Ray ×2

TypeScript

Node

Python

LLM

Retrieval

Evaluation

NLP

BEHAVIOURAL

Work quickly through ambiguous AI problems; Measure output quality; Ship only when the system is reliable enough for production; Ability to tell the difference between 'looks good in the demo' and 'works in production' is essential; Practical judgment

Role Details

Experience 2–5 yrs

Level Mid

Work Mode in-person

Type FULL TIME

Category engineering

Salary Band 150k-200k

AI-Extracted Insights

Domain Areas

ai; llm; nlp; ml; product-development; platform-engineering

How to Apply on Ashby

Ashby is a fast modern ATS — most applications take under 3 minutes.
The resume parser is strong; verify parsed experience dates and job titles.
Custom screening questions are often scored algorithmically — answer completely.
Location field affects geo-based screening; use your actual metro area.
