Risk Labs
crypto infrastructure
Senior LLM Systems Engineer
“Senior LLM Systems Engineer at Risk Labs. Skills: LLM Systems, Production Systems, Evaluation, Resilience. Improve LLM accuracy. Improve system performance”
What You'll Achieve.
Oracle automation system handles a wider range of market and resolution scenarios with higher measured accuracy
LLM quality tracked through evaluations and regressions
Engineers and operators can inspect model behavior, tool usage, reasoning paths, and uncertainty
Latency and cost improve
System fails more gracefully
Industry & Context.
reason carefully about correctness
What They're Looking For.
Must Have
Python, TypeScript, LLMs, agents, retrieval, model-powered workflows, evaluations, test datasets, regression checks, quality metrics, manual review loops, AI systems, APIs, databases, queues, logs, model outputs, external data sources, prompt engineering, tool calling, structured output validation, LLM failure modes, correctness in uncertain or adversarial environments, high agency, ownership, clear written communication
Nice to Have
oracle systems, prediction markets, DeFi protocols, crypto infrastructure, UMA, optimistic oracle mechanisms, Polymarket, agentic systems, tools, search, browser automation, APIs, database queries, LLM tracing, model monitoring, evaluation frameworks, AI observability tools, model cost, latency at scale, Postgres, data pipelines, queue-based systems, background jobs, event-driven architectures, blockchain operational constraints, RPC limits, indexing, event logs, finality, chain-specific behavior, GCP, Cloud Run, GitHub Actions, Terraform
What You'll Do.
Improve system performance
Preserve decision quality
Preserve operational reliability
Design uncertainty handling
Design human review paths
Build regression tests
Improve agent orchestration
Investigate regressions
Reduce operator friction
How You'll Work.
Team & Collaboration
Work with product team
Communication Scope
clear written communication
Full Job Description
WHY THIS ROLE EXISTS:
We are hiring a Senior LLM Systems Engineer to own and improve the LLM-driven components of our oracle automation stack. This person will focus on the accuracy, performance, resilience, and operational quality of the systems that use models to reason about wide-ranging prediction market rules, evidence, and oracle outcomes. This is a production systems role, not a research-only or prompt-only role. You will build the evaluations, observability, tooling, fallbacks, and feedback loops that make LLM behavior measurable and dependable in real-world conditions.

WHAT YOU'LL OWN:
- LLM Accuracy: improve prompts, model selection, tool usage, structured outputs, retrieval, and evaluation coverage so the system gets more decisions right over time.
- System Performance: reduce latency, token usage, and cost while preserving decision quality and operational reliability.
- Resilience: design validation, retries, fallbacks, uncertainty handling, and human review paths for ambiguous, adversarial, incomplete, or conflicting inputs.
- Evaluation and Monitoring: build datasets, regression tests, dashboards, traces, and review loops that make model quality visible and prevent repeated failures.
- Agent and Tooling Architecture: improve agent orchestration and tool use across internal services, APIs, search workflows, databases, and external data sources.
- Production Operations: help debug live issues, investigate regressions, improve runbooks, and reduce repeated operator friction.

WHAT SUCCESS LOOKS LIKE:
- The oracle automation system handles a wider range of market and resolution scenarios with higher measured accuracy.
- LLM quality is tracked through evaluations and regressions instead of judged only through manual spot checks.
- Engineers and operators can inspect model behavior, tool usage, reasoning paths, and uncertainty when investigating outcomes.
- Latency and cost improve without hiding quality regressions.
- The system fails more gracefully when da
How to Apply on Ashby
- Ashby is a fast modern ATS — most applications take under 3 minutes.
- The resume parser is strong; verify parsed experience dates and job titles.
- Custom screening questions are often scored algorithmically — answer completely.
- Location field affects geo-based screening; use your actual metro area.