Tricentis

SaaS

Senior Director, Cloud and Site Reliability Engineering

Czech Republic FULL TIME Remote Friendly


The Brief

“Senior Director, Cloud and Site Reliability Engineering at Tricentis. Skills: Cloud infrastructure strategy, Site Reliability Engineering (SRE), AWS, Azure, GCP, Kubernetes, Terraform, Incident management, Observability, Automation. Define and execute the cloud infrastructure roadmap. Establish cloud architecture standards and best practices”

What You'll Achieve.

Ensure the highest levels of availability, reliability, and performance; Support Tricentis' SaaS platform growth, reliability, and scalability goals; Align cloud spending with business outcomes; Advance platform capabilities; Align cloud and infrastructure initiatives with product roadmap and business goals; Reflect customer expectations and business commitments; Scale the team to enhance the performance and reliability of our SaaS products; Improve MTTR; Reduce toil and improve system resilience; Ensure systems are observable, scalable, and fault tolerant; Increase consistency and reduce operational risk; Report regularly to senior leadership on platform health

Industry & Context.

SaaS

Problems you'll solve

Solve Problems Together: We win or lose as one team

Eligibility Requirements

On-call strategy; Global Sanctions Compliance: Candidates must not be listed on any government restricted party lists (including the OFAC SDN List and U.S. Commerce Department restricted lists) and must certify that their employment would not violate any sanctions or export control regulations.

What They're Looking For.

Must Have

10+ years of experience in cloud infrastructure, DevOps, or Site Reliability Engineering, with at least 5 years in senior engineering leadership roles; Proven track record leading Cloud or SRE organizations at scale within SaaS or enterprise software companies; Deep expertise in major cloud platforms (AWS, Azure, and/or GCP), including compute, networking, storage, security, and managed services; Strong background in SRE principles, including SLO/SLI/error budget frameworks, observability, chaos engineering, and incident management; Hands-on experience with Kubernetes, Terraform, CI/CD tooling, and modern infrastructure-as-code practices; Experience with compliance frameworks (SOC 2, ISO 27001, FedRAMP, GDPR) and operating in regulated environments; Excellent communication and influencing skills, with the ability to translate complex technical concepts into clear business impact

Nice to Have

AI and Agentic capabilities, multi-cloud, hybrid-cloud, and cloud-native strategies, Kubernetes, Terraform, Pulumi, SOC 2, ISO 27001, ISO 42001, GDPR, FedRAMP

What You'll Do.

Define and execute the cloud infrastructure roadmap

Establish cloud architecture standards and best practices

Drive infrastructure cost optimization and efficiency

Lead the adoption of modern cloud technologies and emerging capabilities (AI and Agentic)

Build and mature the SRE function, defining SLOs, SLIs, and error budgets

Enhance operational effectiveness through the deployment and use of agentic capabilities

Own the incident management and on-call strategy

Champion a culture of reliability, embedding SRE principles across the broader Engineering organization

Drive automation across infrastructure provisioning, monitoring, observability, and self-healing systems

Partner with Security to ensure cloud environments meet compliance requirements

Influence infrastructure design earlier in the agentic development process

Oversee infrastructure delivery and operational readiness for all product releases

Drive continuous improvement in CI/CD pipelines

Establish and enforce infrastructure-as-code practices

Define and track key reliability and availability metrics

How You'll Work.

Team & Collaboration

Collaborate with peer Engineering and Product leaders; Partner with Finance and Engineering leadership; Work with Engineering teams to influence infrastructure design; Drive continuous improvement in CI/CD pipelines, deployment processes, and DevOps tooling in partnership with product engineering teams

Communication Scope

Excellent communication and influencing skills; Ability to translate complex technical concepts into clear business impact

Process & Methodology

Define and execute the cloud infrastructure roadmap, Oversee infrastructure delivery and operational readiness for all product releases

Full Job Description

We are looking for an experienced and strategic leader to build and scale our Cloud and Site Reliability Engineering (SRE) organization. You will define and drive the cloud infrastructure strategy and operational excellence that underpins Tricentis' SaaS platform, ensuring the highest levels of availability, reliability, and performance. You will lead a team of talented Cloud Engineers and SREs, fostering a culture of excellence, automation-first thinking, and continuous improvement.

**What you will do:**

**Cloud Strategy & Infrastructure Leadership**

* **Define and execute the cloud infrastructure roadmap** to support Tricentis' SaaS platform growth, reliability, and scalability goals across AWS, Azure, and GCP.
* **Establish cloud architecture standards and best practices**, including multi-cloud, hybrid-cloud, and cloud-native strategies.
* **Drive infrastructure cost optimization and efficiency**, partnering with Finance and Engineering leadership to align cloud spending with business outcomes.
* **Lead the adoption of modern cloud technologies and emerging capabilities** (AI and Agentic) to advance platform capabilities.
* **Collaborate with peer Engineering and Product leaders** to align cloud and infrastructure initiatives with product roadmap and business goals.

**Site Reliability Engineering & Operational Excellence**

* **Build and mature the SRE function**, defining SLOs, SLIs, and error budgets that reflect customer expectations and business commitments.
* Enhance operational effectiveness through the deployment and use of agentic capabilities to scale the team to enhance the performance and reliability of our SaaS products.
* **Own the incident management and on-call strategy** to establish effective processes for detection, response, remediation, and post-incident review, improving MTTR.
* **Champion a culture of reliability**, embedding SRE principles across the broader Engineering organization to reduce toil and improve system resilience. Drive automation


SKILL SIGNAL 36 detected · ranked by frequency

Kubernetes ×7

Terraform ×7

Site Reliability Engineering (SRE) ×5

Incident management ×5

Observability ×5

CI/CD tooling ×4

Cloud infrastructure strategy ×3

AWS ×3

Azure ×3

GCP ×3

Cloud infrastructure ×3

Cloud platforms (AWS, Azure, GCP) ×3

Computer networking ×3

Storage ×3

Security ×3

Managed services ×3

SRE principles ×3

SLO/SLI/error budget frameworks ×3

Chaos engineering ×3

Infrastructure-as-code practices ×3

Compliance frameworks (SOC 2, ISO 27001, FedRAMP, GDPR) ×3

Automation across infrastructure provisioning, monitoring, observability, and self-healing systems ×3

Automation ×2

Pulumi

CI/CD pipelines

infrastructure-as-code

Agentic capabilities

Operational excellence

Infrastructure cost optimization

Business outcomes alignment

Product roadmap alignment

BEHAVIOURAL

Strategic leadership; Automation-first thinking; Continuous improvement; Collaboration; Influencing skills; Self-Awareness; Finish What We Start; Move Fast; Run Towards Change; Serve Our Customers & Communities; Solve Problems Together; Think Big & Believe

Role Details

Seniority executive

Experience 10+ yrs

Level Director

Work Mode Hybrid: 3 days in the office

Type FULL TIME

AI-Extracted Insights

Domain Areas

saas-platform, enterprise-software-companies, regulated-environments, software-quality-assurance, continuous-testing-tools, devops

Certifications

SOC 2, ISO 27001, ISO 42001, GDPR, FedRAMP

How to Apply on Workday

Workday has a multi-step form — save your progress after every section.
"Apply With LinkedIn" can fail or lose data; manual entry is more reliable.
Watch for the "Submit for Review" final step — hitting "Save" alone does not submit.
Job requisition numbers are useful when following up with HR by email.
