HelloKindred
Information Technology and Services
Platform - SRE Engineer
Neural analysis suggests this role is optimal for mid-level candidates.
“Platform - SRE Engineer at HelloKindred. Skills: Site Reliability Engineering, DevOps, Cloud platforms, AI/ML workloads. Build and manage CI/CD pipelines.”
What You'll Achieve.
Ensure scalability, uptime, performance optimization, and operational resilience
Industry & Context.
Troubleshooting; System diagnostics; Remediation activities
BPSS clearance
What They're Looking For.
Must Have
Experience in DevOps and Site Reliability Engineering; experience with Docker, Kubernetes, cloud platforms, and Infrastructure as Code practices; experience with monitoring, observability, and operational tooling; familiarity with CI/CD pipelines, release automation, secrets management, and production support processes; understanding of LLM deployment patterns and API-based model integrations; experience using Jira, Confluence, and ServiceNow; troubleshooting, operational support, and incident response capabilities; communication and collaboration skills
Nice to Have
Experience supporting AI/ML workloads, Experience with GPU workloads, Experience with autoscaling, Experience with cost optimization
What You'll Do.
Build CI/CD pipelines
Manage CI/CD pipelines
Manage infrastructure
Build runtime environments
Manage runtime environments
Operate model-serving workloads
Operate orchestration workloads
Operate application workloads
Implement operational dashboards
Manage scaling activities
Manage release processes
Manage rollback mechanisms
Manage production support operations
Optimize inference cost
Optimize system reliability
Create operational standards
Create incident response processes
Support infrastructure automation
Support platform engineering initiatives
Maintain observability solutions
Maintain monitoring solutions
Support release automation
Support secrets management
Support production operational processes
Collaborate with engineering teams
Support AI platform reliability
Support operational readiness
Troubleshoot production issues
Support system diagnostics
Support remediation activities
Ensure platform stability
Ensure platform scalability
Ensure platform performance
How You'll Work.
Team & Collaboration
Cross-functional engineering teams
Communication Scope
Communication skills
Full Job Description
Who is HelloKindred?
HelloKindred are specialists in staffing marketing, creative and technology roles, offering a range of talent solutions that can be delivered on-site, remotely or hybrid. Our vision is to make work accessible and people’s lives better. We do this by disrupting traditional employment barriers – connecting ambitious talent to flexible opportunities with trusted brands.

Anticipated Contract End Date/Length: November 30, 2026.
Work Set Up: Hybrid (3 days per week in office)
Clearance required: BPSS

Our client in the Information Technology and Services industry is looking for a Platform / SRE Engineer to own deployment, observability, reliability, cost control, and production operations for an AI helpdesk platform. This role will support the design, deployment, and operational management of AI services and production environments while ensuring scalability, uptime, performance optimization, and operational resilience across cloud-based infrastructure. The ideal candidate will bring strong expertise in DevOps and Site Reliability Engineering practices, along with experience managing cloud-native platforms, CI/CD pipelines, observability tooling, and AI/ML production workloads within complex enterprise environments.

What you will do:
* Build and manage CI/CD pipelines, infrastructure, and runtime environments for AI services.
* Deploy and operate model-serving, orchestration, and application workloads.
* Implement monitoring, tracing, alerting, logging, and operational dashboards.
* Manage scaling activities, release processes, rollback mechanisms, and production support operations.
* Optimize inference cost, latency, uptime, and overall system reliability.
* Create runbooks, operational standards, and incident response processes.
* Support infrastructure automation and platform engineering initiatives.
* Maintain observability and monitoring solutions across production environments.
* Support release automation, secrets management, and production oper
How to Apply on SmartRecruiters
- SmartRecruiters often includes a video screening step — check camera and mic permissions.
- Link your GitHub or portfolio directly in the profile section for technical roles.
- Applications may be reviewed by AI scoring before reaching a recruiter — use keywords from the job description.