HelloKindred
Information Technology and Services
Platform - SRE Engineer
Neural analysis suggests this role is optimal for mid-level candidates.
“Platform - SRE Engineer at HelloKindred. Skills: Site Reliability Engineering, DevOps, cloud platforms, AI/ML production. Build and manage CI/CD pipelines.”
What You'll Achieve.
Ensure platform stability; Ensure platform scalability; Ensure platform performance
Industry & Context.
Troubleshooting; System diagnostics; Remediation
BPSS clearance
What They're Looking For.
Must Have
5+ years DevOps experience, 5+ years SRE experience, Docker experience, Kubernetes experience, Cloud platforms experience, Infrastructure as Code experience, Monitoring tooling experience, Observability tooling experience, Operational tooling experience, CI/CD pipelines familiarity, Release automation familiarity, Secrets management familiarity, Production support processes familiarity, LLM deployment patterns understanding, API-based model integrations understanding, AWS experience, Jira experience, Confluence experience, ServiceNow experience, Troubleshooting capabilities, Operational support capabilities, Incident response capabilities
Nice to Have
AI/ML workloads production experience, GPU workloads experience, Autoscaling experience, Cost optimization experience
What You'll Do.
Build CI/CD pipelines
Manage CI/CD pipelines
Manage infrastructure
Build runtime environments
Manage runtime environments
Deploy model-serving workloads
Operate model-serving workloads
Deploy orchestration workloads
Operate orchestration workloads
Deploy application workloads
Operate application workloads
Implement operational dashboards
Manage scaling activities
Manage release processes
Manage rollback mechanisms
Manage production support operations
Optimize inference cost
Optimize system reliability
Create operational standards
Create incident response processes
Support infrastructure automation
Support platform engineering initiatives
Maintain observability solutions
Maintain monitoring solutions
Support release automation
Support secrets management
Support production operational processes
Collaborate with engineering teams
Support AI platform reliability
Support operational readiness
Troubleshoot production issues
Support system diagnostics
Support remediation activities
Ensure platform stability
Ensure platform scalability
Ensure platform performance
How You'll Work.
Team & Collaboration
Cross-functional engineering teams
Process & Methodology
Release processes, Release automation
Full Job Description
Who is HelloKindred?
HelloKindred are specialists in staffing marketing, creative and technology roles, offering a range of talent solutions that can be delivered on-site, remotely or hybrid. Our vision is to make work accessible and people’s lives better. We do this by disrupting traditional employment barriers – connecting ambitious talent to flexible opportunities with trusted brands.
Anticipated Contract End Date/Length: November 30, 2026.
Work Set Up: Hybrid (3 days per week in office)
Clearance required: BPSS
Our client in the Information Technology and Services industry is looking for a Platform / SRE Engineer to own deployment, observability, reliability, cost control, and production operations for an AI helpdesk platform. This role will support the design, deployment, and operational management of AI services and production environments while ensuring scalability, uptime, performance optimization, and operational resilience across cloud-based infrastructure. The ideal candidate will bring strong expertise in DevOps and Site Reliability Engineering practices, along with experience managing cloud-native platforms, CI/CD pipelines, observability tooling, and AI/ML production workloads within complex enterprise environments.
What you will do:
* Build and manage CI/CD pipelines, infrastructure, and runtime environments for AI services.
* Deploy and operate model-serving, orchestration, and application workloads.
* Implement monitoring, tracing, alerting, logging, and operational dashboards.
* Manage scaling activities, release processes, rollback mechanisms, and production support operations.
* Optimize inference cost, latency, uptime, and overall system reliability.
* Create runbooks, operational standards, and incident response processes.
* Support infrastructure automation and platform engineering initiatives.
* Maintain observability and monitoring solutions across production environments.
* Support release automation, secrets management, and production oper
How to Apply on SmartRecruiters
- SmartRecruiters often includes a video screening step — check camera and mic permissions.
- Link your GitHub or portfolio directly in the profile section for technical roles.
- Applications may be reviewed by AI scoring before reaching a recruiter — use keywords from the job description.