ProdataKey
Site Reliability Engineer (GKE)
Neural analysis suggests this role is optimal for mid-level candidates.
“Site Reliability Engineer (GKE) at ProdataKey. Skills: Site Reliability Engineering, Cloud Infrastructure, Automation, Observability. Design cloud infrastructure. Build cloud infrastructure”
Industry & Context.
Problem-solver
On-call rotation
What They're Looking For.
Must Have
Bachelor's or Master's degree, 3+ years SRE/DevOps/Infra experience, AWS or GCP experience, Docker and Kubernetes experience, Python, Go, TypeScript, or Bash proficiency, Linux/Unix administration fundamentals, Networking protocols fundamentals, Cloud security best practices, Pass drug and criminal background check
Nice to Have
Multi-region deployments experience, Messaging systems experience, Operating production systems at scale, Familiarity with modern data stack, GCP ecosystem and tooling familiarity, Experience in scaling startup environment, Experience in physical security, Experience in access control systems
What You'll Do.
Design cloud infrastructure
Build cloud infrastructure
Maintain cloud infrastructure
Own platform availability
Own platform performance
Own platform capacity planning
Develop monitoring systems
Manage monitoring systems
Develop logging systems
Manage logging systems
Develop alerting systems
Manage alerting systems
Participate in on-call rotation
Lead incident response mitigation
Drive blameless post-mortems
Optimize deployment pipelines
Secure deployment pipelines
Partner with backend developers
Partner with hardware engineers
Define Service Level Indicators
Define Service Level Objectives
Contribute to technical documentation
How You'll Work.
Team & Collaboration
Cross-functional collaboration; Collaborative on-call rotation; Partner with developers; Partner with hardware engineers
Full Job Description
**Role Overview**

We are seeking a hands-on Site Reliability Engineer to join our team. You will bridge the gap between software development and infrastructure operations, treating operational challenges as engineering problems. By leveraging automation, designing resilient distributed systems, and championing observability, you will ensure that our customers can secure and manage their access control systems without friction or failure.

**Key Responsibilities**

* **Infrastructure & Automation:** Design, build, and maintain scalable, secure multi-tenant cloud infrastructure using Infrastructure as Code (IaC) principles.
* **Uptime & Reliability:** Own the availability, latency, performance, and capacity planning of the [pdk.io](http://pdk.io/) platform and its supporting backend microservices.
* **Observability:** Develop and manage robust monitoring, logging, and alerting systems to gain deep visibility into cloud infrastructure, API health, and IoT endpoint performance.
* **Incident Response:** Participate in a collaborative on-call rotation. Lead rapid incident response mitigation and drive rigorous, blameless post-mortems to ensure long-term system resilience.
* **CI/CD Pipeline Management:** Optimize and secure automated deployment pipelines to enable developers to ship code to production safely and efficiently.
* **Cross-Functional Collaboration:** Partner closely with backend developers and hardware engineering teams to define Service Level Indicators (SLIs), Service Level Objectives (SLOs), and manage error budgets.
* Contribute to technical documentation and knowledge sharing

**Tooling**

* CI automation with GitHub Actions and Argo Workflows
* IaC with OpenTofu & Terragrunt at the core
* GitOps continuous delivery using ArgoCD
* Observability stack powered by Prometheus and Grafana

**Required Qualifications**

* Bachelor’s or Master’s degree in Computer Science, Information Systems, or related fields or equivalent experience
* 3+ years of experience in a
How to Apply on Workday
- Workday has a multi-step form — save your progress after every section.
- "Apply With LinkedIn" can fail or lose data; manual entry is more reliable.
- Watch for the "Submit for Review" final step — hitting "Save" alone does not submit.
- Job requisition numbers are useful when following up with HR by email.