TP-Link Systems Inc.

Technology

Cloud Operations Engineer

$95–135k (AI estimate) · Irvine, California, United States · Full time
Market Sentiment
HIGH DEMAND

Neural analysis suggests this role is optimal for Mid+ candidates.

The Brief

“Cloud Operations Engineer at TP-Link Systems Inc. Skills: Cloud operations, Kubernetes, Infrastructure as Code, Reliability engineering. Design and build cloud-native infrastructure platforms.”

Industry & Context.

Technology

Problems you'll solve

Troubleshoot complex issues; diagnose and resolve infrastructure issues

Eligibility Requirements

Scheduled on-call rotation

What They're Looking For.

Must Have

Bachelor's degree or above, 2+ years of hands-on experience, Knowledge of AWS services, Hands-on experience operating Kubernetes, Familiarity with Kubernetes ecosystem tools, Experience with GitOps tools, Solid Linux administration skills, Experience with CI/CD pipelines, Good understanding of reliability engineering, Problem-solving skills, Good communication skills, Willingness to participate in on-call

Nice to Have

Experience with NVIDIA device plugins, Experience with Azure or Alibaba Cloud, Kubernetes certifications

What You'll Do.

Design cloud-native infrastructure platforms

Build cloud-native infrastructure platforms

Maintain cloud-native infrastructure platforms

Operate AWS environments

Optimize AWS environments

Manage Kubernetes clusters

Provision Kubernetes clusters

Upgrade Kubernetes clusters

Autoscale Kubernetes clusters

Manage Kubernetes networking

Manage Kubernetes observability

Plan Kubernetes capacity

Operate Kubernetes ecosystem components

Improve GitOps workflows

Manage Istio service mesh

Enhance Istio service mesh

Define reliability practices

Improve reliability practices

Monitor production cloud infrastructure

Alert on production cloud infrastructure

Troubleshoot production issues

Drive automation for infrastructure

Drive automation for CI/CD

Drive automation for observability

Drive automation for workflows

Collaborate with application engineering

Collaborate with architecture teams

Collaborate with security teams

Collaborate with platform teams

How You'll Work.

Team & Collaboration

Cross-functional engineering teams

Full Job Description

**ABOUT US:**

Headquartered in the United States, **TP-Link Systems Inc.** is a global provider of reliable networking devices and smart home products, consistently ranked as the world’s top provider of Wi-Fi devices. The company is committed to delivering innovative products that enhance people’s lives through faster, more reliable connectivity. With a commitment to excellence, TP-Link serves customers in over 170 countries and continues to grow its global footprint.

We believe technology changes the world for the better! At TP-Link Systems Inc, we are committed to crafting dependable, high-performance products to connect users worldwide with the wonders of technology. Embracing professionalism, innovation, excellence, and simplicity, we aim to assist our clients in achieving remarkable global performance and enable consumers to enjoy a seamless, effortless lifestyle.

**KEY RESPONSIBILITIES**

* Design, build, and maintain reliable, scalable, and secure cloud-native infrastructure platforms supporting large-scale production workloads.
* Operate and optimize multi-account AWS environments, ensuring infrastructure is secure, repeatable, and auditable through Infrastructure as Code tools such as Terraform.
* Manage production Kubernetes clusters, including provisioning, upgrades, autoscaling, networking, observability, capacity planning, and day-to-day operations.
* Build and operate Kubernetes ecosystem components such as CRDs, Helm, HPA, Cluster Autoscaler, CoreDNS, and Cluster API.
* Operate and improve GitOps-based deployment workflows using tools such as FluxCD or ArgoCD.
* Manage and enhance Istio service mesh capabilities, including traffic routing, service discovery, resilience, security, and service-to-service communication.
* Define and improve reliability practices, including SLOs, Error Budgets, monitoring, alerting, incident response, and post-mortems.
* Participate in a scheduled on-call rotation to support production cloud infrastructure and Kubernetes p
