TD SYNNEX

Platform SRE

Bogota, Colombia FULL TIME
Market Sentiment
HIGH DEMAND

Neural analysis suggests this role is optimal for Senior candidates.

The Brief

“Platform SRE at TD SYNNEX. Skills: platform reliability, automation & IaC, incident/problem/RCA, observability standards. Ensure reliability, operability, and continuous improvement of platforms. Own L3 reliability; define SLOs/KPIs.”

What You'll Achieve.

Ensure reliability, operability, and continuous improvement; Reduce toil; Reduce MTTR/MTTD; Improve alert quality

Industry & Context.

Problems you'll solve

Problem solving; Analytical skills; Troubleshooting; Root Cause Analysis (RCA)

Eligibility Requirements

On-call, 24/7 operations

What They're Looking For.

Must Have

5+ years in platform/SRE/operations/platform engineering with production ownership in large-scale environments; Hands-on hybrid operations (cloud + on-prem) with enterprise cloud fundamentals (compute, networking, storage, identity); Production IaC and automation (Terraform, Ansible); Scripting with Python/PowerShell/Bash; Proven L3 incident troubleshooting and major incident leadership; Infrastructure fundamentals: networking (including DNS/DHCP concepts), virtualization, storage, Windows Server and/or Linux; ITSM experience (incident, problem, change) and ticket-based operations; Azure platform knowledge

Nice to Have

SRE practices (SLOs, error budgets, postmortems, toil reduction); Virtualization and backup/DR operations experience; Exposure to containers and DevOps/CI/CD; Configuration drift control; Python for operational analytics; Familiarity with ML/DL for anomaly detection, forecasting, clustering; Experience in large, multinational, 24/7 operations; Knowledge of AI/agentic approaches and modern automation patterns

What You'll Do.

Ensure reliability, operability, and continuous improvement of platforms

Lead operability gates and production readiness; maintain runbooks/SOPs

Design/build operational automation

Develop Terraform/Ansible configurations

Integrate with ITSM for auditable self-service and controlled remediation

Lead diagnosis, stabilization, and recovery for major incidents

Drive problem management, RCA, and preventive actions

Reduce MTTR/MTTD via better signals, runbooks, and automation

Define actionable signals

Tune alerting to reduce noise

Advance predictive/proactive operations

Support Python-based analytics and ML/DL

Equip the outsourced L1/L2 provider with clear runbooks, training, standard changes, and escalation criteria

Govern provider performance and ITSM alignment

Partner with Platform Engineering

Feed operational insights into design

Mentor peers and promote engineering-led operations culture

How You'll Work.

Team & Collaboration

Partner with Platform Engineering; Mentor peers; Collaborate with teams and vendors; Work with product team

Communication Scope

Clear communicator in global, matrixed environments; Effective cross-team/vendor partner

Full Job Description

# Role purpose:

* Ensure reliability, operability, and continuous improvement of TD SYNNEX enterprise platforms across hybrid cloud and on‑prem environments.
* Engineering‑driven operations focused on automation, Infrastructure‑as‑Code (IaC), observability, and toil reduction.
* Serve as the L3 escalation for complex incidents; continuously improve platform run posture and readiness for L1/L2 execution.

## Core responsibilities:

* Platform reliability (hybrid cloud + on‑prem): Own L3 reliability posture; define SLOs/KPIs; lead operability gates and production readiness; maintain runbooks/SOPs.
* Automation & IaC: Design/build operational automation (health checks, remediation workflows); develop Terraform/Ansible configurations; script with Python (preferred), PowerShell, and/or Bash; integrate with ITSM for auditable self‑service and controlled remediation.
* Incident/problem/RCA (L3): Lead diagnosis, stabilization, and recovery for major incidents; drive problem management, RCA, preventive actions; reduce MTTR/MTTD via better signals, runbooks, and automation.
* Observability standards: Define actionable signals, alert quality, dashboards, logging; tune alerting to reduce noise; run data‑driven operational reviews.
* AIOps enablement: Advance predictive/proactive operations (anomaly detection, trend/capacity analysis); support Python‑based analytics and ML/DL where applicable; industrialize operational intelligence safely.
* Provider enablement (outsourced L1/L2): Equip provider with clear runbooks, training, standard changes, escalation criteria; govern performance and ITSM alignment; drive continuous improvement.
* Collaboration & CI: Partner with Platform Engineering to ensure operable‑by‑design capabilities; feed operational insights into roadmap; mentor peers and promote engineering‑led operations culture.

## Required qualifications:

* 5+ years in platform/SRE/operations/platform engineering with production ownership in large‑scale environments.
* Hands‑on hy

How to Apply on Workday

  • Workday has a multi-step form — save your progress after every section.
  • "Apply With LinkedIn" can fail or lose data; manual entry is more reliable.
  • Watch for the "Submit for Review" final step — hitting "Save" alone does not submit.
  • Job requisition numbers are useful when following up with HR by email.
