Wells Fargo
Systems Operations Manager
“Systems Operations Manager at Wells Fargo. Skills: Production support, SRE principles, Automation. Lead L2 production support. Manage L2 production support”
What You'll Achieve.
Reduce MTTR; Improve operational efficiency; Improve system reliability; Improve system scalability; Improve system resilience; Ensure SLA adherence; Ensure SLO adherence; Drive continuous service improvement
Industry & Context.
Troubleshooting; Problem-solving; Root Cause Analysis
What They're Looking For.
Must Have
production support experience, application operations experience, platform management experience, reducing MTTR track record, improving system uptime track record, observability tools experience, incident management platforms experience, SRE principles knowledge, automation tools experience, AI/automation in IT operations familiarity, ITIL processes knowledge, service management frameworks knowledge, analytical skills, troubleshooting skills, problem-solving skills, leadership abilities, communication abilities, stakeholder management abilities
Nice to Have
self-service solutions development, AI/ML-driven insights integration, predictive monitoring integration, anomaly detection integration, system reliability improvement, system scalability improvement, system resilience improvement, production releases oversight, deployments oversight, environment readiness oversight, SLA adherence, SLO adherence, continuous service improvement leadership, operations teams leadership, operations teams mentoring, operations teams development
What You'll Do.
Lead L2 production support
Manage L2 production support
Improve triage practices
Improve diagnostics practices
Improve incident resolution practices
Monitor system health
Monitor system performance
Monitor system availability
Own incident management processes
Improve incident management processes
Own problem management processes
Improve problem management processes
Own change management processes
Improve change management processes
Conduct Root Cause Analysis
Implement permanent fixes
Champion automation-first approach
Eliminate manual tasks
Improve operational efficiency
Develop self-service solutions
Promote self-service solutions
Integrate AI/ML-driven insights
Integrate predictive monitoring
Integrate anomaly detection
Collaborate with engineering teams
Improve system reliability
Improve system scalability
Improve system resilience
Oversee production releases
Oversee environment readiness
Drive continuous service improvement
Lead operations teams
Mentor operations teams
Develop operations teams
How You'll Work.
Team & Collaboration
Engineering teams; L1 teams; End users
Communication Scope
Stakeholder management
Process & Methodology
Incident management, Problem management, Change management
Full Job Description
**Role Summary**

The Systems Operations Manager leads Level 2 (L2) platform support for production applications, ensuring high availability, reliability, and performance. This role operates with an SRE (Site Reliability Engineering) and AIOps mindset, focusing on reducing MTTR, enabling self-service capabilities, and driving an automation-first approach to improve operational efficiency and resilience.

**Key Responsibilities:**

* Lead and manage L2 production support operations for critical applications and platforms
* Drive reduction in **MTTR** through improved triage, diagnostics, and rapid incident resolution practices
* Monitor system health, performance, and availability using observability and alerting tools
* Own and improve **incident, problem, and change management** processes aligned with ITIL/SRE practices
* Conduct **Root Cause Analysis (RCA)** and implement permanent fixes to prevent recurring issues
* Champion **automation-first approach** to eliminate repetitive manual tasks and improve operational efficiency

**Required Skills & Qualifications:**

* Strong experience in production support, application operations, or platform management
* Proven track record in reducing MTTR and improving system uptime
* Experience with **observability tools** (monitoring, logging, tracing) and incident management platforms
* Solid understanding of **SRE principles** (SLIs, SLOs, error budgets, reliability engineering)
* Hands-on experience with **automation tools** (scripting, orchestration, CI/CD pipelines)
* Familiarity with **AI/automation in IT operations (AIOps)** and self-healing systems
* Knowledge of ITIL processes and service management frameworks
* Strong analytical, troubleshooting, and problem-solving skills
* Excellent leadership, communication, and stakeholder management abilities

* Develop and promote **self-service solutions** (runbooks, knowledge bases, automated healing tools) for L1 teams and end users
* Integrate **AI/ML-driven insights**, p
How to Apply on Workday
- Workday has a multi-step form — save your progress after every section.
- "Apply With LinkedIn" can fail or lose data; manual entry is more reliable.
- Watch for the "Submit for Review" final step — hitting "Save" alone does not submit.
- Job requisition numbers are useful when following up with HR by email.