Barclays

Banking

Observability Service Engineer

Bengaluru, India · FULL TIME

Market Sentiment

HIGH DEMAND

Neural analysis suggests this role is optimal for Mid+ candidates.

The Brief

“Observability Service Engineer at Barclays. Skills: Observability, Monitoring, Site Reliability Engineering (SRE), Automation, Distributed systems, Cloud platforms, Containerization, Microservices architectures, DevOps, Scripting. Effectively monitor and maintain the bank’s critical technology infrastructure. Resolve more complex technical issues, whilst minimising disruption to operations”

What You'll Achieve.

Minimise disruption to operations; Improve the service to customers and stakeholders; Ensure optimal performance; Maintain stability and drive efficiency; Ensure issues are known when they occur; Achieve the goals of the business; Enhance operational efficiency and incident response

Industry & Context.

Banking

Problems you'll solve

Resolve more complex technical issues; Analysis of system logs, error messages and user reports to identify the root causes of hardware, software and network issues; Providing a resolution to these issues by fixing or replacing faulty hardware components, reinstalling software, or applying configuration changes; Identification and remediation or raising, through appropriate process, of potential service-impacting risks and issues; Create solutions based on sophisticated analytical thought comparing and selecting complex alternatives; In-depth analysis with interpretative thinking will be required to define problems and develop innovative solutions; Adopt and include the outcomes of extensive research in problem-solving processes

What They're Looking For.

Must Have

Proficiency in observability tools such as Datadog, Splunk, New Relic, Dynatrace, etc.; Knowledge and experience in Windows and UNIX/Linux platforms; Solid understanding of distributed systems, cloud platforms (AWS preferred), containerization (Docker, Kubernetes), and microservices architectures; Work experience in IT operations, monitoring, Site Reliability Engineering (SRE), or a dedicated observability role; Experience with DevOps tools such as Jenkins, Terraform, Chef, Ansible, etc., and CI/CD pipelines; Experience with scripting languages (Python, Bash, PowerShell) for automation and data manipulation; Able to identify risks and implement controls where necessary; Familiarity with database monitoring (SQL, NoSQL); Knowledge of networking concepts and protocols

Nice to Have

AWS or any cloud certification; OpenTelemetry and custom telemetry integration; Exposure to ServiceNow Event Management administration

What You'll Do.

Effectively monitor and maintain the bank’s critical technology infrastructure and resolve more complex technical issues, whilst minimising disruption to operations

Provision of technical support for the service management function

Develop the support model and service offering

Execution of preventative maintenance tasks on hardware and software

Utilisation of monitoring tools/metrics to identify, prevent and address potential issues and ensure optimal performance

Maintenance of a knowledge base

Analysis of system logs, error messages and user reports to identify the root causes of hardware, software and network issues

Providing a resolution to these issues by fixing or replacing faulty hardware components, reinstalling software, or applying configuration changes

Automation, monitoring enhancements, capacity management, resiliency, business continuity management, front office specific support and stakeholder management

Identification and remediation or raising, through appropriate process, of potential service-impacting risks and issues

Proactively assess support activities implementing automations where appropriate to maintain stability and drive efficiency

Actively tune monitoring tools, thresholds, and alerting to ensure issues are known when they occur

Resolve business critical issues

Advocate preferred solutions to improve monitoring

Contribute to and architect the roadmap for the observability tools

Keep stakeholders updated on the latest monitoring and observability trends

Show leadership in implementing these within defined timelines

Working closely with the SRE teams to reduce toil and manage Chaos Engineering

Leading and mentoring a group of skilled engineers

Alert optimization to enhance operational efficiency and incident response

How You'll Work.

Team & Collaboration

Collaborate with other areas of work, for business-aligned support areas, to keep up to speed with business activity and business strategies; Work closely with the SRE teams; Guide team members through structured assignments; Identify the need for the inclusion of other areas of specialisation to complete assignments; Train, guide and coach less experienced specialists

Communication Scope

Advise key stakeholders, including functional leadership teams and senior management, on functional and cross-functional areas of impact and alignment; Keep stakeholders updated on the latest monitoring and observability trends

Process & Methodology

Plan resources, budgets and policies; Manage and maintain policies and processes; Deliver continuous improvements; Lead collaborative, multi-year assignments

Full Job Description

# **Job Description**

**Purpose of the role**

To effectively monitor and maintain the bank’s critical technology infrastructure and resolve more complex technical issues, whilst minimising disruption to operations.

**Accountabilities**

* Provision of technical support for the service management function to resolve more complex issues for a specific client or group of clients. Develop the support model and service offering to improve the service to customers and stakeholders.
* Execution of preventative maintenance tasks on hardware and software and utilisation of monitoring tools/metrics to identify, prevent and address potential issues and ensure optimal performance.
* Maintenance of a knowledge base containing detailed documentation of resolved cases for future reference, self-service opportunities and knowledge sharing.
* Analysis of system logs, error messages and user reports to identify the root causes of hardware, software and network issues, and providing a resolution to these issues by fixing or replacing faulty hardware components, reinstalling software, or applying configuration changes.
* Automation, monitoring enhancements, capacity management, resiliency, business continuity management, front office specific support and stakeholder management.
* Identification and remediation or raising, through appropriate process, of potential service-impacting risks and issues.
* Proactively assess support activities implementing automations where appropriate to maintain stability and drive efficiency. Actively tune monitoring tools, thresholds, and alerting to ensure issues are known when they occur.

**Vice President Expectations**

* To contribute or set strategy, drive requirements and make recommendations for change. Plan resources, budgets, and policies; manage and maintain policies/processes; deliver continuous improvements and escalate breaches of policies/procedures.
* If managing a team, they define jobs and responsibilities, planning for the department’s

SKILL SIGNAL 56 detected · ranked by frequency

Automation ×5

Technical support ×3

Preventative maintenance ×3

Monitoring tools/metrics utilization ×3

System log analysis ×3

Error message analysis ×3

User report analysis ×3

Hardware, software, and network issue resolution ×3

Hardware component fixing/replacement ×3

Software reinstallation ×3

Configuration changes ×3

Capacity management ×3

Resiliency ×3

Business continuity management ×3

Front office specific support ×3

Stakeholder management ×3

Risk identification and remediation ×3

Alert tuning ×3

CI/CD pipeline management ×3

Data manipulation ×3

Observability ×2

Monitoring ×2

Site Reliability Engineering (SRE) ×2

Distributed systems ×2

Cloud platforms ×2

Containerization ×2

Microservices architectures ×2

DevOps ×2

Scripting ×2

Datadog ×2

Splunk ×2

New Relic ×2

BEHAVIOURAL

Listen and be authentic; Energise and inspire; Align across the enterprise; Develop others; Collaborate; Build and maintain trusting relationships; Influencing; Negotiating

Role Details

Seniority mid

Work Mode No

Type FULL TIME

AI-Extracted Insights

Domain Areas

banking-technology-infrastructure; financial-services-operations

Certifications

AWS or Any Cloud certification

How to Apply on Workday

Workday has a multi-step form — save your progress after every section.
"Apply With LinkedIn" can fail or lose data; manual entry is more reliable.
Watch for the "Submit for Review" final step — hitting "Save" alone does not submit.
Job requisition numbers are useful when following up with HR by email.
