NBCUniversal
Media and Entertainment
Staff Site Reliability Engineer (Collaboration Engineering)
“Staff Site Reliability Engineer (Collaboration Engineering) at NBCUniversal. Skills: Site Reliability Engineering, Collaboration Engineering, Microsoft 365, AI Engineering. Apply engineering mindset to operations. Define service level indicators/objectives”
What You'll Achieve.
Reduce operational toil; Improve observability; Strengthen incident response; Ensure a consistent collaboration experience; Maintain a hardened security posture; Reduce customer impact; Meet governance requirements; Improve consistency across teams; Drive measurable results
Industry & Context.
Systems thinking; Troubleshooting
Minimum four days in office, In-person interview
What They're Looking For.
Must Have
12+ years of experience; Bachelor's degree in Computer Science/Engineering or equivalent practical experience
Nice to Have
Experience with Microsoft Entra ID/Azure AD, Conditional Access, MFA, RBAC/PIM, Purview, DLP, retention, and eDiscovery
What You'll Do.
Apply engineering mindset to operations
Define service level indicators/objectives
Reduce toil through automation
Improve observability
Strengthen incident response
Ensure a consistent, high-quality collaboration experience
Architect and optimize global Intune environments
Architect and optimize global Jamf Pro environments
Orchestrate Windows Update for Business
Patch third-party applications
Maintain compliance policies
Automate application packaging and deployment
Maintain a regular cadence for third-party updates
Leverage PowerShell and the Graph API to automate tasks
Leverage PowerShell and the Graph API for self-healing
Partner with Security Operations
Remediate vulnerabilities
Develop and enforce Configuration Profiles
Develop and enforce Compliance Policies
Develop and enforce Conditional Access rules
Own the reliability and scaling of Azure Virtual Desktop
Optimize Azure Virtual Desktop performance and cost
Own the reliability and scaling of Windows 365
Optimize Windows 365 performance and cost
Define and operationalize SLIs/SLOs for collaboration services
Define and operationalize error-budget policies
Measure customer impact
Own end-to-end reliability engineering
Perform capacity planning and performance tuning
Conduct resilience reviews
Perform dependency mapping
Proactively reduce risk
Develop, operationalize, and scale AI engineering capabilities
Manage AI model lifecycle
Automate AI engineering
Ensure AI reliability
Drive AI enterprise adoption
Establish guardrails for responsible AI
Implement AI data controls
Provide AI operational oversight
Build and evolve observability for collaboration platforms
Define health dashboards
Establish telemetry standards
Develop alert strategy
Implement synthetic monitoring
Align monitoring to user experience
Lead incident response
Establish incident roles
Drive rapid mitigation
Coordinate cross-team communication
Produce blameless post-incident reviews
Implement durable corrective actions
Engineer automation to reduce toil
Automate provisioning
Automate policy and configuration drift detection
Automate lifecycle management
Establish reusable runbooks
Establish self-service patterns
Strengthen change and release practices
Conduct production readiness reviews
Manage controlled rollouts and maintenance windows
Develop validation plans and rollback strategies
Reduce customer impact
Partner with Security/Compliance
Ensure services meet governance requirements
Balance usability and reliability
Provide Staff-level technical leadership
Set engineering standards
Influence roadmap priorities
Align stakeholders on reliability tradeoffs and investment
Establish and lead reliability operating mechanisms
Improve consistency across teams
Provide technical guidance
How You'll Work.
Team & Collaboration
Cross-functional teams; Cross-team communication; Cross-functional collaboration
Communication Scope
Executive communication; Verbal communication; Written communication
Process & Methodology
Roadmap planning
Full Job Description
NBCUniversal is one of the world's leading media and entertainment companies. We create world-class content, which we distribute across our portfolio of film, television, and streaming, and bring to life through our global theme park destinations, consumer products, and experiences. We own and operate leading entertainment and news brands, including NBC, NBC News, NBC Sports, Telemundo, NBC Local Stations, Bravo, and Peacock, our premium ad-supported streaming service. We produce and distribute premier filmed entertainment and programming through our powerhouse film and television studios, including Universal Pictures, DreamWorks Animation, and Focus Features, and the four global television studios under the Universal Studio Group banner, and operate industry-leading theme parks and experiences around the world through Universal Destinations & Experiences, including Universal Orlando Resort, home to Universal Epic Universe, and Universal Studios Hollywood. NBCUniversal is a subsidiary of Comcast Corporation. Visit www.nbcuniversal.com for more information.
Our impact is rooted in improving the communities where our employees, customers, and audiences live and work. We have a rich tradition of giving back and ensuring our employees have the opportunity to serve their communities. We champion an inclusive culture and strive to attract and develop a talented workforce to create and deliver a wide range of content reflecting our world.
The Staff Reliability Engineer (SRE) for Workplace Engineering is responsible for the reliability, performance, security, and operational excellence of enterprise workplace collaboration & endpoint services used globally by employees and partners. This role applies an engineering mindset to operations: defining service level indicators/objectives (SLIs/SLOs), reducing toil through automation, improving observability, and strengthening incident response, to ensure a consistent, high-quality collaboration experience across messaging, meetings
How to Apply on SmartRecruiters
- SmartRecruiters often includes a video screening step — check camera and mic permissions.
- Link your GitHub or portfolio directly in the profile section for technical roles.
- Applications may be reviewed by AI scoring before reaching a recruiter — use keywords from the job description.