KRAFTON

Technology

Senior MLOps Engineer

$85,000–130,000 ~AI est. Seoul, South Korea FULL TIME Remote Friendly

Market Sentiment

HIGH DEMAND

Neural analysis suggests this role is optimal for Senior candidates.

The Brief

“Senior MLOps Engineer at KRAFTON. Skills: MLOps, GPU infrastructure, Kubernetes, Platform engineering. Operate and enhance GPU infrastructure. Develop GPU platform for R&D/service teams”

Industry & Context.

Technology

Problems you'll solve

System-wide analysis; Root cause resolution; Structural improvement

What They're Looking For.

Must Have

8+ years experience, AI/ML cluster/platform design/build/operation, Kubernetes ML/GPU platform improvement, GPU utilization analysis/optimization, ML workflow understanding/platform operation, IaC/GitOps/CI/CD/observability for platform operation, System-wide issue analysis/resolution, Cross-team collaboration for platform requirements, Generative AI/LLM tool practical application, No travel restrictions

Nice to Have

NVIDIA GPU Operator/DCGM/MIG/MPS experience, Run:ai/Slurm/Kueue/Volcano experience, Next-gen GPU architecture operation/validation/optimization, KServe/Triton/Ray Serve/Kubeflow/Argo Workflows experience, Linux system/network/storage performance optimization, GPU/Cloud resource policy design/implementation/operation, GPU/Cloud cost/utilization/latency/throughput analysis

What You'll Do.

Operate and enhance GPU infrastructure

Develop GPU platform for R&D/service teams

Stabilize GPU infrastructure operations

Improve GPU infrastructure performance

Optimize GPU infrastructure resource efficiency

Automate GPU infrastructure operations

Design/build/operate Kubernetes ML/GPU platform

Implement observability and fault response

Establish GPU utilization/wait time/throughput/cost strategies

Reflect strategies in platform

Analyze GPU utilization

Propose GPU purchase/cloud/build/external strategies

Enhance ML platform and reproducible operations

Coordinate team requirements for common platform

Apply AI tools to improve operations

How You'll Work.

Team & Collaboration

Research/development/service teams; Cross-functional teams; Multiple teams

Communication Scope

Technical proposals

Full Job Description

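We pioneer the path to players' dreams. With bold imagination and breakthrough technology, we create unforgettable worlds for fans across the globe.

Introducing our team (project).

[AI Service Division Vision]

The KRAFTON AI Service Division collaborates with many fields inside and outside the company to provide AI solutions to a wide range of problems, and develops its own services through in-house deep learning research. Its direction falls into four broad areas.

- Production Cost Down: Apply deep learning to the game production process to raise production efficiency and transform creators' work experience.
- New Way to Create: Expand the ways games are created with a range of deep learning technologies, including generative AI.
- Virtual Friends: Develop deep-learning-based Virtual Friends to create new user experiences inside and outside games.
- Unique, Endless Gameplay: Build game content that uses deep learning to give players a new experience every time.

[Culture Fit]

In the AI Service Division, members from diverse backgrounds work together and solve problems through flat, active communication. Anyone can voice an opinion freely regardless of rank or seniority, and we build the meeting points between technology and services together by collaborating across job functions.

[About the Team]

The KRAFTON MLSys & Ops team designs, builds, and operates the GPU infrastructure and ML platform used for model development and service deployment within the AI Service Division. The team covers model training and experiment environments, ML pipelines, model serving infrastructure, GPU cluster operations, and infrastructure automation and observability, and builds a common platform so that the AI workloads needed for game production and services run reliably and efficiently.

This position is a hands-on senior engineer role: you will help operate and enhance the already-secured GPU infrastructure of 125 B300 nodes while evolving it into a GPU platform that research, development, and service organizations can use reliably and efficiently. Drawing on insights from that hands-on work, you will also make technical proposals for, and take part in executing, a GPU/compute operations strategy that covers whether to purchase additional GPUs, run cloud capacity in parallel, build in-house, or use external infrastructure.

The mission you will take on with our team.

This position is a hands-on senior engineer role: you will help operate and enhance the GPU infrastructure of 125 B300 nodes while evolving it into a GPU platform that research, development, and service organizations can use reliably and efficiently.

Take a direct part in stabilizing operations, improving performance, raising resource efficiency, and automating operations for the 125-node B300 GPU infrastructure.

Design, build, and operate scheduling, multi-tenancy, workload isolation, quotas, observability, and incident response for the Kubernetes-based ML/GPU platform.

Based on the characteristics of training and inference workloads, define operating strategies that improve GPU utilization, wait time, throughput, and cost efficiency, and apply them to the live platform.

Based on GPU capacity planning and utilization analysis, propose technical judgments on whether to purchase additional GPUs, run cloud capacity in parallel, build in-house, or use external infrastructure.

Enhance the ML platform and reproducible operations, and coordinate the requirements of multiple teams from a common-platform perspective.

We want to grow together with someone who has this experience! (Required qualifications)

Experience designing, building, and operating a large-scale GPU cluster or a Kubernetes-based ML platform that runs AI/ML training or inference workloads …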

SKILL SIGNAL 57 detected · ranked by frequency

Kubernetes ×4

MLOps ×3

Platform engineering ×3

GPU infrastructure operation ×3

ML platform operation ×3

Kubernetes scheduling ×3

Workload isolation ×3

Observability ×3

Fault tolerance ×3

Performance analysis ×3

Resource allocation ×3

Cost efficiency ×3

ML workflow automation ×3

Reproducible operations ×3

Root cause analysis ×3

System optimization ×3

Requirement gathering ×3

Technical proposal ×3

Infrastructure strategy ×3

GPU infrastructure ×2

IaC ×2

GitOps ×2

CI/CD ×2

NVIDIA GPU Operator ×2

DCGM ×2

MIG ×2

MPS ×2

Run:ai ×2

Slurm ×2

Kueue ×2

Volcano ×2

KServe ×2

Role Details

Experience 8–10 yrs

Level Senior

Work Mode Onsite

Type FULL TIME

Category it-infra

Salary Band 200k+

AI-Extracted Insights

Domain Areas

machine-learning-workflows, gpu-architecture, cloud-computing, container-orchestration, distributed-systems, high-performance-computing

How to Apply on Greenhouse

Create a Greenhouse profile before applying — it saves time across multiple applications.
Upload your resume as a PDF; the parser handles it better than Word.
Answer all knockout questions carefully — wrong answers auto-reject before a human sees you.
Enable email notifications to track application status in real time.
