TensorWave
Engineering
Staff Database Engineer
Neural analysis suggests this role is optimal for Senior candidates.
“Staff Database Engineer at TensorWave. Skills: Database architecture, Database reliability, Performance engineering, Observability, Automation. Design database architecture. Own database architecture”
Industry & Context.
Troubleshooting; Root cause analysis; Performance tuning; Lock contention troubleshooting; Replication lag troubleshooting; Storage I/O bottleneck troubleshooting; Connection exhaustion troubleshooting; Schema design problem troubleshooting; Database capacity constraint troubleshooting; Backup failure troubleshooting; Restore failure troubleshooting
What They're Looking For.
Must Have
8+ years production database engineering, 8+ years database administration, 8+ years database architecture, PostgreSQL production experience, MySQL production experience, Percona production experience, Highly available database systems design, Highly available database systems operation, Replication experience, Failover experience, Backup experience, Restore experience, Disaster recovery validation experience, Deep SQL performance tuning, Execution plan analysis, Index design, Query rewrite, Schema optimization, Lock contention troubleshooting, Storage analysis, I/O analysis, Linux systems knowledge, Production incident support, Root cause analysis, Build database monitoring, Improve database monitoring, Build database observability, Improve database observability, Work across infrastructure teams, Work across DevOps teams, Work across platform teams, Work across application engineering teams, Define standards, Influence architecture, Mentor other engineers
Nice to Have
SlurmDBD experience, Slurm accounting databases experience, HPC infrastructure database workloads experience, AI infrastructure database workloads experience, NetBox experience, Infrastructure source-of-truth platforms experience, Percona XtraDB Cluster experience, ProxySQL experience, Advanced MySQL architectures experience, Advanced Percona architectures experience, PostgreSQL HA tooling experience, PostgreSQL replication architectures experience, Prometheus experience, Grafana experience, PMM experience, Splunk experience, eBPF/BCC experience, perf experience, strace experience, Low-level Linux diagnostic tooling experience, SaaS database support experience, Cloud database support experience, HPC database support experience, AI infrastructure database support experience, Large multi-tenant platforms database support experience, MongoDB experience, Oracle experience, SQL Server experience, Database automation experience, Ansible experience, Terraform experience, CI/CD systems experience, Internal tooling experience, Zero-downtime migrations experience, Major version upgrades experience, Production database consolidation experience
What You'll Do.
Design database architecture
Own database architecture
Define standard database patterns
Establish database design standards
Operate production database environments
Improve production database environments
Own database system lifecycle
Provision database systems
Configure database systems
Upgrade database systems
Design replication topology
Tune database performance
Validate database backups
Test disaster recovery
Decommission database systems
Create database documentation
Create database runbooks
Troubleshoot production database issues
Resolve production database issues
Drive root cause analysis
Convert findings into improvements
Serve as senior database engineering owner
Support Slurm database architecture
Improve Slurm database architecture
Support Slurm accounting data
Improve Slurm accounting data
Support Slurm job history
Improve Slurm job history
Support Slurm reporting queries
Improve Slurm reporting queries
Support Slurm performance
Improve Slurm performance
Support Slurm retention strategy
Improve Slurm retention strategy
Support Slurm database scaling
Improve Slurm database scaling
Support Slurm recovery
Improve Slurm recovery
Support Slurm long-term reliability
Improve Slurm long-term reliability
Support NetBox performance
Support source-of-truth systems performance
Plan database lifecycle
Validate backup and restore
Review integration patterns
Ensure database-backed systems scale
Build deep database observability
Maintain visibility into query performance
Maintain visibility into execution plans
Maintain visibility into index usage
Maintain visibility into replication health
Maintain visibility into locking behavior
Maintain visibility into buffer efficiency
Maintain visibility into cache efficiency
Maintain visibility into storage latency
Maintain visibility into connection pool behavior
Maintain visibility into OS-level bottlenecks
Create performance baselines
Create alerting standards
Identify database failure patterns
Build preventive monitoring
Build operational guardrails
Create database automation patterns
Integrate automation with tooling
Automate database provisioning
Automate database configuration standards
Automate backup verification
Automate health checks
Automate replication checks
Automate user management
Automate permission management
Automate upgrade workflows
Automate monitoring deployment
Automate runbook-driven recovery
Contribute database modules to Ansible
Contribute database roles to Ansible
Contribute database workflows to Ansible
Contribute database modules to CI/CD
Contribute database roles to CI/CD
Contribute database workflows to CI/CD
Contribute database modules to internal platforms
Contribute database roles to internal platforms
Contribute database workflows to internal platforms
Define production database readiness standards
Act as technical lead for incidents
Support root cause analysis
Support cross-team coordination
Support impact analysis
Support corrective action plans
Support long-term remediation
Mentor engineers on database operations
Mentor engineers on SQL troubleshooting
Mentor engineers on HA design
Mentor engineers on incident response
Mentor engineers on performance analysis
Provide senior technical review
How You'll Work.
Team & Collaboration
Cross-functional partners; DevOps teams; Infrastructure teams; MLOps teams; Platform Engineering teams
Full Job Description
About TensorWave
Our mission is simple: deliver seamless, secure, reliable, and resilient AI compute at scale. We've built a versatile cloud platform that eliminates infrastructure barriers, empowering builders to focus on innovation instead of fighting their stack. Because breakthrough AI should move at the speed of ideas, not infrastructure.
About the Role
We’re looking for a Staff Database Engineer to join our team during an exciting phase of growth. In this role, you’ll be responsible for database architecture, database reliability, infrastructure-adjacent database platforms, performance engineering, observability, automation, operational maturity, incident response, and engineering leadership, working closely with cross-functional partners to support business objectives while upholding our standards for excellence, collaboration, and impact.
What You’ll Do
DATABASE ARCHITECTURE & PLATFORM OWNERSHIP
- Design and own database architecture for critical infrastructure and platform services, including PostgreSQL-backed internal platforms, Slurm accounting and operational databases, NetBox and infrastructure source-of-truth databases, custom internal applications and automation services, observability, inventory, and platform metadata systems, future database-backed control plane services.
- Define standard database patterns for high availability, replication, failover, backup and restore, point-in-time recovery, performance baselining, capacity planning, upgrade lifecycle management, access control and operational security.
- Establish database design standards for new internal platforms, including schema review, indexing strategy, query design, service ownership boundaries, and production readiness requirements.
POSTGRESQL, MYSQL, AND DATABASE RELIABILITY
- Operate and improve production database environments across PostgreSQL, MySQL, Percona, and adjacent systems.
- Own the lifecycle of database systems, including provisioning, configuration, version upgrades,
How to Apply on Ashby
- Ashby is a fast modern ATS — most applications take under 3 minutes.
- The resume parser is strong; verify parsed experience dates and job titles.
- Custom screening questions are often scored algorithmically — answer completely.
- Location field affects geo-based screening; use your actual metro area.