Tether
FinTech
AI Research Engineer (Kernel & Inference Optimization)
“AI Research Engineer (Kernel & Inference Optimization) at Tether. Skills: Inference Optimization, Model Serving, Kernel Optimization, AI Research. Optimize model deployment and inference strategies.”
What You'll Achieve.
Deliver tangible value; Improve token response; Minimize memory footprint; Validate performance across platforms
Industry & Context.
Identify, diagnose, and resolve bottlenecks
What They're Looking For.
Must Have
Deep expertise in model serving pipelines
Deep expertise in inference frameworks
Background in advanced model architectures
Hands-on, research-driven approach
Develop, test, and implement novel serving strategies
Develop, test, and implement inference algorithms
Engineering robust inference pipelines
Establishing comprehensive performance metrics
Identifying and resolving bottlenecks
High-throughput, low-latency, low-memory footprint, scalable AI performance
Optimizing model deployment
Optimizing inference strategies
Experience with resource-efficient models
Experience with complex, multi-modal architectures
Experience with text, images, and audio integration
Experience with Tensor Parallelism
Experience with Pipeline Parallelism
Experience with Expert Parallelism
Experience handling massive models on GPU clusters
Experience with Diffusion Models
Experience with Vision Transformers
Experience with Pruning
Experience with Quantization
Experience with Flash attention
Experience with KV Cache
Experience with Speculative Decoding
Experience with optimizing models for efficient serving
Experience integrating solutions on resource-constrained devices
Ability to apply empirical research to overcome challenges
Proficient in designing robust evaluation frameworks
Iterating on optimization strategies
Nice to Have
Experience with KEET app
What You'll Do.
Optimize model deployment
Optimize inference strategies
Deliver highly responsive performance
Deliver efficient performance
Deliver scalable performance
Work on resource-efficient models
Work on complex multi-modal architectures
Design model serving pipelines
Optimize model serving pipelines
Design inference frameworks
Optimize inference frameworks
Develop novel serving strategies
Test novel serving strategies
Implement novel serving strategies
Develop inference algorithms
Test inference algorithms
Implement inference algorithms
Engineer robust inference pipelines
Establish comprehensive performance metrics
Identify bottlenecks in production
Resolve bottlenecks in production
Enable high-throughput AI performance
Enable low-latency AI performance
Enable low-memory footprint AI performance
Enable scalable AI performance
Design state-of-the-art model serving architectures
Deliver high throughput
Optimize memory usage
Ensure pipelines run efficiently
Run pipelines on resource-constrained devices
Run pipelines on edge platforms
Establish clear performance targets
Improve token response
Minimize memory footprint
Build controlled inference tests
Run controlled inference tests
Monitor controlled inference tests
Track key performance indicators
Monitor response latency
Monitor memory consumption
Document iterative results
Compare outcomes against benchmarks
Validate performance across platforms
Identify high-quality test datasets
Prepare high-quality test datasets
Identify simulation scenarios
Prepare simulation scenarios
Evaluate model performance
Evaluate memory utilization
Analyze computational efficiency
Diagnose bottlenecks in serving pipeline
Monitor processing metrics
Monitor memory metrics
Address suboptimal batching issues
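As an illustration only, the controlled inference tests described in the list above (latency, memory, and comparison against targets) might look roughly like the Python sketch below. Everything in it is an assumption made for the example: fake_generate is a hypothetical stand-in for a real model call, the metric names and target values are invented, and tracemalloc only sees Python heap allocations, so a real harness would need a device-specific memory probe instead.

```python
import statistics
import time
import tracemalloc
from typing import Callable, Dict, List


def fake_generate(prompt: str, max_tokens: int = 16) -> List[str]:
    """Hypothetical stand-in for a model's generate call (not a real API)."""
    tokens = []
    for i in range(max_tokens):
        tokens.append(f"tok{i}")  # pretend each loop step decodes one token
    return tokens


def run_controlled_test(
    generate: Callable[[str], List[str]],
    prompts: List[str],
    warmup: int = 2,
) -> Dict[str, float]:
    """Time each request and record peak Python heap usage across the run."""
    for prompt in prompts[:warmup]:
        generate(prompt)  # warm-up calls are excluded from the measurements

    latencies_ms: List[float] = []
    total_tokens = 0
    tracemalloc.start()  # note: tracing adds overhead to the timed calls
    for prompt in prompts:
        start = time.perf_counter()
        tokens = generate(prompt)
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
        total_tokens += len(tokens)
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    total_s = sum(latencies_ms) / 1000.0
    return {
        "requests": float(len(prompts)),
        "median_latency_ms": statistics.median(latencies_ms),
        "max_latency_ms": max(latencies_ms),
        "tokens_per_second": total_tokens / total_s if total_s > 0 else float("inf"),
        "peak_heap_mib": peak_bytes / (1024 * 1024),
    }


def meets_targets(metrics: Dict[str, float], targets: Dict[str, float]) -> bool:
    """Return True when every listed metric is at or below its upper limit."""
    return all(metrics[name] <= limit for name, limit in targets.items())


if __name__ == "__main__":
    results = run_controlled_test(fake_generate, ["hello", "world", "again"])
    # Illustrative upper limits only; real targets would come from the team.
    limits = {"median_latency_ms": 50.0, "peak_heap_mib": 64.0}
    print(results)
    print("within targets:", meets_targets(results, limits))
```

Keeping the measurement loop separate from the target check means the same run can be compared against different benchmarks or platforms without re-running the model.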
How You'll Work.
Team & Collaboration
AI model team
Communication Scope
English communication
Full Job Description
Join Tether and Shape the Future of Digital Finance
At Tether, we’re not just building products, we’re pioneering a global financial revolution. Our cutting-edge solutions empower businesses (from exchanges and wallets to payment processors and ATMs) to seamlessly integrate reserve-backed tokens across blockchains. By harnessing the power of blockchain technology, Tether enables you to store, send, and receive digital tokens instantly, securely, and globally, all at a fraction of the cost. Transparency is the bedrock of everything we do, ensuring trust in every transaction.
Innovate with Tether
Tether Finance: Our innovative product suite features the world’s most trusted stablecoin, USDT, relied upon by hundreds of millions worldwide, alongside pioneering digital asset tokenization services. But that’s just the beginning:
Tether Power: Driving sustainable growth, our energy solutions optimize excess power for Bitcoin mining using eco-friendly practices in state-of-the-art, geo-diverse facilities.
Tether Data: Fueling breakthroughs in AI and peer-to-peer technology, we reduce infrastructure costs and enhance global communications with cutting-edge solutions like KEET, our flagship app that redefines secure and private data sharing.
Tether Education: Democratizing access to top-tier digital learning, we empower individuals to thrive in the digital and gig economies, driving global growth and opportunity.
Tether Evolution: At the intersection of technology and human potential, we are pushing the boundaries of what is possible, crafting a future where innovation and human capabilities merge in powerful, unprecedented ways.
Why Join Us?
Our team is a global talent powerhouse, working remotely from every corner of the world. If you’re passionate about making a mark in the fintech space, this is your opportunity to collaborate with some of the brightest minds, pushing boundaries and setting new standards. We’ve grown fast, stayed lean, and secured our place as a leader