Tether

FinTech

AI Research Engineer (Kernel & Inference Optimization)

$7500–12000k ~AI est. · Roma, Lazio, Italy · FULL TIME · Remote Friendly

Market Sentiment

HIGH DEMAND

Neural analysis suggests this role is optimal for experienced candidates.

The Brief

“AI Research Engineer (Kernel & Inference Optimization) at Tether. Skills: model serving, inference optimization, AI systems. Design and deploy model serving architectures.”

What You'll Achieve.

Deliver highly responsive, efficient, and scalable performance; enable high-throughput, low-latency, low-memory-footprint, and scalable AI performance; deliver tangible value

Industry & Context.

FinTech

Problems you'll solve

Identify and resolve bottlenecks in production; overcome model serving challenges: latency optimization, computational bottlenecks, memory constraints

What They're Looking For.

Must Have

Deep expertise in model serving, Optimize model deployment, Optimize inference strategies, Optimize model serving pipelines, Optimize inference frameworks, Background in advanced model architectures, Hands-on, research-driven approach, Develop novel serving strategies, Develop novel inference algorithms, Test novel serving strategies, Test novel inference algorithms, Implement novel serving strategies, Implement novel inference algorithms, Engineering robust inference pipelines, Establish comprehensive performance metrics, Identify bottlenecks in production, Resolve bottlenecks in production, High-throughput AI performance, Low-latency AI performance, Low-memory footprint AI performance, Scalable AI performance, Optimize models for efficient serving, Integrate solutions on resource-constrained devices, Apply empirical research, Overcome challenges in model serving, Latency optimization, Computational bottlenecks, Memory constraints, Design robust evaluation frameworks, Iterate on optimization strategies, Push boundaries of inference performance, Push boundaries of system efficiency, Designing and optimizing high-performance inference engines, Handle massive models on GPU clusters, Deep understanding of Diffusion Models, Deep understanding of Vision Transformers, Understanding of Pruning, Understanding of Quantization, Understanding of Flash attention, Understanding of KV Cache, Understanding of Speculative Decoding (Eagle)

Nice to Have

Tensor Parallelism, Pipeline Parallelism, Expert Parallelism

What You'll Do.

Design model serving architectures

Deploy model serving architectures

Deliver high throughput

Optimize memory usage

Run pipelines efficiently

Run pipelines on resource-constrained devices

Run pipelines on edge platforms

Establish performance targets

Improve token response

Minimize memory footprint

Build inference tests

Monitor inference tests

Track performance indicators

Document iterative results

Compare outcomes against benchmarks

Validate performance across platforms

Identify test datasets

Prepare test datasets

Identify simulation scenarios

Prepare simulation scenarios

Evaluate model performance

Evaluate memory utilization

How You'll Work.

Team & Collaboration

AI model team

Communication Scope

English communication

Full Job Description

Join Tether and Shape the Future of Digital Finance

At Tether, we’re not just building products, we’re pioneering a global financial revolution. Our cutting-edge solutions empower businesses, from exchanges and wallets to payment processors and ATMs, to seamlessly integrate reserve-backed tokens across blockchains. By harnessing the power of blockchain technology, Tether enables you to store, send, and receive digital tokens instantly, securely, and globally, all at a fraction of the cost. Transparency is the bedrock of everything we do, ensuring trust in every transaction.

Innovate with Tether

Tether Finance: Our innovative product suite features the world’s most trusted stablecoin, USDT, relied upon by hundreds of millions worldwide, alongside pioneering digital asset tokenization services. But that’s just the beginning:

Tether Power: Driving sustainable growth, our energy solutions optimize excess power for Bitcoin mining using eco-friendly practices in state-of-the-art, geo-diverse facilities.

Tether Data: Fueling breakthroughs in AI and peer-to-peer technology, we reduce infrastructure costs and enhance global communications with cutting-edge solutions like KEET, our flagship app that redefines secure and private data sharing.

Tether Education: Democratizing access to top-tier digital learning, we empower individuals to thrive in the digital and gig economies, driving global growth and opportunity.

Tether Evolution: At the intersection of technology and human potential, we are pushing the boundaries of what is possible, crafting a future where innovation and human capabilities merge in powerful, unprecedented ways.

Why Join Us?

Our team is a global talent powerhouse, working remotely from every corner of the world. If you’re passionate about making a mark in the fintech space, this is your opportunity to collaborate with some of the brightest minds, pushing boundaries and setting new standards. We’ve grown fast, stayed lean, and secured our place as a leader

SKILL SIGNAL 37 detected · ranked by frequency

Inference pipelines ×6

Performance metrics ×4

Evaluation frameworks ×4

Model serving ×3

Inference optimization ×3

Model deployment ×3

Inference strategies ×3

Serving pipelines ×3

Model architectures ×3

Serving strategies ×3

Inference algorithms ×3

Bottleneck identification ×3

Bottleneck resolution ×3

AI performance ×3

Model optimization ×3

Resource-constrained devices ×3

Empirical research ×3

Latency optimization ×3

Computational bottlenecks ×3

Memory constraints ×3

Optimization strategies ×3

Inference performance ×3

System efficiency ×3

Inference engines ×3

GPU clusters ×3

Diffusion Models ×3

Vision Transformers ×3

Pruning ×3

Quantization ×3

Flash attention ×3

KV Cache ×3

Speculative Decoding ×3

Role Details

Seniority Senior

Level experienced

Work Mode Remote

Type FULL TIME

Education bachelor's degree

Category software

Salary Band 200k+

AI-Extracted Insights

Domain Areas

digital-finance, stablecoin, tokenization, blockchain-technology, ai, peer-to-peer-technology, data-sharing, digital-learning
