
Senior Software Engineer, Machine Learning Inference Platform

Work from home Full-time role Hiring

About Stack:

Stack is developing revolutionary AI and advanced autonomous systems designed to enhance safety, reliability, and efficiency of modern operations. Stack's autonomous technology incorporates cutting-edge advancements in artificial intelligence, robotics, machine learning, and cloud technologies, empowering us to create innovative solutions that address the needs and challenges of the dynamic trucking transportation industry. With decades of experience creating and deploying real world systems for demanding environments, the Stack team is dedicated to developing an autonomous solution ecosystem tailored to the trucking industry's unique demands.

About the Role:

In the Senior Engineer role, you will own meaningful subsystems of Stack AV's inference platform and drive them from design through production. You will be the go-to engineer for one or more areas such as model onboarding, serving APIs, metering, observability, performance optimization, or tenant isolation. The role requires strong hands-on implementation, production debugging, thoughtful design, and the ability to mentor engineers while keeping delivery moving.

Responsibilities:

Own technical design and delivery of subsystems in a high-throughput, low-latency inference platform capable of handling multi-tenant, enterprise-grade inference workloads.
Develop robust API layers (gRPC, WebSockets, REST, etc.) and developer SDKs that abstract complex distributed inference orchestration into seamless, reliable token streams.
Build and harden a multi-tenant control plane to enable accurate metering, rate limiting, quotas, tenant isolation and noisy-neighbor fairness across the platform.
Optimize inference performance across the entire system stack, including the model engine layer.
Build observability and SLOs to gain insights into system economics, cache-hit rates, GPU utilization and cost accounting per model and per tenant.
Partner with product and infrastructure teams on model onboarding, capacity planning, external API contracts and customer adoption.
Decompose ambiguous work, drive issues to closure, and raise the engineering bar through code quality, reviews, testing, and mentoring.

Qualifications:

Education: Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.

Experience: 4+ years of experience building and operating backend distributed systems end to end. Strong Data & ML systems fundamentals: data-intensive distributed systems, concurrency, networking and performance profiling. Hands-on experience with large-scale inference services on GPUs, including KV caches, prefill/decode stages and throughput/latency trade-offs. Direct experience with inference engines (TensorRT, vLLM, etc.) or serving frameworks (Dynamo, Triton or equivalent).

Technical Skills: Strong programming skills in C++, Go, Rust or Python. Familiarity with deep learning frameworks (PyTorch, etc.) as well as model parallelism. Familiarity with GPU computing primitives such as CUDA, NCCL, NVLink, and hardware-specific optimizations. Practical understanding of high-performance networking architectures, including InfiniBand, RoCE, and low-latency cluster communication.

Problem-Solving: Strong analytical and problem-solving skills.

Autonomous vehicles (AV) experience is a bonus.

We are proud to be an equal opportunity workplace. We believe that diverse teams produce the best ideas and outcomes. We are committed to building a culture of inclusion, entrepreneurship, and innovation across gender, race, age, sexual orientation, religion, disability, and identity.

Check out our Privacy Policy.

Please Note: Pursuant to its business activities and use of technology, Stack AV complies with all applicable U.S. national security laws, regulations, and administrative requirements, which can restrict Stack AV’s ability to employ certain persons in certain positions pursuant to a range of national security-related requirements. As such, this position may be contingent upon Stack AV verifying a candidate’s residence, U.S. person status, and/or citizenship status.

This position may also involve working with software and technologies subject to U.S. export control regulations. Under these regulations, it may be necessary for Stack AV to obtain a U.S. government export license prior to releasing its technologies to certain persons. If Stack AV determines that a candidate’s residence, U.S. person status, and/or citizenship status will require a license, prohibit the candidate from working in this position, or otherwise be subject to national security-related restrictions, Stack AV expressly reserves the right to either consider the candidate for a different position that is not subject to such restrictions, on whatever terms and conditions Stack AV shall establish in its sole discretion, or, in the alternative, decline to move forward with the candidate’s application.
