
Staff Site Reliability Engineer — Project Volcano [Remote]

Work from home Full-time role Hiring

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Staff Site Reliability Engineer — Project Volcano in the United States.

We are seeking a highly experienced Staff Site Reliability Engineer to serve as the founding reliability leader for a next-generation internal developer platform. In this role, you will define and implement the reliability foundation for a greenfield, high-impact initiative that supports on-demand environments, edge deployments, and core platform services at scale. You will work closely with engineering leadership and cross-functional teams to design resilient infrastructure, establish SRE practices from the ground up, and ensure system reliability across complex distributed systems.

This position offers significant technical ownership and visibility, with the opportunity to shape the operational backbone of a strategic platform initiative. It is ideal for a senior SRE who thrives in ambiguous environments and enjoys building foundational systems that enable large-scale engineering productivity.

Accountabilities

  • Define and own end-to-end reliability for the platform, including SLOs, SLIs, error budgets, incident response frameworks, and operational best practices across all services.
  • Architect and implement multi-region Kubernetes-based infrastructure supporting edge deployments, backend services, and platform control planes.
  • Build and evolve GitOps-driven CI/CD pipelines and deployment systems using tools such as ArgoCD, Helm, Terraform, and Terragrunt.
  • Design and operate scalable, multi-tenant data systems including PostgreSQL clusters, caching layers, and object storage with a focus on resilience and performance.
  • Establish observability standards from inception, including monitoring, logging, alerting, dashboards, and runbooks using tools such as Datadog, Prometheus, and Grafana.
  • Partner with engineering, product, and security teams to integrate reliability, compliance, and operational excellence into platform architecture decisions.
  • Lead incident management, postmortems, and on-call practices while fostering a blameless, high-learning engineering culture.
  • Mentor engineers across teams on SRE principles, reliability engineering practices, and operational maturity.
  • Evaluate and adopt emerging technologies relevant to edge computing, serverless platforms, and modern distributed infrastructure.

Requirements

  • Bachelor’s degree in Computer Science or equivalent practical experience, with a strong background in Staff- or Principal-level SRE or Platform Engineering roles.
  • Proven experience building SRE practices or platform engineering foundations for developer platforms, SaaS, or PaaS environments, ideally in greenfield or early-stage settings.
  • Deep expertise in Kubernetes, including multi-tenant cluster architecture, networking (CNI, ingress, service mesh), scaling, and security hardening.
  • Strong experience designing and operating large-scale distributed systems with high availability and reliability requirements.
  • Hands-on expertise with infrastructure-as-code and GitOps tooling such as Terraform, Terragrunt, Helm, and ArgoCD.
  • Experience building and maintaining observability stacks using Prometheus, Grafana, Datadog, or similar tools.
  • Strong knowledge of cloud infrastructure, networking, and data systems including PostgreSQL, Redis, and object storage technologies.
  • Experience in incident management, postmortems, on-call operations, and reliability governance practices.
  • Ability to collaborate effectively across engineering, product, and security teams in fast-moving, ambiguous environments.
  • Strong technical leadership, communication, and mentoring skills with a track record of influencing engineering culture.

Benefits

  • Competitive compensation with opportunities for long-term incentives.
  • High-impact role with ownership of a strategic, greenfield platform initiative.
  • Remote-friendly work environment within the United States.
  • Opportunity to shape reliability engineering standards for a next-generation developer platform.
  • Exposure to cutting-edge technologies in Kubernetes, edge computing, and distributed systems.
  • Strong engineering culture focused on learning, collaboration, and continuous improvement.
  • Opportunity to work closely with senior technical leadership and cross-functional teams.
  • Career growth in a high-visibility, foundational platform engineering role.

How Jobgether works: We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team. We appreciate your interest and wish you the best!

Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your pe
