Sr. Site Reliability Engineer at Aha!
Location: North America; 100% Remote
Candidates must be located in North America.
Aha! is a very different high-growth SaaS company. We are the world’s #1 product development software and help product development teams build what customers love. More than 5,000 enterprises use our suite of products, which includes Aha! Create, Aha! Ideas, Aha! Roadmaps, and Aha! Develop. And they rely on our training programs at Aha! Academy to become product development experts. We are self-funded, highly profitable, always distributed, and have no sales team. Being an always-remote company means we can hire intrinsically motivated people who love to learn, support their teammates, and want to work from where they are happiest.
Aha! engineering is a mid-sized, fully remote team that is highly productive. We are centered around North American time zones so we can collaborate during the workday.
- We move quickly: We ship code multiple times a day. We believe in getting new features in front of customers and iteratively improving as we learn what works and what does not.
- We collaborate: We each bring unique experiences and skills to the table. Working together to share that knowledge benefits the entire team and helps us produce the best results for our customers.
- We value product over process: We want our team to have time to focus on solving complex challenges. We aim to minimize the overhead introduced by heavyweight processes and excessive meetings.
- We enjoy: We like what we do. And we want you to love your job too. Learn more about The Responsive Method, our company values, and the generous benefits we offer.
Our web application is a single-instance, multi-tenant Ruby on Rails monolith supported by Postgres (database), Redis (background jobs), and memcached (Rails caching). We also run a Node.js webserver to support collaborative editing and real-time updates. Our application is hosted on Amazon Web Services and architected with ECS for reproducibility and scalability.
We embrace new technologies that help us deliver a lovable product, but we also remain cognizant of the maintenance overhead that a new library or platform brings. We solve the problems in front of us rather than prematurely optimizing to address issues that may never materialize.
We do most of our planning and collaboration in Aha! Roadmaps. We built Aha! Develop so software engineers and their teams could take advantage of those same rich features. We use Slack for messaging and Zoom for video calls. (Email? Rarely.)
- You have experience developing SaaS products in Ruby on Rails
- You have 4+ years of experience operating a cloud-based SaaS product on AWS using Terraform
- You have a passion for fast, efficient, and reliable services
- You have experience solving Linux, networking, and database-level problems
- You are calm under pressure and respond methodically to anomalies or outages
We believe that being a kind person who elevates the rest of the team is just as valuable as writing great code. You have strong problem-solving skills and experience working on important functionality for a cloud-based product. You are humble, eager to learn, and always willing to help others learn as well. You want to work with people who enjoy picking up a problem and solving it, regardless of the technologies and techniques involved.
Your work at Aha!
Site Reliability Engineers at Aha! ensure the platform remains stable, reliable, and secure for the world’s biggest and most innovative companies. You will implement significant operations architectural features, contribute to supporting product developers, and consult with product developers whenever there is a concern about performance or scalability. Day to day, this will look like:
- Setting and monitoring SLOs for the organization, working with product and engineering teams to ensure they are met
- Building and maintaining monitoring, observability, and autoscaling solutions for our own services, as well as those we purchase from AWS
- Writing and maintaining production runbooks and operations documentation
- Providing on-call operational support for production on a rotation
- Assisting platform engineers in building new infrastructure services, and consulting with application engineers to help build fast, reliable features
If the Sr. Site Reliability Engineer role sounds appealing, we would love to hear from you. (A real human reviews every application.)
Grow with us
Everyone deserves to reach their fullest potential. We know that when we do work that matters with people we care about in a high-growth environment, we feel engaged and alive. And our goal is to help you do just that. We offer all the benefits you would expect and more, including profit sharing. The specific benefits listed below reflect what we offer U.S.-based hires. We also do our best to extend identical benefits to international teammates.
- Generous salary with annual profit sharing for all
- Medical, dental, and vision plans — for many teammates, we cover 100 percent of the premiums
- Up to 200 hours of paid time off a year to spend however you want
- 30 to 90 days of paid parental leave and five to 10 days of paid care and bereavement leave
- Up to $1,000 annually for third-party education, along with paid time off to immerse yourself in learning
- Aha! contributes a percentage of your total compensation each year towards your retirement