As Ascend’s first Site Reliability Engineer, you will have the responsibility and opportunity to design, construct, and scale our infrastructure services. You will establish, own, and evangelize infrastructure and engineering best practices. You will maintain and progressively improve the availability, latency, and reliability of our platform. You will draw on your experience to develop creative solutions to the obstacles Ascend faces, and you will work proactively to prevent system failures.
What you’ll do:
Own production Kubernetes infrastructure, while designing to support hundreds of individual deployments
Design the systems we will use to deploy, monitor and scale software in production
Advocate best practices around logging, monitoring, build and test systems
Own the performance and reliability of infrastructure services through load and failure testing
Write high-quality code (Python, Scala and Go) and author tools that automate everything you can
Troubleshoot issues across complex stacks
Handle system configuration, infrastructure provisioning, deployments, monitoring, and incident response in production environments
You are willing to “carry the pager” but prefer building systems stable enough that you rarely get paged
What we look for:
Must have a strong understanding of building and managing large-scale systems and application architectures
Concrete understanding of systems and application design, and their operational impacts
Expert understanding of system performance and monitoring
Strong understanding of containers and container orchestration; experience with Kubernetes a plus
Experience with infrastructure services such as Docker, Hadoop, Kafka, Spark, Elasticsearch, MongoDB, and Cassandra
Strong experience with Python; experience in Scala, Go, or Node.js a plus
7+ years in DevOps/SRE roles in large-scale environments
BS or MS degree in Computer Science or related field
Experience operating a production service on AWS and GCP
Experience with security in the cloud: intrusion detection, penetration testing, and vulnerability scanning
Good working knowledge of build automation and continuous integration; familiarity with Bazel and CircleCI a plus
In addition, there are some characteristics that we look for in all members of the Ascend team:
Polymaths: we believe the best teams are those composed of individuals who have demonstrated excellence in a variety of technical areas, love sharing knowledge with their team, and actively seek out environments where they are not the smartest person in the room.
Context seekers: we love the 5 Whys technique. Why? The more you ask, the more context you have about our market, our company, our mission, and our product. Why (do we care)? With more context, you make better-informed decisions by yourself. Why (is this important)? It means less process, less management overhead, and most importantly, more personal growth for you.
Impact thinkers: the biggest challenge we face each day is how best to invest our time. The best people we've worked with have this remarkable ability to take a step back and identify how they, and the company, can have the largest impact. They are the ones who view the design of the wheel as sufficient, and passionately pursue the innovations that most profoundly affect our users.
Foundation builders: every team, when designing technology, tools, or process, must decide for which horizon to build. Do you design with the next few months, years, or decades in mind? We believe great engineers build for horizon n+1, while designing for n+2. Great engineers find that incredible balance between simplicity and extensibility, initial and recurring time investments.
What we offer:
The usual: free food, lots of equity, fancy Palo Alto office, etc... you know, the usual. We don't believe this is why you should join a company, however. We do these things simply to take good care of the people who take such great care of our company.
The unique: day 1. Day 1 is when you look at everything done before, leverage the exceptional innovations of the past, and rethink the rest. At Ascend, you get the opportunity to design things from the ground up and set the direction for years to come.
How we build:
Environment: we run everything in Docker images running on Kubernetes, within Google Container Engine (GKE). Why? Because after getting over the initial learning curve, we've found Kubernetes to be incredibly powerful, with a vibrant community that is driving the product forward at an incredible pace.
Testing & Deploy: our continuous deployment pipeline gates all changes on 100% test pass rate. Why? We believe technical debt is a slippery slope and people always underestimate the impact of shortcuts. It really is better to do things right from the beginning.
Services & APIs: we like both TDD and DDD (Doc Driven Development). We like API Driven Development (acronym TBD) even more. Why? In early-stage companies things change very quickly. Leveraging environments like Kubernetes allows us to easily spin up services that expose clean APIs for particular purposes. We like a microservice approach for its modularity, speed of development, and maintainability. Each of our services exposes REST and/or gRPC APIs that adhere to a well-thought-out spec, ensuring that if we need to change implementation strategies, the work is well contained.
Comms: Slack. It's really just that good.