Senior Engineer, Observability at Auth0
United States of America
We are looking for a Senior Engineer to join the Observability area of our Infrastructure team to help us take it to the next level.
What does the Infrastructure team do?
We value ownership and innovation, and we build our teams with that in mind. We want each team to be responsible and accountable for what they ship. We also don't want to reinvent the wheel every time, so we try to get alignment in terms of practices and technologies. Our philosophy for achieving this is to rely on excellent tooling and automation over policies and processes. We aim to provide internal tools and services that other teams want to use to make their lives easier when shipping their features.
Today the Infrastructure Services team provides the following services to other engineering teams:
Storage Services: MongoDB, ElasticSearch, Postgres, Dynamo, backups and restores, etc.
Networking Services: VPCs, Load Balancers, Service Discovery, etc.
Observability Services: EKK (instead of Logstash we use AWS Kinesis), Datadog, our logging/metrics SDK, etc.
Release Management: CI, CD, feature flags
As we continue to grow, we are creating new teams that own more specific parts of Auth0. To encourage and simplify operational ownership, we will be splitting our existing services into smaller, more decoupled ones that individual teams will own. At the same time, we want newly formed teams to be able to go from development to production quickly and reliably, following recommended practices.

What are we doing next?

From an observability perspective, having multiple teams and multiple services means two things:
      • Educating engineers about what to log, measure and alert on.
      • Providing easy ways to understand the state of the system at a given point in time, including the ability to trace requests across multiple services.
In our case, this means:
    • Providing request tracing capabilities across different components (think Zipkin, AWS X-Ray).
    • Maintaining our current EKK stack (or any other tool and infrastructure) that allows teams to search their logs when troubleshooting issues.
    • Having teams learn about which metrics and events are important and why, through guidance, documentation and internal talks.
    • Providing a single platform to create dashboards and alerts. Today, in our public cloud we do this with our logging/metrics SDK, Datadog and PagerDuty. For our appliances we use InfluxDB, Telegraf and PagerDuty.
    • Providing tools and a platform for measuring and reporting availability for individual services.

What will you be doing?

You will be an engineer on this team. This means:
    • Designing and implementing features and bug fixes for the services the team owns
    • Optimizing performance and availability of our EKK stack to ensure everything is fast and ready for other teams
    • Working with other engineering teams to help them measure the health and performance of their systems with custom metrics and tools they can use to generate their own alerts, dashboards, and reports
    • Helping ensure we can present availability and health metrics for different purposes (internal evaluations, change regression, customer dashboards, and more)
    • Being part of the Infrastructure team on-call rotation

You'd be a good fit if:

      • You have a strong knowledge of a variety of infrastructure and general development topics, technologies, and trends
      • You have worked in an environment that runs multiple services owned by different teams, with multiple deployments a day to services handling a large number of transactions per second
      • You have worked on observability-related projects before, or have a strong interest in that area
      • You are a great communicator
      • You enjoy thinking about how to make life simpler for other engineers
      • You love and advocate for customers
      • You are in a timezone with at least 2 hours of overlap with 11am-8pm UTC (most current team members are in UTC-3 and are generally early risers)
It’s not expected that a single candidate has expertise in all these areas. We’re looking for professional engineers who can quickly learn and adapt as our systems and situation change, rather than candidates with a rigid skill set.
You can learn more about our hiring process here.
Auth0 is an Equal Employment Opportunity employer. Auth0 conducts all employment-related activities without regard to race, religion, color, national origin, age, sex, marital status, sexual orientation, disability, citizenship status, genetics, or status as a Vietnam-era special disabled and other covered veteran status, or any other characteristic protected by law.
Auth0 participates in E-Verify and will provide the federal government with your Form I-9 information to confirm that you are authorized to work in the U.S.