Site Reliability Engineer

Reference: 15224 - Frisco, Texas, US

Site Reliability Engineer (SRE)


Spark is the Gearbox Software team behind SHiFT, our online services platform that serves millions of users every month across multiple gaming franchises. SHiFT is our one-stop-shop gaming services platform responsible for dozens of features gamers around the world depend on every day, from cross-play to friend presence, citizen science, dedicated server hosting, matchmaking, and much more. Spark is passionate about delivering features for our gaming partners that are relevant, dependable, and secure. We take pride in the stability of our platform and are always looking for ways to take that stability to new levels. Our team is agile with a commitment to seeing features go from desktop to production in minutes, not days.


To further drive our vision of premier stability and rapid feature delivery, we are looking for a mid-level, security-focused DevOps Engineer to join our team. As a DevOps Engineer on Spark, you will be responsible for assisting in the design and implementation of flexible cloud architectures with an automation-first emphasis. You will be challenged along the way to adopt the shared mentality that security is paramount and to push for that philosophy to be actualized throughout the platform. You should be comfortable integrating multiple technologies to form a single, coherent view of platform security awareness and compliance across an AWS Organization comprising up to hundreds of accounts. You should have professional expertise in best practices and familiarity with recent trends in cloud security. When challenged with designing and implementing a new feature in the infrastructure, you are confident in both your design and your implementation, and ready to defend them in a room with other technical minds. You also recognize that the best designs come from collaboration, not dictation, and are willing to bring implementations to the table with an open mind.

Typical Day

TL;DR: You will be deeply immersed in AWS and Terraform; you will design and develop solutions that are thoroughly automated and compliant with today's security guidelines.

Your days will be filled with building solutions to technical challenges in security, automation, and governance for our AWS cloud platform. You will evangelize governance best practices, call out gaps in automation, and be immensely concerned with security compliance. You will help manage and orchestrate each of these by leaning heavily on technologies like Terraform, Bash, and a wide variety of AWS organizational governance services. On any given day, you should expect to spend at least 75% of your time actively developing new solutions; the rest will typically be a mixture of reviewing code from your colleagues, helping define new security policies, participating in design meetings, responding to ad-hoc requests, documentation, and self-development.

This position will require you to carry a company-paid mobile device and participate in 24/7 on-call rotations alongside your engineering colleagues. Don't worry, though: our on-call experience doesn't suck.

Core Responsibilities:

  • Be a trusted voice in the evangelism of DevSecOps throughout the team, promoting both security and automation as being of equal importance from prototype to production
  • Champion discussions that define appropriate security policies throughout the platform
  • Collaborate with our growing team of engineers, helping to build best practices in automation
  • Design and develop software solutions to improve the security and automation of our cloud platform
  • Develop tooling that aids other engineers in AWS account onboarding and compliance
  • Mentor junior engineers as needed
  • Participate in after-hours on-call support rotations

Must Have (the non-negotiable parts):

  • Expertise in AWS organizational management and orchestration (AWS Organizations, Control Tower, Service Catalog, StackSets, Security Hub, etc.)
  • Expertise in AWS security management and best practices (IAM, guardrails, SCPs, GuardDuty, Security Hub, etc.)
  • Proficiency in Terraform and/or CloudFormation
  • Minimum of 3 years of extensive hands-on experience with a wide variety of AWS technologies in a professional setting
  • Professional development experience with at least one of: Go, Python
  • Excellent teamwork skills, flexibility, and ability to handle multiple tasks
  • Comfortable communicator, able to clearly detail designs and implementations on an individual level and in large group settings

Should Have (some wiggle room):

  • Experience with containers in a professional setting, preferably Docker
  • Experience in disciplined software engineering with a focus on development and implementation of highly scalable/available applications
  • Understanding of observability stack management (monitoring, alerting, structured logging, APM, etc.)
  • Hands-on experience developing and maintaining CI/CD pipelines, preferably in Git/GitLab
  • Understanding of RESTful and WebSocket-based APIs
  • Bachelor's degree in computer science, related field, or equivalent training and professional experience

Now you're just showing off:

  • Any verifiable security certification (ISC2, AWS Certified Security - Specialty, ethical hacking, Security+, etc.)
  • Familiarity with OpenTelemetry / OpenSLO
  • Familiarity with Datadog / Honeycomb
  • Familiarity with Atlassian products (Opsgenie, Jira, Confluence)
  • Experience working with developers in an agile environment
  • Experience in the games industry, preferably launching multiple online-enabled AAAs
  • Knowledge about Gearbox-owned IPs