Job Description
As a Site Reliability Engineer (SRE) you will ensure our customers get the best quality of service and up-time we can give them.
- Identify where we can expect IT failures, both in our own systems and in those we depend upon, and how we can tolerate them.
- Take responsibility for the availability, performance, monitoring, incident response and general service management of the platforms and services that our company owns and runs.
- Work closely with our developers and architects to build and run services and systems that respond consistently to failures by degrading gracefully. Help ensure they consider operational deliverables such as monitoring, logging and runbooks, which can be make-or-break when diagnosing and fixing critical issues.
- Ensure the systems and applications we launch remain available, reliable and efficient at accomplishing their duties, even as those duties scale and evolve.
- Be involved in every part of our site, from product conception and development to deployment, troubleshooting and analysis.
- Design, build and automate tools and processes to maintain and improve scalability, availability and performance across our technology areas. In addition, build, integrate and run tools to inject, predict and identify infrastructure and service failures on an ongoing basis to help optimise our sites.
- Work independently, meet deadlines, and handle out-of-office on-call support.
Qualifications
- Experience with AWS services, Docker/Kubernetes, and CI/CD pipelines (e.g., Jenkins, GitHub Actions)
- Proficiency in infrastructure as code (preferably Terraform) and configuration management (preferably Ansible)
- Strong UNIX/Linux systems administration and networking background
- Experience with database administration (MS SQL, AWS RDS, Elasticsearch) and web technologies (Apache/Nginx)
- Programming skills in Python, Bash, or Java
- Familiarity with monitoring tools (e.g., Splunk, New Relic, AWS CloudWatch)
- Understanding of IT Service Management (ITIL) in an agile DevOps environment
- Degree or equivalent qualification desirable (ideally in Computer Science, Mathematics, Engineering or a similar discipline)
- Excellent problem-solving skills and passion for automation and quality
- Strong communication skills and ability to work effectively in a team environment
- Experience implementing scalable software systems and platforms
Additional Information
Join us to unlock benefits and opportunities that will boost your career journey in a vibrant, inclusive and fulfilling work environment – so you can #BeYourself
Wellbeing@Rank is important... From hybrid working and colleague support networks to menopause support and weekly PepTalks, we’re here for you.
We’ll also invest in your growth by providing development opportunities, leadership training and cutting-edge industry certifications so you have the tools and resources to help you work, win and grow with us.
Immerse yourself in new cultures and gain international exposure through our global business. Collaborate with colleagues from around the globe.
From pensions to bonus schemes, and private medical insurance to life insurance – we've got you covered.
*Our benefits vary by brand and/or location. Please have a chat with your local Talent Acquisition specialist to find out what’s in place in your location.
The Rank Group are committed to being an inclusive employer, ensuring that we better understand and meet the needs and requirements of our candidates and customers.
We aim to do this by facilitating fair and equal access to our services. If you require a reasonable adjustment to be made, please reach out to let us know ahead of your interview.