Zeal Holdings

Senior Site Reliability Engineer (SRE)

Posted 22 Days Ago
3 Locations
Senior level
The Senior Site Reliability Engineer (SRE) at Zeal Group will enhance the reliability and performance of infrastructure by designing robust systems, optimizing CI/CD pipelines, automating tasks, and managing incident responses. This role includes mentoring engineers and collaborating closely with development and DevOps teams to ensure high availability and fault tolerance of services.
The summary above was generated by AI

Description

About Zeal Group

Zeal Group is an award-winning FinTech organisation offering a variety of products. Founded in 2017, we have grown to a team of 700+ employees across the globe 🌎

Our offices and presence are spread across Europe, Asia, North & South Africa, the Middle East and South America, with our Technology hub located in Cyprus 🚀

We are a product- and people-focused company that is passionate about growth, innovative technology, and collaboration 🙌🏼

About the Role

We are looking for a Senior Site Reliability Engineer (SRE) to join our engineering team and help drive the reliability, scalability, and performance of our infrastructure. As a Senior SRE, you will play a key role in architecting and maintaining highly available systems, optimizing our CI/CD pipelines, automating repetitive tasks, and ensuring seamless deployment and observability for our services. Your contributions will have a direct impact on our development velocity, service uptime, and overall customer satisfaction. Our SRE team is fully responsible for our cloud infrastructure, including its fault tolerance and performance. A separate DevOps team supports the developers and their pipelines.

Responsibilities:

  • System Design & Architecture: Collaborate with software engineers and DevOps to design and implement resilient and scalable systems, focusing on high availability, fault tolerance, and disaster recovery.
  • Automation & Infrastructure as Code: Develop and maintain infrastructure automation scripts and tools using Terraform, Ansible, or similar technologies, ensuring reproducibility and consistency across environments.
  • CI/CD Pipeline Optimization: Build and enhance CI/CD pipelines to accelerate deployment speed and reduce time to market, including implementing blue-green or canary deployments where applicable.
  • Monitoring, Alerting & Logging: Create, manage, and refine monitoring dashboards and alerting systems using tools like Prometheus, Grafana, and Elasticsearch to proactively detect and address potential issues before they impact customers.
  • Incident Management & Troubleshooting: Lead incident response efforts, perform root cause analysis, and implement long-term fixes to prevent recurrence, ensuring a fast, reliable response to production issues.
  • Performance Tuning: Conduct regular performance testing and tuning, identifying bottlenecks in infrastructure performance and system resources.
  • Mentorship & Leadership: Guide and mentor other team members, sharing best practices and helping to build a culture of reliability and performance within the engineering organization.

Requirements

  • 5+ years of experience in SRE, DevOps, or a similar role, with a proven track record of managing large-scale, distributed systems.
  • Strong knowledge of Linux/Unix systems and networking fundamentals.
  • Proficiency in at least one programming or scripting language (e.g., Python, Go, Bash).
  • Experience with containerization and orchestration (Docker, Kubernetes).
  • Hands-on experience with infrastructure as code (IaC) tools such as Terraform and Ansible.
  • Familiarity with monitoring, logging, and alerting tools (e.g., Prometheus, Grafana, ELK).
  • Knowledge of tools for storing and delivering secrets to microservices (HashiCorp Vault).
  • Cloud Expertise: Experience with a cloud platform (GCP) and an understanding of cloud-native architectures.
  • Problem-Solving Skills: Strong analytical and problem-solving skills, with a focus on building automated, scalable solutions to complex challenges.
  • Collaboration: Ability to work cross-functionally with engineering, product, and support teams, with excellent communication and collaboration skills.

Technology Stack:

  • CDN providers: Akamai, EdgeNext
  • Cloud Platform: GCP
  • Orchestration: Kubernetes
  • CI/CD: GitLab, ArgoCD
  • IaC: Terraform, Ansible
  • Event streams: Kafka, RabbitMQ
  • Logging: Elasticsearch, Kibana, Filebeat, Logstash
  • Monitoring: Prometheus/VictoriaMetrics, Grafana, AlertManager, PagerDuty
  • Secret Management: HashiCorp Vault, External Secrets Operator
  • Artifact repository: Sonatype Nexus
  • Object storage: GCS, MinIO

Top Skills

Bash
Go
Python
