Synechron
Site Reliability Engineer (SRE) with AWS, Oracle, and Automation Expertise
Job Summary
Synechron is seeking an experienced Site Reliability Engineer (SRE) to enhance the stability, resilience, and operational maturity of our critical Financial Crime and Transaction Monitoring platforms. This role is vital to embedding SRE best practices across observability, automation, incident management, and production support. The successful candidate will proactively manage service health, reduce operational risk, and support regulatory-critical services, enabling the organization to deliver reliable, scalable, and compliant solutions aligned with business objectives.
Software Requirements
Required:
Strong understanding and hands-on experience managing production-grade systems with high reliability and availability requirements
Expertise in SRE principles, monitoring, logging, and alerting, including defining and tuning SLOs/SLAs
Proficiency with AWS services including EC2, S3, RDS, VPC, IAM, and CloudWatch (latest versions or equivalents)
Linux system administration and troubleshooting skills for enterprise environments
Experience with Oracle databases, including performance tuning, RAC, or RMAN in large data environments
Automation scripting skills using Python and Shell (Bash/sh) for operational automation
Experience with monitoring tools such as Prometheus, Grafana, ELK/EFK, and PagerDuty
Familiarity with CI/CD tools like Jenkins, GitLab CI, or AWS CodePipeline
Preferred:
Knowledge of OFSAA, Oracle Rules Engine, or ML-enabled platform support (e.g., TRACE)
Infrastructure-as-Code tools such as CloudFormation or Terraform
Experience supporting high-performance Oracle environments (performance tuning, RAC, RMAN)
Exposure to cloud-native and containerized environments (Kubernetes, Docker)
Overall Responsibilities
Improve the reliability, availability, and recoverability of Financial Crime and Transaction Monitoring platforms.
Define, monitor, and manage SLIs/SLOs to proactively ensure service health and detect anomalies.
Provide Level 1 and Level 2 support for AWS and Oracle-based platforms, handling incident resolution and root cause analysis.
Build and sustain automation solutions for monitoring, logging, alerting, and operational workflows to reduce manual toil.
Lead incident response activities, conduct post-incident reviews, and implement preventative measures.
Develop, operate, and enhance CI/CD pipelines and infrastructure automation across environments.
Collaborate with engineering teams to design scalable, resilient, and secure systems; participate in capacity planning and performance tuning.
Support deployment, patching, and configuration changes, ensuring compliance with policies and standards.
Maintain comprehensive documentation of operational procedures, configurations, and incident resolutions.
Lead continuous process improvements to enhance system reliability, operational efficiency, and compliance adherence.
Technical Skills (By Category)
Systems & Support (Essential):
Enterprise-level system operation and support for AWS and Oracle environments
Linux system administration and troubleshooting
Incident management and escalation procedures
Monitoring & Automation (Essential):
Monitoring and alerting using Prometheus, Grafana, ELK/EFK, CloudWatch
Automation scripting with Python and Shell for operational tasks and event handling
Cloud & Infrastructure (Preferred):
Cloud deployment, scaling, and management (AWS, Azure, GCP)
Infrastructure-as-Code (Terraform, CloudFormation)
Databases/Data Management (Essential):
Oracle database management, performance tuning, and recovery
Data extraction and validation for high-volume transactional data
Development Tools & Methodologies (Essential):
Jenkins, GitLab CI, AWS CodePipeline for CI/CD pipelines
Version control with Git
Experience Requirements
Minimum of 8 years supporting high-availability, mission-critical enterprise systems, particularly in financial services or comparable regulated environments.
Proven experience supporting Oracle databases, Oracle RAC, or RMAN in a high-volume context.
Strong background in enterprise support for Financial Crime and Transaction Monitoring platforms.
Demonstrated ability to lead operational support teams, manage incident escalations, and implement automation solutions.
Experience in cloud-native architecture, infrastructure automation, and observability tools.
Experience providing support under regulatory and audit constraints is preferred.
Day-to-Day Activities
Monitor platform dashboards, logs, and alerts to ensure system health and performance.
Proactively troubleshoot and resolve operational, performance, and security incidents.
Conduct root cause analysis, document incident reports, and lead corrective action plans.
Automate routine operational tasks, alerts, and workflows to improve efficiency.
Collaborate with platform engineers, developers, and security teams on change management and capacity planning.
Participate in on-call rotations, incident reviews, and readiness exercises.
Continuously evaluate and recommend tools, procedures, and automation that improve reliability and reduce manual intervention.
Maintain detailed documentation of configurations, procedures, and lessons learned.
Qualifications
Bachelor’s degree in Computer Science, Engineering, or a related discipline.
8+ years supporting enterprise-scale, high-availability systems with a focus on operational excellence.
Experience supporting regulatory-critical platforms in financial services, especially in Fraud, Risk, or Transaction Monitoring.
Certifications in cloud platforms (AWS Certified Solutions Architect, Azure) and SRE foundations (Google SRE or equivalent) are advantageous.
Proven track record of automation, incident management, and operational improvements.
Professional Competencies
Critical thinking and analytical skills to diagnose and resolve complex operational issues.
Leadership and team management skills to guide operational teams and support team development.
Effective communication for stakeholder reporting, incident updates, and cross-team collaboration.
Ability to work under pressure, prioritize multiple tasks, and meet strict SLAs.
Adaptability to evolving technology landscapes and regulatory requirements.
Focus on continuous improvement, automation, and operational excellence.
SYNECHRON’S DIVERSITY & INCLUSION STATEMENT
Diversity & Inclusion are fundamental to our culture, and Synechron is proud to be an equal opportunity workplace and an affirmative action employer. Our Diversity, Equity, and Inclusion (DEI) initiative ‘Same Difference’ is committed to fostering an inclusive culture – promoting equality, diversity and an environment that is respectful to all. As a global company, we strongly believe that a diverse workforce helps build stronger, more successful businesses. We encourage applicants of all backgrounds, races, ethnicities, religions, ages, marital statuses, genders, sexual orientations, and disabilities to apply. We empower our global workforce by offering flexible workplace arrangements, mentoring, internal mobility, learning and development programs, and more.
All employment decisions at Synechron are based on business needs, job requirements and individual qualifications, without regard to the applicant’s gender, gender identity, sexual orientation, race, ethnicity, disabled or veteran status, or any other characteristic protected by law.