About this role
Skills and Responsibilities:
• Role: SRE / DevOps Engineer focused on reliability, automation, and observability.
• Core Skill: Strong expertise in Dynatrace (APM, Davis AI, RUM, infra monitoring).
• AI Observability: Use Davis AI for root cause analysis, anomaly detection, predictive insights, and alert optimization.
• Monitoring Setup: Manage OneAgent, Smartscape, distributed tracing, SLIs/SLOs, dashboards, and alerts.
• Automation: Build and maintain Ansible playbooks for deployments, provisioning, patching, and remediation.
• Cloud Expertise: Hands-on with AWS (primary) and Azure, including serverless and cloud-native architectures.
• Infrastructure as Code: Use Terraform, CloudFormation, or CDK for provisioning.
• CI/CD & Integration: Implement pipelines using Jenkins, GitHub Actions, or GitLab CI, and integrate monitoring tools.
• Containers & Reliability: Work with Docker, Kubernetes, and ECS/AKS, ensuring HA and DR and supporting incident response.
• Tech Stack: Proficient in Python (boto3) and Bash, plus tools such as CloudWatch and Azure Monitor; exposure to Prometheus, Grafana, and ELK is optional.