Overview
Job title: Entry Level Site Reliability Engineer – SRE
Company: IBM
Job description: Introduction
At IBM, work is more than a job – it’s a calling: To build. To design. To code. To consult. To think along with clients and sell. To make markets. To invent. To collaborate. Not just to do something better, but to attempt things you’ve never thought possible. Are you ready to lead in this new era of technology and solve some of the world’s most challenging problems? If so, let’s talk.
Your Role and Responsibilities
A career in IBM Software means you’ll be part of a team that transforms our customers’ challenges into solutions.
Seeking new possibilities and always staying curious, we are a team dedicated to creating the world’s leading AI-powered, cloud-native software solutions for our customers. Our renowned legacy creates endless global opportunities for our IBMers, so the door is always open for those who want to grow their career.
IBM’s product and technology landscape includes Research, Software, and Infrastructure. Entering this domain positions you at the heart of IBM, where growth and innovation thrive.
As a Site Reliability Engineer, you will work in an agile, collaborative environment to build, deploy, configure, and maintain systems for the IBM client business. In this role, you will lead the problem resolution process for our clients, from analysis and troubleshooting to deploying the latest software updates and fixes.
Your primary responsibilities include:
- 24×7 Observability: Be part of a worldwide team that monitors the health of production systems and services around the clock, ensuring continuous reliability and optimal customer experience.
- Cross-Functional Troubleshooting: Collaborate with engineering teams to provide initial assessments and possible workarounds for production issues. Troubleshoot and resolve production issues effectively.
- Deployment and Configuration: Leverage Continuous Integration/Continuous Delivery (CI/CD) tools to deploy services and configuration changes at enterprise scale.
- Security and Compliance Implementation: Implement security measures that meet or exceed industry standards for regulations such as GDPR, SOC2, ISO 27001, PCI, HIPAA, and FBA.
- Maintenance and Support: Apply security patches and upgrades, and collaborate with Product support to resolve issues.
Required Technical and Professional Expertise
- Fluent in English
- You must be available to work in a hybrid format, coming to our office in the América Free Zone (XRQQ+97V, C. Domingueños, Heredia, Costa Rica) 3 times a week
- System Monitoring and Troubleshooting: 1-3 years of experience in monitoring/observability, issue response, and troubleshooting for optimal system performance.
- Automation Proficiency: 1-3 years of experience in automation for production environment changes, streamlining processes for efficiency, and reducing toil.
- Linux: 1-3 years of experience working with Linux operating systems.
- Operation and Support Experience: 1-3 years of experience in handling day-to-day operations, alert management, incident support, migration tasks, and break-fix support.
Preferred Technical and Professional Expertise
- Kubernetes/OpenShift: Knowledge of or experience with Kubernetes/OpenShift environments.
- Automation/Scripting: Knowledge of or experience with Ansible, Python, Terraform, and CI/CD tools such as Jenkins, IBM Continuous Delivery, and ArgoCD.
- Monitoring/Observability: Knowledge of or experience with crafting alerts and dashboards using tools such as Instana, New Relic, and Grafana/Prometheus.
- DBA: Interest or experience configuring and maintaining SQL, NoSQL, and data streaming technologies (e.g. PostgreSQL, CouchDB, Redis, Kafka, Spark, etc.).
Location: Heredia
Job date: Sat, 14 Sep 2024 22:59:47 GMT