Site Reliability Engineer/DevOps Engineer - Digital Transformation Team

Site Reliability Engineer/DevOps Engineer hired

Description & Responsibilities

The Digital Transformation Team is looking for an expert in Site Reliability/Production Engineering, who will oversee and support the development of the different digital platforms coordinated by the Team.

You will be responsible for:

Managing the life cycle of the infrastructure services of application platforms (development, production and decommissioning)
Planning platform monitoring and identifying the relevant metrics to guarantee a high level of infrastructure reliability
Designing and implementing cloud-based infrastructures based on stakeholder requirements
Producing technical specifications to design cloud infrastructures
Automating processes to improve the scalability and reliability of the application platforms
Carrying out security hardening of the cloud infrastructures
Identifying and prioritizing technical debt to be eliminated
Identifying and proposing alternative technologies that enable more scalable implementations
Coordinating activities to solve complex technical issues
Providing reliable resource plans for the development of the infrastructures
Developing automated tests to validate the source code
Collaborating with colleagues and stakeholders to develop and maintain disaster recovery processes
Writing postmortem documents and technical reports about issues and malfunctions
Promoting and sharing the DevOps culture in the Team and in the public sector community

We’re looking for a talented professional who is passionate about developing and managing complex IT infrastructures, with a proven track record in the development of digital platforms and a strong scientific and technical background.

Key Qualifications

Good knowledge of Linux, IT security practices and networking fundamentals
Practical experience with public cloud platforms (Google Cloud, Azure or AWS)
Practical experience with open source cloud technologies, specifically OpenStack
Work experience in agile contexts
Solid experience in coding and scripting (Python/Bash)
Experience with container orchestration technologies such as Kubernetes or Docker Swarm
Experience with modern logging systems such as Elasticsearch, Graylog or Fluentd
Solid experience with monitoring, using technologies such as Graphite or Prometheus
Familiarity with the principles and philosophy of DevOps, with a strong aptitude for reducing the operational burden on systems through the implementation of automated processes
Experience in the design and development of scalable and robust software architectures
Work experience with standard project management tools (Gantt charts) and with agile ones (Scrum or Kanban)
A motivated, innovation-oriented, curious and open-minded attitude

Education

MS in Computer Science or a related field with at least 3 years of experience in the IT industry in Site Reliability/Production Engineering or, in the absence of a degree, 5+ years of experience in the IT industry in Site Reliability/Production Engineering
Proficiency in English