Site Reliability Engineer/DevOps Engineer hired

Description & Responsibilities

The Digital Transformation Team is looking for an expert in Site Reliability/Production Engineering, who will oversee and support the development of the different digital platforms coordinated by the Team.

You will be responsible for:

  • Managing the life cycle of the infrastructure services of application platforms (development, production and decommissioning)
  • Planning platform monitoring and identifying the metrics needed to guarantee a high level of infrastructure reliability
  • Designing and implementing cloud-based infrastructures based on stakeholder requirements
  • Producing technical specifications to design cloud infrastructures
  • Automating processes to improve the scalability and reliability of the application platforms
  • Carrying out security hardening of the cloud infrastructures
  • Identifying and prioritizing the technical debt to be eliminated
  • Identifying and proposing alternative technologies to develop implementations with a higher degree of scalability
  • Coordinating activities to solve complex technical issues
  • Providing reliable resource plans for the development of the infrastructures
  • Developing automated tests to validate the source code
  • Collaborating with colleagues and stakeholders to develop and maintain disaster recovery processes
  • Writing postmortem documents and technical reports about issues and malfunctions
  • Promoting and sharing the DevOps culture in the Team and in the public sector community

We’re looking for a talented professional who is passionate about developing and managing complex IT infrastructures, with a proven track record in the development of digital platforms and a strong scientific and technical background.

Key Qualifications

  • Good knowledge of Linux, IT security practices and networking fundamentals
  • Practical experience with public cloud platforms (Google Cloud, Azure or AWS)
  • Practical experience with open source cloud technologies, specifically OpenStack
  • Work experience in agile contexts
  • Solid experience in coding and scripting (Python/Bash)
  • Experience with container orchestration technologies such as Kubernetes or Docker Swarm
  • Experience with modern logging systems such as Elasticsearch, Graylog or Fluentd
  • Solid experience with monitoring, using technologies such as Graphite or Prometheus
  • Familiarity with the principles and philosophy of DevOps, with a demonstrated aptitude for reducing the operational burden of running systems through automation
  • Experience in the design and development of scalable and robust software architectures
  • Work experience with standard project management tools (Gantt charts) and with agile ones (Scrum or Kanban)
  • Motivated, innovation oriented, curious and open-minded attitude

Education

  • MS in Computer Science or a related field with at least 3 years of experience in the IT industry in Site Reliability/Production Engineering or, in the absence of a degree, 5+ years of experience in the IT industry in Site Reliability/Production Engineering
  • Proficiency in English

Last update: 01/20/2020