Site Reliability Engineer at Acquia Inc. in Toronto, Canada

Acquia is transforming the digital strategies of companies all over the world with our open cloud platform. We are passionate and relentlessly committed to helping our clients create digital experiences that are more relevant, personalized, and built for a fast-changing, always-connected, mobile-first world. Headquartered in the US, we have been named one of North America’s fastest-growing software companies by Deloitte and Inc. Magazine, rated a leader by the analyst community, and named one of the Best Places to Work by the Boston Business Journal. We are Acquia. We are building for the future of the web, and we want you to be a part of it.

Be a part of a small team of Site Reliability Engineers who work within Engineering to build and operate Acquia’s PaaS/SaaS products such as Acquia Cloud, Content Hub and Lift. The successful candidate will have a tremendous ability to effect change while working on deep technical challenges using the latest cloud technology from Amazon Web Services.

Responsibilities

  • Work with the team to implement highly available and scalable architectures for core and third-party components of Acquia’s PaaS/SaaS products;
  • Solve availability/performance problems and build software-based solutions to prevent recurrences;
  • Guide and implement build pipelines and automated deployments;
  • Implement metrics, monitoring, and incident response processes;
  • Implement change management and capacity planning processes;
  • Initiate automated production deployments for patches and features;
  • Champion the needs of Operations and the Customer Support team;
  • Be aware of operations-related issues affecting Acquia’s PaaS/SaaS systems;
  • Monitor the level of manual effort and signal when it grows;
  • Measure availability metrics and signal when they fall below the SLA;
  • Share a 24/7 on-call rotation with development engineers;
  • Contribute as part of a Scrum team to maintain a deep understanding of system functionality and architecture, with a primary focus on operational aspects of the service (availability, performance, change management, emergency response, capacity planning, etc.).

Requirements & Qualifications

Requirements:

  • BS in Computer Science or a comparable field of study, or equivalent practical experience.
  • Experience working with one or more of: Ruby, PHP, Java, JavaScript, Go, Python.
  • Experience with Unix/Linux systems administration using the CLI.
  • Fundamental understanding of TCP/UDP networking concepts.
  • Solid oral and written communication skills.

Preferred Qualifications:

  • Experience building systems on cloud technology (AWS, GCE, Rackspace, OpenStack).
  • Understanding of Software Development Life Cycle, Test Driven Development, Continuous Integration, and Continuous Delivery.
  • Experience gathering and analysing application and host performance metrics.