Sr. Manager, Cloud Engineering

Job Locations US-MA-North Reading
Posted Date 1 month ago (12/18/2020 3:31 PM)
Job ID
Cloud Ops


TraceLink is seeking a Senior Manager of Cloud Engineering to lead our Site Reliability Engineering team. This team manages the development of tools and processes used to deploy and maintain the core infrastructure components supporting the TraceLink software offerings across the globe. In this role you will lead a team focused on building infrastructure as code and work closely with the product engineering teams building platform and user applications.


  • Manage a distributed team of Cloud Engineers employing agile processes to build and release infrastructure tools coordinated with the overall product release timelines.
  • Manage team deliverables and backlog, interact with other product teams, measure progress across sprints and releases, and communicate progress and risks to relevant stakeholders. Set standards and provide requirements to engineering teams to deliver ops-ready software.
  • Work closely with key members of senior leadership as well as architecture and security teams to align the team's technical direction with the overall TraceLink technical direction. Build upon TraceLink's goals of operational excellence, systems/infrastructure/security best practices, and technical leadership.
  • In developing infrastructure as code, build and extend CI/CD infrastructure pipelines to provide metrics and visibility that support reducing deployment errors, increasing test coverage, improving resource usage, reducing cost, and confirming scale and resiliency. Incorporate strong security practices throughout.
  • Work closely with the on-call Cloud Operations team to develop methods that improve operational deployment and incident handling and reduce toil.
  • Be an SRE/DevOps evangelist and subject matter expert for the broader TraceLink technical/developer community.


  • Required Skills
    - Bachelor's degree in an engineering field
    - 3+ years of experience as a Site Reliability Engineer or DevOps Engineer
    - 5+ years of experience managing DevOps/SRE teams working to build infrastructure as code
    - Has built and managed geographically distributed teams working on a large-scale SaaS platform
    - Skilled in DevOps/SRE practices and build/release pipelines
    - Skilled with AWS services both from technology and cost perspectives
    - Strong understanding of cloud deployment and management practices
    - Strong knowledge of Linux and open source tools and their application in large-scale distributed systems
    - Hands-on experience with Terraform/Helm/Docker/Kubernetes/Prometheus/ELK/Bash/Python
    - Clear understanding of security best practices and how best to incorporate them
    - Excellent communication skills, written and verbal
    - Strong analytical and problem-solving skills
  • Preferred Skills
    - Advanced degree in an engineering field
    - Understanding of Java/JavaScript, reactive frameworks
    - Knowledge of compliance and security audit requirements and how to incorporate them into current and future practices

