Site Reliability Engineer (London)

Responsibilities
  • Monitor the stability and performance of the website
  • Remotely troubleshoot and diagnose hardware problems
  • Debug issues with Linux software, applications and the network
  • Resolve technical challenges encountered in LAMP technologies
  • Develop and maintain monitoring tools and automation systems
  • Predict and respond to utilization variances across multiple data centers
  • Identify and triage all outage-related events
  • Facilitate communication, coordinate escalation, and work with subject matter experts to implement critical fixes
  • Automate and streamline processes
  • Track issues and run reports

Requirements

  • 2-3+ years of Linux support/sysadmin experience in an Internet operations environment
  • BA/BS in Computer Science or a related field, or equivalent experience
  • Working knowledge of Linux, Cisco, TCP/IP, Apache and MySQL
  • Experience working with network management systems and monitoring tools, such as Nagios, Ganglia and Cacti
  • Competency in Shell, PHP, Perl or Python. C is a plus
  • Solid understanding of web services architecture and commonly employed technologies
  • A sense of urgency in responding to and resolving critical issues that relate to the performance of the site and/or core infrastructure
  • Excellent verbal and written communication skills
  • Willingness to work on-call

To become part of a successful team, apply now!