As a Site Reliability Engineer (SRE) at Coupons.com, Inc. (CI), you will work to improve the reliability and performance of our services. You will work shoulder-to-shoulder with our engineering teams to design and build the next generation of web applications and systems infrastructure, focusing on automation, availability, and performance. A thorough understanding of system administration is a must, and specific experience with Linux is required.
Work closely with the engineering team to design, build, and maintain systems.
Represent the Operations team and contribute to new and ongoing technology projects in the areas of scalability, performance, and high availability.
Review the entire environment and execute initiatives to reduce failures and defects and to improve overall performance.
Design, develop and execute automated tests to validate solutions and environments.
Troubleshoot issues across the entire stack: hardware, software, application, and network.
Document current and future configurations, processes, and policies.
Perform troubleshooting analysis and implement fixes to ensure availability SLAs are met.
Take part in a 24x7 on-call rotation.
Experience with web server configuration, monitoring, trending, network design and high availability.
Command of your favorite scripting or programming language (Python, Perl, Ruby, Bash, Java, C++, PowerShell, etc.) to automate tasks and gather data.
At least 3 years of experience with Linux systems administration.
Excellent oral and written communication skills, including documentation.
3+ years of hands-on operational experience in a high-volume or critical production service environment.
Familiarity with systems management tools (Puppet, Chef, Capistrano, etc.).
Ability to work with limited supervision and direction, drive results, and set priorities independently.
Ability to handle multiple complex tasks concurrently under tight deadlines.
Knowledge of Red Hat, CentOS, and Ubuntu.
Hands-on operational experience in a high-volume or critical production service environment.
Experience with an enterprise monitoring system such as Nagios or System Center is highly desired.