Site Reliability Engineer - Palo Alto
Altiscale - Palo Alto, CA

The ideal candidate is a technical generalist with skills spanning software development, hardware configuration, network design, and production systems and network management.


Build, manage and support Altiscale’s Infrastructure Services covering one or more of the following:

Big Data Component Deployment

Tools Development

Hardware Build Out

Network Operations

On call rotation

Gain deep application-level knowledge of Big Data Components (such as Hadoop, HDFS, HBase, Hive, and Pig) and contribute to their implementation and optimization on our Infrastructure

Deploy hybrid systems in which AWS, OpenStack, and bare-metal Hadoop clusters interoperate

Create processes and tools to enhance the scalability, availability, manageability and maintainability of our systems

Create metrics that measure all aspects of the operation of our service, in order to maximize utilization without compromising SLAs

Drive the decision-making process to improve current and next-generation hardware and software systems; collaborate with Research and Development teams to deliver high-performing and reliable services

Skills & Requirements

A proven track record of setting up and maintaining very large server deployments

Depth of experience in integrating complex systems of servers, networks, clients, and data-center operations

Solid knowledge of the following technologies and products:

Linux (RPM-based family), including automation on this platform

Apache Hadoop, HDFS, HBase, Hive and Pig

Configuration management and packaging systems such as Chef, Cobbler, Yum

Monitoring tools including Ganglia and Nagios


Software development skills in two or more of the following programming and scripting languages: Java, C++/C, Ruby

A systematic, test-and-measure approach to continually improving service operations

Excellent communication and collaboration skills, with experience working closely with development, QA and product management

Excellent analytical and prioritization skills, sufficient to manage several product/component launches at the same time

5+ years of experience

BA/BS in computer science and engineering (or equivalent experience); advanced degree is a plus