Sr. Data Pipeline Engineer (Lambda Architecture)
TellApart - Burlingame, CA


TellApart helps many of the world’s most successful retailers unlock the power of their customer data by applying the latest advances in cloud computing, predictive analytics, and machine learning. Our Customer Data Platform collects and analyzes massive amounts of data to power an integrated suite of marketing solutions that delivers personalized shopping experiences for hundreds of millions of consumers in real time.

TellApart is backed by Greylock Partners and Bain Capital Ventures. Several of technology’s top executives have also invested in TellApart, including: Ron Conway (SV Angel), Dick Costolo (CEO, Twitter), Reid Hoffman (Founder & Chairman, LinkedIn), Jeff Jordan (former CEO, OpenTable), Phil Libin (CEO, Evernote) and Mike Walrath (former CEO, RightMedia). David Rosenblatt (former CEO, DoubleClick) is also an executive advisor.

Our team’s impact on the business:
The engineering team at TellApart creates big data solutions that collect and analyze petabytes of shopping data from some of the largest retailers in the world. Even though we're still a start-up, TellApart is currently one of the largest consumers of AWS services, with hundreds of active nodes in multiple regions managed through EC2 auto-scaling clusters that serve hundreds of thousands of requests with strict low-latency requirements (under 40 milliseconds). TellApart’s infrastructure is built on a blend of Java- and Python-based web services, updated multiple times a day, and follows the lambda architecture: Hadoop/Cascading jobs managed with Azkaban build batch updates for Voldemort servers, while Kafka queues feed YARN/Spark Streaming infrastructure for real-time updates.
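
For candidates less familiar with the term, the lambda architecture mentioned above can be sketched in a few lines. This is only an illustrative toy, not TellApart's code: the class name LambdaView, its methods, and the per-customer totals are all hypothetical, and it assumes each batch run has absorbed every event seen before it loads. In-memory dictionaries stand in for the batch store and the streaming state.

```python
from collections import defaultdict


class LambdaView:
    """Toy serving layer: a batch view plus a real-time delta, merged on read.

    Hypothetical illustration only; names and data are assumptions.
    """

    def __init__(self):
        self.batch_view = {}                # replaced wholesale by the batch layer
        self.speed_view = defaultdict(int)  # updated per event by the speed layer

    def on_event(self, customer_id, amount):
        # Speed layer: apply each incoming event immediately.
        self.speed_view[customer_id] += amount

    def load_batch(self, recomputed_totals):
        # Batch layer: swap in totals recomputed from the full event log, then
        # drop the real-time deltas (assumed to be covered by this batch run).
        self.batch_view = dict(recomputed_totals)
        self.speed_view.clear()

    def query(self, customer_id):
        # Serving layer: batch total plus events seen since the last batch load.
        return self.batch_view.get(customer_id, 0) + self.speed_view.get(customer_id, 0)


view = LambdaView()
view.on_event("c1", 30)
view.on_event("c1", 12)
view.load_batch({"c1": 42})   # batch run has caught up with both events
view.on_event("c1", 8)        # arrives after the batch load
print(view.query("c1"))       # 42 from the batch view + 8 from the speed view
```

The point of the split is that the batch side can be recomputed from scratch and stays simple, while the speed side only has to cover the short window since the last batch load.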

Why we need you:
Play a key role in scaling our existing data pipeline to handle 10x the data to match our current growth trajectory

Build a real-time data pipeline that produces up-to-the-minute analyses

Consider next-generation hardware choices and configurations for our Hadoop clusters to optimize performance and reliability

Build or choose open source tools to help other engineers access complex data more efficiently

Most importantly, because you are excited by big data technologies and would like to make meaningful contributions toward the next advances in big data

Examples of projects recently undertaken by our team:
We needed to make it easy for any engineer to query customer data, so we created a framework on top of Cascading that exposes a uniform interface for extracting data from raw server logs

As the complexity of our job flow grew, we designed and built a reporting workflow using Azkaban running over a cost-efficient Hadoop cluster using Amazon’s Elastic MapReduce

We hack into open source projects to improve reliability and squeeze the last bit of performance out

Your qualifications:
Minimum of 5 years developing production software with previous experience working on a large-scale distributed system

At least 2 years of experience using big data systems and/or using big data for data analysis

Solid understanding of the distributed-system concepts used to scale big data technologies as data grows exponentially and to speed up queries

In-depth understanding of the inner workings of as many of the technologies we currently use as possible (not all production tech listed): Hadoop, MapReduce, HBase, Voldemort, MongoDB, Cascading, Cloudera CDH, Spark, Parquet, Scalding, Kafka, Zookeeper, Eureka, etc.

Excellent analytical skills to deliver meaningful and impact-driven insights using big data

Ability to thrive in a dynamic, fast-paced, collaborative, and high-growth start-up environment

Excellent communication skills, initiative, and teamwork

Bachelor's degree or equivalent experience in Computer Science or a related discipline (MS or PhD a plus)

Our data is unique, our platform is full of possibilities, and our business is booming. We are looking for motivated talent who want to take our technology and business to the next level and realize its full potential. If this sounds like your kind of environment, come join us at TellApart! Please apply and include any links to code samples (if possible/relevant). You can also reach out to our Recruiter, John Delaney, if you have any questions.

TellApart is an Equal Employment Opportunity and Affirmative Action Employer with a commitment to workplace diversity. All qualified individuals are welcome to apply. Employment with TellApart is based solely upon one's individual merit and qualifications directly related to professional competence. We do not discriminate on the basis of race, religion, color, sex, age, national origin, citizenship, or disability. And we will make all reasonable accommodations to meet our obligations under the Americans with Disabilities Act (ADA) and state disability laws.