Working with gigs of data and million-row data sets is child's play. You will have complete autonomy and ownership to develop, enhance, and maintain our data design, algorithms, and distributed data infrastructure. Understanding business needs and translating them into data requirements will be critical to this role. We're not in the business of bogging you down with process, so you will have total freedom to develop and maintain as you see fit.
You are a data scientist and server architect with some engineering experience who is obsessed with big data and can't turn off your attention to detail. You are an expert in SQL, strong in NoSQL, and have familiarity with multiple database systems. You have a CS background and feel comfortable with Linux-based software development. You have a keen interest in telling stories with all the data available to you and can relate your work to the overall business.
Experience deploying & supporting scalable, high-availability infrastructure
Experience deploying & supporting high-traffic environments
Experience with distributed map/reduce and related technologies: Hadoop/Hive/Pig, etc.
Experience writing quality map/reduce/SerDe code in your favorite language to parse and process data streams
PhD in CS/Math/Statistics or equivalent experience
NICE TO HAVE:
Experience with NoSQL: MongoDB, CouchDB
Experience deploying & supporting SaaS in a cloud environment
Experience with machine learning software: Mahout, Weka, etc.
Experience with graph databases: Neo4J, InfoGrid, Infinite Graph
Experience with search and search related technologies: Solr, ElasticSearch, Endeca, etc.
Experience with data collection and aggregation frameworks like Flume, Scribe or similar
Experience with Key/Value stores: Tokyo Cabinet/Tyrant, Redis, Voldemort, Oracle BDB
Experience with monitoring tools such as Nagios, Ganglia, Monit, etc.
Key Responsibilities / Performance Requirements:
Develop algorithms that take hundreds of different attributes into account to build user profiles
Design, develop and support a map-reduce-based data processing pipeline
Provide engineering and design support for Hadoop build, configuration, monitoring and supportability
Design, develop and support user profiles and behavioral analysis via data-mining and machine-learning algorithms
Ensure data accuracy, scalability and integrity
Innovate practical NoSQL solutions to conquer scalability and distributed data processing challenges
Define the baseline Hadoop image configuration for master, slave and data nodes
Automate the addition of data nodes as needed
Define and set up monitoring to support and optimize the infrastructure
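To give candidates a flavor of the map/reduce pipeline work described above, here is a minimal sketch in Python in the style of a Hadoop Streaming job. It is purely illustrative, not Qloo's actual code; the tab-separated log format and the `mapper`/`reducer` names are assumptions for the example:

```python
from itertools import groupby

def mapper(lines):
    """Emit (user_id, 1) for each log line of the form 'user_id<TAB>event'."""
    for line in lines:
        user_id, _event = line.rstrip("\n").split("\t", 1)
        yield user_id, 1

def reducer(pairs):
    """Sum counts per key; input must be sorted by key, as Hadoop's shuffle guarantees."""
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield key, sum(count for _, count in group)

if __name__ == "__main__":
    logs = ["alice\tclick", "bob\tview", "alice\tview"]
    shuffled = sorted(mapper(logs))  # simulate Hadoop's shuffle/sort phase locally
    print(dict(reducer(shuffled)))   # {'alice': 2, 'bob': 1}
```

In a real deployment the same mapper and reducer would read from stdin and write to stdout under Hadoop Streaming; the local `sorted()` call stands in for the framework's shuffle step.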
Qloo is an "inspiration engine": it provides its users with personalized suggestions for new things to do based on everything they already love. Qloo provides suggestions across eight major categories of culture: music, film, TV, dining, nightlife, fashion, books and travel.
For further information see http://www.qloo.com
Qloo was formed in 2012 and is based in Nolita, NYC.