- Work closely with engineering and product management teams to design and implement machine learning and statistical analysis systems to understand user behavior and build new products.
- Common tasks for this role will include data exploration, hypothesis creation (from business and product goals), testing algorithms, scaling to large datasets, and validating results.
- Typical projects will include analyzing user and social behavior and determining patterns in the data, as well as developing recommendation and classification systems related to design information and design processes.
- The Data Scientist will work with a broad set of technologies: Hadoop/HDFS, Shark/Spark, NoSQL databases, and numerous charting, graphing, and analysis applications such as Gephi and Google Charts.
- Successful outcomes of this work will include new product features, user and product behavior insights, and systems (software and infrastructure) behavior tracking.
- The Data Scientist in this group will work closely with senior engineers, architects, and product managers to develop real-world solutions to complex problems that can then be taken to production.
- Will have an opportunity to begin shaping the general Data Scientist role for the future of BlueKai
- Complete familiarity with various statistical and machine learning techniques, including classification, regression, dimension reduction, clustering, and various multivariate methods.
- Practical experience implementing algorithms from the above areas
- Understanding of the complexity orders of algorithms and their scaling behaviors
- MS or higher in the field of Statistics or Computer Science
- Ability to quickly learn new tools and understand new areas of data
- Ability to work closely with engineering teams to gather and process data and to surface analytics-based features in core products
- At least 8 years of work experience related to complex data problems where insight is derived using logical and statistical techniques from significant amounts of data
- Knowledge of the standard Hadoop/HDFS/Hive/Pig tools and, optionally, some newer technologies such as Spark/Shark, GraphLab/GraphChi, or Storm.
- Good coding skills covering some procedural as well as statistical or data-oriented languages (such as Java, Scala, and Python, as well as R and SQL).
- Good communication skills and an awareness of how to communicate data effectively
- Comfortable working in newly forming, ambiguous areas where learning and adaptability are key skills
- Deep knowledge in accelerating and massively scaling various machine learning algorithms
- Scala and functional programming expertise
- Network visualization and real-time analytics
BlueKai - 4 months ago