Data Scientist, PhD
Cloudtropolis - Atlanta, GA

We are looking for experienced quants, data scientists, and hackers. This is a unique opportunity to leverage big data and advanced algorithms to produce a comprehensive and multifaceted view of the Earth using weather data, agricultural statistics, geological surveys, and satellite images. You will own the measurement, quality standards, and evaluation of the weather and geological data used in our products and backend models.

What you will do:
  • Use statistical methods to measure the accuracy of our critical data feeds and develop reusable machine learning algorithms to flag anomalies
  • Publish data summary metrics and classifications to the rest of the company through metadata annotations and dashboards
  • Make a measurable impact on company performance by delivering high-quality, scalable products
  • Discover the stories told by the data and present them to others through rich visualizations
  • Formulate, implement, test, and validate predictive models
  • Assemble modeling data sets from multi-terabyte structured and unstructured data repositories
  • Take responsibility for a specific data science product or component of a production system
  • Participate in the full development cycle, from product inception, research, and prototyping to release in production
  • Write production-quality code while implementing your own ideas
  • Implement efficient automated processes for producing modeling results at scale
  • Work closely with product and engineering teams and interact with other teams on a regular basis
  • Work side-by-side with engineers to operationalize measurement quality algorithms at scale
  • Build your own custom software tools that facilitate ad-hoc data exploration and visualization
  • Iterate quickly on solutions and algorithms over multidimensional data, visualizing results frequently

Qualifications:
  • MS or PhD in a quantitative discipline (e.g., statistics, computer science, physics)
  • 5 years of hands-on experience in analysis and modeling of large complex datasets
  • A passion for innovating with data science at scale – applying modern algorithms to massive datasets and creating measurable business value
  • Track record of successful implementations of quantitative modeling products in a business environment
  • Understanding of statistical modeling and machine learning in a practical context
  • Understanding of algorithms, scalability and various tradeoffs in a Big Data setting
  • Ability to assemble disjoint components into a working system that solves the problem at hand
  • Ability to communicate quantitative analysis in a clear, precise, and actionable manner
  • Proficiency with unsupervised and semi-supervised learning, and anomaly detection algorithms
  • Experience with geostatistical methods, time-series analysis, or spatiotemporal datasets
  • Proficiency writing well-structured, easily maintainable, well-documented code in at least one scripting language
  • Proficiency with at least one numerical/statistical computing environment
  • Expertise in using Hadoop or MPP databases (e.g., Netezza, Vertica) for complex data assembly and transformation
  • Familiarity with scientific data formats is a plus