Quad Analytix is an exciting, early-stage data and SaaS startup in the Bay Area, CA. We gather e-commerce information from a variety of sources, then normalize and analyze it to create insights that our customers visualize to do their jobs more effectively.
If you love data, want an opportunity to learn and grow with our company, and are excited about seeing a project and your work directly impact the entire company, Quad might be a great fit for you. We have an exciting environment with smart, down-to-earth people who enjoy having fun while working together.
We are looking for someone who can help advance our platform for acquiring and structuring data from the internet and populating it into a production environment, particularly around the extraction and normalization of attributes. We want you to be excited about constantly experimenting with new approaches while being dedicated to creating scalable, operationally friendly production systems.
As a Senior Data Extraction Engineer, you will work with the product, development, and operations teams to understand requirements, articulate an approach, and design and execute both working prototypes and production systems.
DESIRED SKILLS AND QUALIFICATIONS:
• At least 3 years of experience with Information Extraction: Detailed knowledge of and experience with data mining and entity/attribute extraction from a variety of unstructured data sources
o Utilizing AI technologies such as NLP, Ontology and Taxonomy Management, Text Processing with Lucene, Stanford, or other parsers
• Working knowledge of Hadoop, Machine Learning/Mahout, and Big Data NoSQL persistence and retrieval
• Experience with Data Acquisition: Significant experience with Web/Image/Social data extraction and mining, including implementing industrial-strength, distributed but polite crawlers. Ideally, this includes experience or familiarity with disciplines such as distributed crawling, wrapper generation, and information extraction/retrieval technologies.
• Programming focus on Java and Perl; familiarity with at least one of Python, Ruby, or PHP is nice to have
• Comfort with multiple data formats: XHTML/HTML5, XML/parsers, RDF, JSON, etc.
• Bachelor's degree in Computer Science
• Strong communication skills and ability to work effectively in teams
• Intellectually curious, with a passion for learning and growing professionally
• Strong work ethic and proactive approach to problem solving
• Enjoys having fun at work and wants to collaborate with smart, humble people every day