Cleveland, Ohio, 44113

Job is not accepting new candidates

Job description

Data Lead
Remote
$160,000-$210,000

The Data Lead will design and maintain our data infrastructure, including ETL pipelines, vector databases, and retrieval systems for RAG-based applications. This position oversees data quality, governance, and performance optimization, ensuring our platform delivers accurate, scalable, and cost-effective data-driven solutions.

Responsibilities of the Data Lead

Data Engineering: Designing ETL workflows in SQL and Python, including normalizing and cleaning data.
Vector Databases & Retrieval: Working with platforms such as Pinecone, Weaviate, Milvus, or pgvector, and applying indexing strategies such as HNSW, IVF, and PQ.
RAG (Retrieval-Augmented Generation): Designing retrieval strategies, including chunking, embedding selection, and re-ranking.
Embedding Models: Selecting and evaluating embedding models for domain-specific applications.
Data Modeling & Knowledge Graphs: Strengthening the connections between structured and unstructured data (prior experience preferred but not essential).
Data Quality & Governance: Establishing standards for metadata management, access controls, data lineage, and data freshness.
Performance Optimization: Measuring and tuning latency and recall/precision, and balancing cost against performance.

Requirements for the Data Lead

Over 6 years of experience in data engineering, data platform management, or related ML data roles.
Exceptional skills in SQL and Python for ETL processes and data manipulation.
Experience with vector database technologies such as Pinecone, Weaviate, Milvus, or pgvector.
Demonstrated proficiency in developing retrieval pipelines for RAG applications.
In-depth knowledge of embedding models and their assessment criteria.
Awareness of data quality and governance principles.
Ability to tune systems for lower latency, higher accuracy, and better cost-effectiveness.
