Lead Data Scientist - NLP and Gen AI

Laksan Technologies

About the Lead Data Scientist (NLP & GenAI) Job Role

You will help our clients solve real-world problems by tracing the data-to-insights lifecycle:
• Understand business problems, make sense of the data landscape and footprint, and perform a combination of Gen AI, advanced NLP and exploratory analysis
• Create, experiment with and deliver innovative solutions to client stakeholders with a consultative mindset, using textual and all other relevant data points
• Guide a team of data scientists to deliver exceptional solutions to clients across domains

Work Location: Oklahoma City, OK

Qualification And Experience
• Academic background in Computer Science / Computer Applications or any quantitative discipline (Statistics, Mathematics, Economics/Operations Research etc.) from a reputed institute.
• 5-8 years of experience using analytical tools / languages like Python on large-scale data
• Experience on 3+ projects involving large-scale deployment of AI solutions on text data, using machine learning, deep learning, transfer learning and a generative AI tech stack
• Excellent hands-on knowledge of generative AI capabilities, such as the use of LLMs (Large Language Models) and embedding models from the commercial and open-source space
• Ability to refine prompts to improve outcomes, and to use frameworks like LangChain and LlamaIndex
• Ability to come up with the appropriate level of generative AI solution, from prompt engineering and retrieval-augmented generation (RAG) to agentic workflows, LLM instruction tuning or fine-tuning
• Must have experience in various text intelligence use cases like semantic modelling, NER, summarization and topic identification
• Basic level of experience in web scraping, crawling, parsing and business intelligence
• Experience working with open-source pre-trained models, awareness of state-of-art in embeddings and applicability for use cases largely on text data
• Must have strong experience in NLP/NLG/NLU applications using deep learning frameworks like PyTorch and TensorFlow, and models like BERT and GPT (or similar)
• Demonstrated ability to engage with client stakeholders at multiple levels and provide consultative solutions across different domains
• Deep knowledge of techniques such as Linear Regression, gradient descent, Logistic Regression, Forecasting, Cluster analysis, Decision trees, Linear Optimization, Text Mining
• Strong understanding of integrating NLP models into business workflows. Candidates should have exposure to the full path from project initiation to business impact creation in at least one project.
• Strong consultative experience: understanding the business process, drafting concise problem statements, articulating as-is and to-be processes, researching and experimenting with solution alternatives, creating performance KPIs, and delivering the best possible solution as an active team member collaborating with other data scientists, data engineers and relevant stakeholders

Experience In Productionizing & Retraining Models
• Ability to guide and mentor teams of associates on solution development and approaches
• Broad knowledge of fundamentals and state-of-the-art in NLP and machine learning
• Coding skills in one or more programming languages such as Python, SQL
• Expert / high-level understanding of language semantic concepts and data standardization
• Proven track record of successful models and practical implementation
• Experience in training transformer-based language models and their variants (T5, BART, BERT etc)
• Knowledge of transformer architecture and the impacts of modifying the same
• Familiar with multiple evaluation metrics for LLMs
• Experience building pipelines with Hugging Face, LangChain etc.
• Experience with Vector DBs, Text embedding models
• Different prompting templates: zero-shot, few-shot, composition etc.
• LLM In-context learning, Fine tuning, Model evaluation metrics etc.
• Text pre-/post-processing techniques
• Experience in using GPUs to train deep learning models
• Good knowledge of solving industrial problems using deep learning models with NLP-related use-cases
• Familiar with all prompting techniques
• Hands-on experience with popular ML frameworks such as PyTorch (must), TensorFlow
• Experience with production deployment of LLM solutions using models from top players like OpenAI, Meta, Microsoft and Hugging Face
• Building scalable LLM solutions and demos using Flask, Gradio or other API frameworks
• Familiarity with cloud services such as Azure ML Studio, AWS SageMaker etc. is considered a plus
• Knowledge of machine learning techniques in the entity resolution, common speech products or text search domains

Location: Oklahoma City, OK

Posted: Aug. 21, 2024, 9:54 a.m.
