Research Scientist - NLP | Cambridge, MA, USA

S&P Global

Research Scientist - NLP

About the Role:
Grade Level (for internal use): 03

Are you looking to solve hard problems and enjoy working with teammates with diverse perspectives? If so, we would love to help you excel here at Kensho. We are a collaborative group of experienced Research Scientists and Machine Learning Engineers, whose academic backgrounds include doctorate degrees in NLP, theoretical physics, statistics, etc. We take pride in our team-based, tightly-knit startup-like Kenshin community, which fosters continuous learning and a communicative environment.

At Kensho, we hire talented people and give them the freedom, support, and resources needed to accomplish our shared goals. We believe in flexibility-first and give our employees the opportunity to work from where they feel most productive and engaged (must be in the United States). We also value in-person collaboration, so there may be times when travel to one of our Kensho hubs (e.g., Cambridge, MA or NYC) will be required for team meetings or company events.

About the R&D Lab:
Since 2022, we have been building a world-class R&D lab comprised of NLP Research Scientists, and we heavily prioritize publishing in top-tier conferences. Our small team has demonstrated compelling results and is fueling innovation throughout Kensho and S&P Global at large. Specifically, we are continuously developing Large Language Models (LLMs) and are actively working on long-context question-answering (QA), complex reasoning, tokenization, alignment (e.g., factuality), multi-document QA, and more!

Our small team has reserved access to hundreds of fast GPUs (A100s), spanning Cloud and on-prem machines.

Our current projects include:
- Long-context document QA, where the answer is contained within documents that are hundreds of pages in length
- Complex reasoning, including better understanding and improving models' ability to approximate numbers (related to commonsense reasoning)
- Creating rigorous evaluation benchmarks, spanning domain knowledge, quantity extraction, and program synthesis
- Improving existing alignment techniques for domain-specific needs, while also addressing factuality
- Dissecting tokenizers to better understand how each of the sub-components impacts intrinsic and extrinsic performance
- Multi-document QA, where the answer requires combining information from dozens of sources
- Retrieval-augmented generation (RAG) methods
- Creating high-quality data filters for LLM development

Additionally, we maintain strong relationships with academia, including collaborating on several ongoing projects, providing industry grants, sponsoring conferences, and jointly holding faculty positions.

Kensho states that the anticipated base salary range for the position is 150k-225k. In addition, this role is eligible for an annual incentive bonus and equity plans. At Kensho, it is not typical for an individual to be hired at or near the top of the range for their role, and compensation decisions depend on the facts and circumstances of each case.

Technologies & Tools We Use:
• ML: PyTorch, Weights & Biases, NetworkX
• Deployment: Airflow, Docker, EC2, Kubernetes, AWS
• Datastores: Postgres, Elasticsearch, S3

What You'll Do:
• Regularly reading late-breaking research papers and helping to identify pertinent directions of work
• Developing novel, state-of-the-art NLP models that can scale to millions of documents
• Working closely with other Research Scientists and ML Engineers
• Writing clean, readable research code in PyTorch (not expected to write production-level code)
• Contributing to a stellar engineering culture that values excellent design, documentation, testing, and code
• Sharing your research results with your colleagues (presentations) and the world (published papers, patents, and blog posts)

What You'll Need:
Outstanding people come from all different backgrounds, and we're always interested in meeting talented people! Therefore, we do not require any particular credential or experience. If our work seems exciting to you, and you feel that you could excel in this position, we'd love to hear from you. That said, most of our successful candidates possess the following, which reflects both our technical needs and team culture:
• Hold a PhD in Computer Science or related field (or a Master's with significant research experience)
• Have published in a top-tier ML/NLP conference (e.g., ACL, NAACL, EMNLP, NeurIPS, ICML)
• Are proficient in writing code in PyTorch, Tensorflow, or JAX
• Have experience with the techniques required to work effectively with large, messy real-world data
• Prefer to collaborate iteratively on hard problems with your teammates rather than spending stretches of time working alone and presenting your results intermittently
• Have a love for learning new skills and domains
• Are excited to share knowledge freely, proactively, and effectively with others who are interested
• Are a generous teammate who takes work seriously witho

Location: Cambridge, MA

Posted: Oct. 8, 2024, 7:52 p.m.
