Rock Drip

NLP Expert for Frequency Word Definitions

Upwork

Description:

We are seeking an experienced Machine Learning (ML) expert to assist in preparing a dataset of the 100k most common English words.

The goal is to compile, structure, and process a comprehensive set of metadata for each word entry, including pronunciation, part of speech, definitions, synonyms, usage examples, and more.

Responsibilities:

Dataset Compilation: Extract and compile lists of the 50k, 100k, and 200k most common English words, ensuring that all entries are lemmatized (i.e., in their base or dictionary form).
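As a rough illustration of the compilation step, the sketch below merges counts of inflected forms into counts per lemma and then takes the top N. It is only a minimal sketch under stated assumptions: the TOY_LEMMAS table, the raw_counts figures, and the function names are hypothetical and stand in for a real lemmatizer (for example one from spaCy or NLTK) and a real frequency corpus.

```python
from collections import Counter

# Hypothetical toy lemma lookup; a real pipeline would call a lemmatizer
# (for example from spaCy or NLTK) instead of a hand-written table.
TOY_LEMMAS = {
    "running": "run",
    "ran": "run",
    "runs": "run",
    "words": "word",
    "mice": "mouse",
}


def lemma_frequencies(token_counts, lemma_lookup):
    """Merge counts of inflected forms into a single count per lemma."""
    merged = Counter()
    for token, count in token_counts.items():
        key = token.lower()
        # Tokens missing from the lookup are treated as their own lemma.
        merged[lemma_lookup.get(key, key)] += count
    return merged


def top_lemmas(token_counts, lemma_lookup, n):
    """Return the n most frequent lemmas as (lemma, count) pairs."""
    return lemma_frequencies(token_counts, lemma_lookup).most_common(n)


# Illustrative counts only, not taken from any real corpus.
raw_counts = {
    "run": 50, "running": 30, "ran": 20,
    "word": 40, "words": 45,
    "mouse": 10, "mice": 5,
}

print(top_lemmas(raw_counts, TOY_LEMMAS, 2))  # [('run', 100), ('word', 85)]
```

The same function can be called with n set to 50,000, 100,000, or 200,000 to produce each of the requested list sizes from one merged count table.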

Metadata Collection: Develop scripts or use APIs to gather relevant metadata for each word, including:

Pronunciation (preferably in a standard dictionary format).

Part of speech.

Concise definitions.

Example sentences.

Synonyms and antonyms.

Etymology (optional).

Word frequency data.

Note: all of this metadata must be licensed for commercial use.
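One possible way to structure a single word entry covering the fields listed above is sketched here. The WordEntry class, its field names, the missing_fields and to_jsonl helpers, and the placeholder source and license values are all assumptions made for illustration; the posting does not prescribe a schema or an output format.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class WordEntry:
    """Hypothetical record for one lemma; field names are illustrative."""
    lemma: str
    frequency_rank: int
    pronunciation: str = ""
    part_of_speech: str = ""
    definitions: List[str] = field(default_factory=list)
    examples: List[str] = field(default_factory=list)
    synonyms: List[str] = field(default_factory=list)
    antonyms: List[str] = field(default_factory=list)
    etymology: Optional[str] = None  # optional per the posting
    source: str = ""                 # where the metadata came from
    license: str = ""                # recorded to track commercial usability


# Fields this sketch treats as required; etymology is deliberately excluded.
REQUIRED = ("pronunciation", "part_of_speech", "definitions", "examples", "license")


def missing_fields(entry):
    """Return the names of required fields that are empty for this entry."""
    return [name for name in REQUIRED if not getattr(entry, name)]


def to_jsonl(entries):
    """Serialize entries as JSON Lines, one record per line."""
    return "\n".join(json.dumps(asdict(e), ensure_ascii=False) for e in entries)


# Illustrative entry; the source and license strings are placeholders.
entry = WordEntry(
    lemma="run",
    frequency_rank=1,
    pronunciation="/rʌn/",
    part_of_speech="verb",
    definitions=["to move quickly on foot"],
    examples=["She runs every morning."],
    synonyms=["sprint", "jog"],
    source="example-source",
    license="example-license",
)

print(missing_fields(entry))
print(to_jsonl([entry]))
```

Keeping a source and license value on every record is one way to make the commercial-use requirement auditable per entry rather than per dataset.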

Requirements:

Expertise in Machine Learning and Data Science: Proven experience in data extraction, processing, and analysis.

Familiarity with Linguistic Data: Experience working with linguistic datasets, corpora, or dictionary projects is a strong plus.

Programming Skills: Proficiency in Python or similar programming languages, with experience using libraries such as NLTK, spaCy, or similar for natural language processing (NLP).

API Experience: Experience working with APIs like Wiktionary, WordNet, or other linguistic databases.

Attention to Detail: Strong focus on data quality and accuracy.

Communication: Ability to clearly communicate progress, challenges, and results.

Deliverables:

A structured dataset of the most common lemmatized English words (roughly 100k to start), with complete metadata.

Scripts or tools used to gather and process the data, with clear documentation.

Location: Anywhere

Posted: Sept. 1, 2024, 7:25 a.m.
