Mistral AI is hiring an expert in serving and training large language models at high speed on GPUs. The role can be based in Paris, London, or San Francisco.

The role will involve:
- Writing low-level code to take full advantage of high-end GPUs (H100) and max out their capacity
- Rethinking various parts of the generative model architecture to make them more suitable for efficient inference
- Integrating efficient low-level code into a high-level MLOps framework

The successful candid…
Location: London, UK
Posted: March 7, 2025, 1:46 a.m.