Principal Machine Learning Engineer, ML Training Platform
We're looking for a Principal Machine Learning Engineer, ML Training Platform, to join Snap Inc!
What you’ll do:
• Design, implement, and scale critical machine learning components and services to support Snap's most strategic initiatives
• Build a next-generation training framework that can support large-scale model training, enabling us to push the limits of what's possible with machine learning
• Perform training and model performance optimization with various GPUs to improve model training speed and efficiency
• Develop an AutoML platform to accelerate model generation and automate the machine learning model lifecycle
• Work across teams to understand product requirements, evaluate trade-offs, and deliver the solutions needed to build innovative products or services
• Advocate for and apply best practices when it comes to availability, scalability, operational excellence, and cost management
• Provide technical direction that influences the entire company
Knowledge, Skills & Abilities:
• Strong understanding of machine learning approaches and algorithms
• Excellent programming and software design skills, including debugging, performance analysis, and test design
• Proven track record of operating highly available systems at scale
• Ability to proactively learn new concepts and technology and apply them at work
• Skilled at solving ambiguous problems
• Strong collaboration and mentorship skills
Minimum Qualifications:
• BS in a technical field such as computer science, mathematics, or statistics, or equivalent years of experience
• 14+ years of industry machine learning experience
• Experience with GPU / TPU training and optimizations
Preferred Qualifications:
• Master's / PhD in a technical field such as computer science
• Experience leading teams and driving technical roadmaps
• Experience working with machine learning, recommendation and ranking systems, or vector similarity search
• Experience with TensorFlow, PyTorch, or related deep learning frameworks
• Experience with Docker, Kubernetes, Ray, NoSQL solutions, Memcache / Redis, Google / AWS services
Last updated: 2024-08-20
Location: Los Angeles, CA
Posted: Aug. 23, 2024, 8:36 a.m.