AI/ML Ops / Software Engineer (Tech Lead)

Modern Treasury

OVERVIEW

This position can be based out of San Francisco, New York, or remote (we accept candidates from many states).

As the AI/ML Ops / Software Engineer (Tech Lead), you will lead the architecture, development, and operations of AI/ML infrastructure, ensuring that state-of-the-art models are seamlessly integrated into production environments. In collaboration with AI Product Researchers, you will develop and deploy scalable machine learning models, focusing on automating and optimizing the ML Software Development Lifecycle (ML SDLC) to streamline everything from model research and experimentation to serving and monitoring in production.

This role requires a blend of technical excellence, AI/ML operations experience, and leadership capabilities to guide cross-functional teams through the challenges of modern AI model deployment. You will be responsible for building AI/ML Ops practices and frameworks to support the rapid and reliable transition of models from research into full-scale production environments.

ABOUT MODERN TREASURY

Modern Treasury is the operating system for money movement. Our payment operations platform combines a suite of APIs and dashboards to help companies unlock new payments revenue, strengthen customer experiences, and drive efficiency across their business. Our end-to-end platform moves enterprises forward with faster payments, efficient workflows, full data visibility, and seamless bank integrations.

KEY RESPONSIBILITIES
• End-to-End AI/ML Lifecycle Management:
  • Lead and optimize the ML SDLC, ensuring a smooth transition from model research to deployment, integrating CI/CD pipelines, monitoring, and scaling.
  • Oversee the full lifecycle of AI systems, from model development and experimentation to testing, deployment, and continuous optimization.
• Collaboration with AI Product Researchers:
  • Collaborate closely with AI Product Researchers to translate research outcomes into scalable, production-ready AI models.
  • Build amplifying harnesses and tools to improve research-to-production cycles, allowing researchers to rapidly iterate on models.
• AI/ML Infrastructure and Operations:
  • Architect and manage robust AI/ML Ops frameworks that handle the training, serving, monitoring, and lifecycle management of models, enabling fast iteration and high reliability.
  • Establish best practices for AI/ML infrastructure, ensuring operational excellence in production environments.
• Build Scalable and Distributed Systems:
  • Lead the design of scalable architectures that support both real-time inference and large-scale model training.
  • Optimize performance and scalability across distributed AI systems, leveraging cloud technologies (e.g., AWS SageMaker, EKS, and distributed computing frameworks).
• Drive Innovation in AI Model Deployment:
  • Lead innovation initiatives, pushing the envelope on AI-driven systems such as LLMs, transformers, and recommendation engines.
  • Enable MLOps best practices, incorporating automation for model versioning, rollback, and monitoring to ensure robustness and compliance in production environments.
• Cross-functional Leadership:
  • Partner with engineering, product, and data teams to ensure that AI solutions are fully integrated into product features, driving measurable business outcomes.
  • Mentor and lead engineering teams, establishing a strong culture of technical excellence, collaboration, and innovation.
• Research into Production:
  • Facilitate the seamless integration of research-grade AI models into production, ensuring performance, scalability, and alignment with business requirements.
  • Develop tools and infrastructure that empower researchers to experiment with and deploy cutting-edge models faster.

QUALIFICATIONS
• 10+ years of experience in AI engineering or machine learning infrastructure, with significant experience leading end-to-end AI model development and deployment in production environments.
• Expertise in AI/ML Ops and ML SDLC:
  • Strong experience with modern MLOps frameworks and practices, including automated model deployment, monitoring, and lifecycle management.
  • Deep understanding of the ML SDLC, with hands-on experience building pipelines for scalable model serving, retraining, and testing.
• Collaborative AI Research to Production Experience:
  • Proven ability to work closely with AI Product Researchers to transform cutting-edge research into scalable, real-world AI solutions.
  • Experience building tools and amplifying harnesses that streamline the transition from research models to full-scale deployment.
• Expert in Modern AI Architectures:
  • Deep expertise in LLMs, transformers, and advanced recommendation systems, with hands-on experience in building and optimizing models using open-source frameworks such as PyTorch and TensorFlow.
  • Proficiency in optimizing and deploying AI models on cloud platforms, especially AWS, leveraging tools like SageMaker, EC2, and Kubernetes.
• Leadership and Technical Expertise:
  • Proven leadership

Location: Anywhere

Posted: Oct. 14, 2024, 8:32 a.m.
