Senior MLOps Engineer
Tala
Mexico City, Mexico
Posted on Oct 18, 2024
About Tala
Tala is on a mission to unleash the economic power of the Global Majority – the 4 billion people overlooked by existing financial systems. With nearly half a billion dollars raised from equity and debt, we are serving millions of customers across three continents. Tala has been named to the Fortune Impact 20 list, CNBC’s Disruptor 50 five years in a row, CNBC’s World's Top Fintech Company, Forbes’ Fintech 50 list for eight years running, and Chief's The New Era of Leadership Award. We are expanding across product offerings, countries, and crypto, and are looking for people who have an entrepreneurial spirit and are passionate about our mission.
Our unique platform enables lending and other financial services around the globe, so people in emerging markets can start and expand small businesses, manage day-to-day needs, and pursue their financial goals with confidence. Currently, over nine million people across Kenya, the Philippines, Mexico, and India have used Tala products. Because our team is global, we take a remote-first approach and also have offices in Santa Monica, CA (HQ); Nairobi, Kenya; Mexico City, Mexico; Manila, the Philippines; and Bangalore, India.
Most Talazens join us because they connect with our mission. If you are energized by the impact you can make at Tala, we’d love to hear from you!
The Role
We are currently seeking a Senior Cloud Infrastructure Engineer with experience in MLOps to design, implement, and maintain the infrastructure and deployment best practices for ML pipelines and models. You will bring machine learning, AI infrastructure, and automation expertise, along with knowledge of AWS cloud infrastructure and DevOps practices.
What You'll Do
- Design, build, and maintain scalable and robust infrastructure for AI/ML (Artificial Intelligence / Machine Learning) systems, including cloud-based environments, containerization, and orchestration platforms
- Develop and implement CI/CD pipelines to automate the deployment, testing, and monitoring of AI/ML models and applications
- Evaluate and integrate new tools, technologies, and frameworks to improve the efficiency and effectiveness of our MLOps processes
- Design and manage continuous deployment using Kubernetes, ArgoCD, and Jenkins
- Maintain the related container and model registries
- Monitor infrastructure utilization and costs pertaining to model training, inference, and GPU utilization
- Monitor and troubleshoot AI/ML systems to ensure high availability, performance, and reliability
What You'll Need
- 4+ years of experience as a DevOps Engineer
- 1 year of previous experience managing AI/ML infrastructure in public cloud environments
- In-depth hands-on experience with at least one public cloud platform, preferably AWS
- Experience with Python or any other programming language
- Experience with Docker and Kubernetes in production
- Experience with continuous deployment tools such as Jenkins or ArgoCD
- Experience with SaaS logging and monitoring tools such as Sumo, Splunk, or Datadog
- Proficiency in English
Our vision is to build a new financial ecosystem where everyone can participate on equal footing and access the tools they need to be financially healthy. We strongly believe that inclusion fosters innovation, and we’re proud to have a diverse global team that represents a multitude of backgrounds, cultures, and experiences. We hire talented people regardless of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, or disability status.