ML Ops Engineer

Brief about the role: The candidate is expected to productionize and deploy model code (the base code will be developed by Data Science). The role involves spending a good amount of time scaling pre-developed model training and inference (multi-GPU/CPU). We expect the candidate to have strong Python/SQL, Kubernetes, scaled model deployment, and ML experience.

 

About the Role:

We are seeking a highly skilled MLOps Engineer to play a pivotal role in bringing cutting-edge AI models to production. You will collaborate closely with our Data Science and Machine Learning teams to optimize, scale, and deploy AI/ML models.

Responsibilities:

  • Model Productionization and Deployment:
    • Translate complex machine learning models into robust and scalable production systems.
    • Deploy models to production environments using Kubernetes or other container orchestration tools.
    • Ensure seamless integration of models with existing infrastructure and applications.
  • Performance Optimization:
    • Identify and implement strategies to optimize model training and inference performance.
    • Leverage techniques like GPU acceleration, distributed training, and model quantization to improve efficiency.
    • Monitor model performance in production and proactively address any performance bottlenecks.
  • Scalability Engineering:
    • Design and implement scalable solutions for handling large-scale data and model workloads.
    • Optimize data pipelines and model serving infrastructure to meet growing demands.
    • Collaborate with infrastructure teams to ensure adequate resources and capacity.
  • ML Operations (MLOps):
    • Establish and maintain robust MLOps practices to streamline model development, deployment, and monitoring.
    • Implement automated pipelines for model training, testing, and deployment.
    • Monitor model performance in production and take corrective actions as needed.

Qualifications:

5+ years of experience

  • Strong proficiency in Python and SQL.
  • In-depth knowledge of machine learning frameworks (e.g., TensorFlow, PyTorch, Scikit-learn).
  • Expertise in Kubernetes or other container orchestration tools (good to have).
  • Experience deploying and optimizing code.
  • Experience with MLOps tools and frameworks (e.g., MLflow, Kubeflow).
  • Version control (Azure Blob Storage).
  • Experience with cloud platforms (e.g., AWS, GCP, Azure); Azure is preferred.
  • Solid understanding of distributed computing and parallel programming.
  • Strong problem-solving and analytical skills.
  • Excellent communication and collaboration abilities.

Preferred Qualifications (good to have):

  • Knowledge of big data technologies (e.g., Hadoop, Spark).
  • Experience with model optimization techniques (e.g., pruning, quantization, distillation).

 

Years of Experience Required: 5-10 Years
No. of Openings: 4
