This position is located in Cluj-Napoca, Romania.
We are looking for an MLOps Engineer.
We offer a full-time position.
Additional information
You might be our missing piece if you have:
- Strong expertise in Python and AI frameworks such as PyTorch, Keras, SciPy, or TensorFlow.
- Experience with Python-based Web frameworks like FastAPI, Flask, or Django.
- Knowledge of PEP 8 coding standards for Python.
- Extensive experience in solving AI/ML challenges and working with LLMs.
- Familiarity with OpenAI APIs, embeddings, completions, and semantic search.
- Solid experience with API integrations and working with external APIs like OpenAI, Anthropic, or similar AI service providers.
- Hands-on experience with containerization and orchestration tools – especially Docker for packaging ML models, and Kubernetes (or similar) for deploying and scaling them in distributed environments.
- Proficiency in DevOps and automation practices: designing CI/CD pipelines (using tools like Jenkins, GitLab CI/CD, or GitHub Actions) to automate model testing and deployment, and using Infrastructure-as-Code (CloudFormation, Terraform) to manage cloud resources.
- Working knowledge of cloud computing services (AWS, Azure, GCP) for ML workloads. This includes familiarity with cloud AI/ML services and managed ML platforms (like SageMaker, Azure ML, or GCP AI Platform) and experience setting up scalable infrastructure for data and models (compute instances, storage, networking for model endpoints).
- Familiarity with databases and experience using SQLAlchemy, Alembic, and database management for AI models.
- Strong skills in managing datasets using tools like Pandas, SciPy, and NumPy for data pre/post-processing.
- Experience with monitoring and logging frameworks for tracking running systems: Prometheus/Grafana or cloud monitoring services to record model-serving performance metrics, and possibly specialized ML tooling (e.g., MLflow, Weights & Biases, or Apache Airflow for scheduling retraining).
- Strong analytical and problem-solving skills to diagnose issues from logs/metrics and tune system performance.
- Excellent communication skills and a collaborative mindset. Since this role works across AI Engineering, Data Engineering, DevOps Engineering, and client teams, you must be able to explain technical concepts to diverse stakeholders and document work clearly.
- Ability to work in an agile environment, manage priorities, and coordinate with remote or cross-functional team members.
We would be thrilled if you have:
- A track record of deploying and managing machine learning models at scale (e.g., in a product or platform used by thousands of end-users or clients).
- Experience working on client-facing projects or consulting engagements.
We will be working together on:
- Designing, building, and automating ML pipelines.
- Deploying and scaling models in production.
- Monitoring, maintaining, and improving model performance.
- Collaborating with Data Engineers and client stakeholders.
- Establishing governance, documentation, and best practices.