Many data scientists and engineers run into problems on the DevOps side of their work: conflicting packages on their workstations, running out of CPU, RAM or disk space when the environment is undersized, and a generally unproductive experience building and maintaining ETL pipelines or machine learning models.
Our client was in the same situation. Many platforms address some of these issues, but they are not always the best fit, particularly the managed offerings from the leading cloud providers (such as Azure Machine Learning or Amazon SageMaker): you pay for a managed platform and are locked in to that provider, which limits flexibility.
We delivered a fully scalable, feature-rich solution with open-source packages at its core. It runs on Kubernetes, is provisioned with Terraform, and is backed by CI/CD pipelines (built in Azure DevOps for Microsoft clouds and CircleCI for others) so that changes can be rolled out quickly and accurately.
Data scientists can now use a custom JupyterHub experiments environment built on Kubernetes, pre-loaded with custom Docker images that contain the relevant data science packages for Python, R and Julia. Each image is designed to handle bursts in demand by relying on the auto-scaling Kubernetes cluster, while the state of users' experiments, notebooks and home directories is persisted on Azure Disks or AWS EBS volumes.
In addition, traffic is encrypted in transit with SSL/TLS using Let's Encrypt certificates. To authenticate users to the web application, we integrated with Azure AD for SSO against the client's Active Directory. To authorise access to data in Azure Data Lake (Gen2), we built Python and R frameworks that use access/refresh tokens in line with the client's custom ACL policies.
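As a rough illustration of the access/refresh token pattern mentioned above, here is a minimal Python sketch. It is not the client's framework: `Token`, `TokenCache`, `auth_header` and the injected `refresh_fn` are hypothetical names, and the actual call to the identity provider is left to the caller so the example stays free of any specific SDK.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Token:
    """Hypothetical container for an OAuth-style token pair."""
    access_token: str
    refresh_token: str
    expires_at: float  # Unix timestamp in seconds


class TokenCache:
    """Hands out a valid access token, refreshing it shortly before expiry.

    refresh_fn is an assumption: a caller-supplied function that exchanges
    a refresh token for a new Token (for example via the identity provider).
    """

    def __init__(
        self,
        token: Token,
        refresh_fn: Callable[[str], Token],
        skew_seconds: float = 60.0,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._token = token
        self._refresh_fn = refresh_fn
        self._skew = skew_seconds
        self._clock = clock

    def get_access_token(self) -> str:
        # Refresh when the token is within skew_seconds of expiring.
        if self._clock() >= self._token.expires_at - self._skew:
            self._token = self._refresh_fn(self._token.refresh_token)
        return self._token.access_token


def auth_header(cache: TokenCache) -> Dict[str, str]:
    """Build a bearer Authorization header for a storage request."""
    return {"Authorization": f"Bearer {cache.get_access_token()}"}
```

Injecting `refresh_fn` and `clock` keeps the refresh logic independent of any particular provider and makes it easy to exercise without network access.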
Cloud DevOps Engineer
2019 — 2022
Colibri Digital