mirekphd

Wearing my data scientist hat, I build ML modeling pipelines in Python. This covers data munging, feature engineering and selection, model training, distributed hyperparameter tuning, performance optimization, and reproducibility, through to validating models and monitoring their performance once they are in production.

Wearing my ML Engineer / MLOps hat, I develop in-house Docker containers (allowing for self-service package installation and automated updates) for data scientists working on ML model development. The containers are GPU-enabled, support Python, R, and H2O, and come with familiar interfaces such as Jupyter, RStudio Server, and VS Code, specialized MLOps frameworks such as MLflow, and generic data and file management tools such as MinIO, Cloud Commander, and SQL/NoSQL databases. I also develop REST APIs for the production deployment of these ML models and their features, using Python (for GBDTs) or Java (for H2O models), together with Flask, gunicorn, Redis, MinIO, git, and bash.

Wearing my DevOps hat, I orchestrate the two types of ML containers (stateful for ML model development and stateless for production deployment) across several OpenShift clusters. I automate their builds, package/library/extension updates, security scans, and staging/production deployments using Jenkins pipelines, bash, Python, and Groovy scripts, and OpenShift build/deployment configs, all integrated through webhooks. I also act as the Linux system administrator for the CI/CD build server (CentOS, Docker, docker-compose, MicroK8s, Jenkins, Clair, Postgres) and as the OpenShift business admin (using the OpenShift CLI, YAML configs, and bash scripts) in both the data science development clusters and the ML model production clusters.