I’ll do a side-by-side comparison of architectural patterns for Data Pipelines and Machine Learning Pipelines and illustrate the principal differences. My main goal is to show the value of deploying dedicated Machine Learning tools and platforms, such as Kubeflow and Metaflow. Using non-Machine-Learning tools, such as Airflow, for orchestration turns out to be suboptimal.

Machine learning pipeline products have been popular for quite some time now. But why do we need pipelines in the first place? Here are some of the main reasons, in my opinion:

Why we Need Pipelines

  • Using pipelines helps you maintain discipline in your work. You are…

Although Katib is Kubeflow’s built-in Hyperparameter Search (HS) tool, here is why I chose Keras Tuner for distributed HS:

  • Keeps codebase independent of Kubeflow
  • Keeps Kubeflow experiments independent of the codebase decisions
  • Keras(TensorFlow)-trained models are also HS-optimized with Keras
  • Keeps HS part of the codebase, with source code versioning and CI/CD
  • Prevents running hundreds of Kubeflow experiments for a single HS

The goal of HS is to produce a list of model architectures and model parameters sorted by model accuracy score. Asynchronous HS produces such a list progressively, updating it as newly computed trials complete. Keras Tuner…
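To make the asynchronous update concrete, here is a minimal pure-Python sketch of such a progressively updated, sorted trial list. The `TrialBoard` class is a hypothetical helper for illustration, not part of the Keras Tuner API:

```python
import bisect

class TrialBoard:
    """Leaderboard for asynchronous hyperparameter search.

    Trials can report in any order; the board keeps the
    (architecture, params, accuracy) entries sorted so the
    best-scoring trials are always available first.
    """

    def __init__(self):
        self._scores = []   # accuracies in ascending order (for bisect)
        self._entries = []  # trial records, parallel to _scores

    def report(self, architecture, params, accuracy):
        # Insert the finished trial at its sorted position.
        i = bisect.bisect(self._scores, accuracy)
        self._scores.insert(i, accuracy)
        self._entries.insert(i, {"architecture": architecture,
                                 "params": params,
                                 "accuracy": accuracy})

    def best(self, k=1):
        # Return the top-k trials, best accuracy first.
        return self._entries[::-1][:k]
```

With this, each worker simply calls `report(...)` whenever a trial finishes; `best(k)` always reflects the trials computed so far, which is exactly the progressive behavior asynchronous HS needs.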

In this blog post, I’d like to share some insights from my work on the High-Performance Computing (HPC) cluster at the Texas Advanced Computing Center (TACC) (circa 2010, TACC had a “Lonestar” cluster with 5,200 2 GB nodes), and within the Technology Group at ConocoPhillips. Specifically, I want to address the issue of Data vs. Model distributed computations.

But before I go into the details, let’s understand what ring-reduce and all-reduce are.

What Are Ring-Reduce And All-Reduce Operations?

All-Reduce is a parallel algorithm that aggregates the target arrays from all processes into a single array and delivers that result to every process. The aggregation can be concatenation, summation, or any other reduction operation…
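As an illustration, here is a small pure-Python simulation of ring all-reduce with summation. This is a sketch of the communication pattern only, not an MPI/NCCL implementation: each of `p` simulated processes starts with its own vector and, after a scatter-reduce phase followed by an all-gather phase, every process holds the element-wise sum.

```python
def ring_all_reduce(vectors):
    """Simulate ring all-reduce (sum): vectors[r] is the array held by
    process r; afterwards every process holds the element-wise sum."""
    p = len(vectors)          # number of processes in the ring
    n = len(vectors[0])       # array length; assume n % p == 0 for clarity
    assert n % p == 0
    buf = [list(v) for v in vectors]

    def chunk(i):             # slice covering the i-th of p equal chunks
        i %= p
        return slice(i * n // p, (i + 1) * n // p)

    # Phase 1: scatter-reduce. In each of p-1 steps, process r sends one
    # partially reduced chunk to its right neighbor, which adds it in.
    for step in range(p - 1):
        sends = [(r, (r + 1) % p, r - step) for r in range(p)]
        data = [buf[src][chunk(c)] for src, _, c in sends]  # snapshot first
        for (src, dst, c), d in zip(sends, data):
            s = chunk(c)
            buf[dst][s] = [a + b for a, b in zip(buf[dst][s], d)]

    # Phase 2: all-gather. The fully reduced chunks circulate around the
    # ring so every process ends up with the complete summed array.
    for step in range(p - 1):
        sends = [(r, (r + 1) % p, r + 1 - step) for r in range(p)]
        data = [buf[src][chunk(c)] for src, _, c in sends]
        for (src, dst, c), d in zip(sends, data):
            buf[dst][chunk(c)] = d

    return buf
```

Note that each simulated step transfers only n/p elements per process, which is why ring all-reduce keeps per-node bandwidth roughly constant as the number of workers grows.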

When you sit down to think about it: architects experiment and build models in AutoCAD, Geophysicists experiment and build earth models in Schlumberger. But what do Data Scientists have?

I introduce to you the amazing Kubeflow! The platform’s state-of-the-art features, matched with its plethora of tools, let you deploy all your machine learning workflows and models on a Data Science Cluster jam-packed with functionality (such as reproducible experimentation and Kubernetes orchestration).

But that’s where the question pops up: do you know how to migrate to Kubeflow? No? Well, worry not! With this article, we aim to teach you exactly that. Curious…

Both Kubeflow (2018, Google) and Metaflow (2019, Netflix) are great Machine Learning platforms for experimentation, development, and production deployment. Having used both of these, here is my comparative analysis.

For starters, Kubeflow is a project that helps you deploy machine learning workflows on Kubernetes. Metaflow, on the other hand, is a Python library that helps data scientists build and manage real-life data science projects. Both are designed to boost the productivity of data scientists by providing them with state-of-the-art machine learning tooling.

I would recommend watching this video to illustrate the purpose of a Machine Learning Platform:

3-min: https://youtu.be/sdbBcPuvw40

Roman Kazinnik

