By the end of this post we will learn the following components of Active Machine Learning:

  1. Blocking
  2. Iterative labeling:
     • Select a small number of data points for manual labeling
     • Stopping criteria
     • Update Training data and Classifier
  3. Run Classifier for all blocks
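The three components above can be sketched in a few lines of standard-library Python. This is only an illustrative sketch, not the implementation from the post: the blocking key (first character of each record), the string-similarity feature, the one-parameter threshold "classifier", and all function names (`block`, `fit_threshold`, `active_dedupe`, `oracle`) are assumptions made for the example. The `oracle` callable stands in for the human who labels pairs.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def block(records, key=lambda r: r[0].lower()):
    """Blocking: group non-empty string records by a cheap key (assumed: first character)."""
    blocks = defaultdict(list)
    for r in records:
        blocks[key(r)].append(r)
    return blocks


def candidate_pairs(blocks):
    """Only records that share a block are ever compared."""
    return [p for members in blocks.values() for p in combinations(sorted(members), 2)]


def similarity(pair):
    """Single feature for the toy classifier: case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, pair[0].lower(), pair[1].lower()).ratio()


def fit_threshold(labeled):
    """Toy classifier: midpoint between the weakest duplicate and the strongest non-duplicate."""
    pos = [similarity(p) for p, is_dup in labeled.items() if is_dup]
    neg = [similarity(p) for p, is_dup in labeled.items() if not is_dup]
    if not pos or not neg:
        return 0.5  # not enough information yet; keep the initial guess
    return (min(pos) + max(neg)) / 2


def active_dedupe(records, oracle, batch=2, max_rounds=10, tol=0.01):
    pairs = candidate_pairs(block(records))
    labeled, threshold = {}, 0.5
    for _ in range(max_rounds):
        unlabeled = [p for p in pairs if p not in labeled]
        if not unlabeled:
            break
        # Select a small number of pairs the classifier is least sure about.
        unlabeled.sort(key=lambda p: abs(similarity(p) - threshold))
        for p in unlabeled[:batch]:
            labeled[p] = oracle(p)  # manual labeling step
        # Update the training data and the classifier.
        new_threshold = fit_threshold(labeled)
        seen_both_classes = len(set(labeled.values())) == 2
        converged = seen_both_classes and abs(new_threshold - threshold) < tol
        threshold = new_threshold
        # Stopping criterion: the decision boundary has stopped moving.
        if converged:
            break
    # Run the classifier over the candidate pairs of all blocks.
    return [p for p in pairs if similarity(p) >= threshold], threshold


if __name__ == "__main__":
    # Hypothetical ground truth that plays the role of the human labeler.
    entity = {"Jon Smith": 1, "John Smith": 1, "Jane Doe": 2,
              "Jane Do": 2, "Bob Lee": 3, "Bill Ray": 4}
    duplicates, learned = active_dedupe(
        list(entity), oracle=lambda p: entity[p[0]] == entity[p[1]])
    print(duplicates, round(learned, 3))
```

Choosing the pairs closest to the current threshold is a simple form of uncertainty sampling; a real system would likely use richer features and a proper classifier in place of the single threshold.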

Deduplication with Active Learning includes (1) training a…

There are two main considerations when it comes to adopting TFX: value and cost. I want to demonstrate the value of TFX: how it helps with production-level experimentation and with adopting the best standards for model and data validation.

The cost for using TFX is all about locking up Machine…

I’ll do a side-by-side comparison of architectural patterns for the Data Pipeline and the Machine Learning Pipeline and illustrate the principal differences. My main goal is to show the value of deploying dedicated tools and platforms for Machine Learning, such as Kubeflow and Metaflow. …

Although Katib is Kubeflow’s built-in Hyperparameter Search (HS) tool, here is why I chose Keras-Tuner for distributed HS:

  • Keeps codebase independent of Kubeflow
  • Keeps Kubeflow experiments independent of the codebase decisions
  • Keras(TensorFlow)-trained models are also HS-optimized with Keras
  • Keeps HS a part of the codebase with source code…

In this blog post, I’d like to share some of the insights from my work on the High-Performance Computing (HPC) cluster at the Texas Advanced Computing Center (TACC), which circa 2010 had a “Lonestar” cluster with 5200 2Gb-nodes, and within the Technology Group at ConocoPhillips. Specifically, I want to address the…

When you sit down to think about it: architects experiment and build models in AutoCAD, and geophysicists experiment and build earth models in Schlumberger. But what do data scientists have?

I introduce to you the amazing Kubeflow! The platform’s state-of-the-art features, matched with its plethora of tools, mean deploying all…

Both Kubeflow (2018, Google) and Metaflow (2019, Netflix) are great Machine Learning platforms for experimentation, development, and production deployment. Having used both of these, here is my comparative analysis.

For starters, Kubeflow is a project that helps you deploy machine learning workflows on Kubernetes. On the other hand, Metaflow is…

I will talk about convolutional neural networks and how you can optimize their architecture while eliminating redundancy. Let’s get started.

A convolutional neural network, also known as a CNN, is a deep learning algorithm that takes an image as input and weighs the varied objects in the image…

Roman Kazinnik

Always seeking opportunities and challenges to continue developing as a scientist and technical leader.
