Sadman Kabir Soumik
  • So You Want to Become a Machine Learning Engineer/Data Scientist?

    Apr 19, 2025 · 7 min read · machine learning · data science · artificial-intelligence

    I'm a fresh graduate and I want to become a data scientist or machine learning engineer. Can you please give me some guidance? I've been working as a software engineer for the last two years, but now I want to switch to data science. Can you help or share some guidance? I've been applying to many data science jobs but …


    Read More
  • Types of LLM Architectures

    Mar 18, 2025 · 6 min read · machine learning · data science · artificial-intelligence · nlp · system-design

    Let's first break it down: what exactly are large language models (LLMs), why do we call them 'large,' and how are they different from other types of language models? An LLM is a machine learning model trained on massive amounts of text using transformer-based architectures (or their variations). These models can …


    Read More
  • Building an MLOps Pipeline with Apache Airflow (Part 1)

    Feb 17, 2023 · 7 min read · machine learning · MLOps · Airflow

    Author: Sadman Kabir Soumik. Let's first understand what MLOps is. MLOps (Machine Learning Operations) is a set of practices and tools used to manage the entire lifecycle of machine learning models. It includes everything from data preparation and model training to deployment, monitoring, and ongoing …


    Read More
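
As a rough companion to the excerpt above (not the post's actual pipeline), here is a minimal Airflow DAG sketch that chains hypothetical prepare/train/evaluate steps with PythonOperator; the task names and bodies are placeholders.

```python
# A minimal sketch: an Airflow DAG wiring hypothetical ML pipeline stages in order.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def prepare_data():      # placeholder: load and clean the training data
    print("preparing data")


def train_model():       # placeholder: fit a model on the prepared data
    print("training model")


def evaluate_model():    # placeholder: compute metrics before deployment
    print("evaluating model")


with DAG(
    dag_id="ml_pipeline_sketch",
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    prepare = PythonOperator(task_id="prepare_data", python_callable=prepare_data)
    train = PythonOperator(task_id="train_model", python_callable=train_model)
    evaluate = PythonOperator(task_id="evaluate_model", python_callable=evaluate_model)

    prepare >> train >> evaluate  # run the stages in order
```
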
  • From RNN to Transformers (Without Math Jargon)

    Jan 30, 2023 · 13 min read · machine learning · data science · NLP · algorithm

    Transformer-based models are a type of neural network architecture that uses self-attention mechanisms to process input data. They were introduced in the paper "Attention Is All You Need" by Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia …


    Read More
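
Since the post builds from RNNs up to self-attention, here is a tiny NumPy sketch of scaled dot-product self-attention; this is my own illustration, not code from the article.

```python
# Scaled dot-product self-attention over a toy sequence, in plain NumPy.
import numpy as np


def scaled_dot_product_attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)             # how much each token attends to the others
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V                           # weighted mix of the value vectors


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                      # 4 tokens, 8-dimensional embeddings
out = scaled_dot_product_attention(x, x, x)      # self-attention: Q = K = V = x
print(out.shape)                                 # (4, 8)
```
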
  • How to Achieve Perfect Selfie Segmentation and Background Removal

    Jan 16, 2023 · 6 min read · project-tutorial · computer vision · machine learning

    Project goal: perform image segmentation on selfie images, blur the image's background, and even replace the background with some other solid colour like black. We will use a framework called MediaPipe to accomplish this task. About MediaPipe: MediaPipe is an open-source framework …


    Read More
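
A rough sketch of the approach the excerpt describes, using MediaPipe's Selfie Segmentation solution to replace the background with solid black; the file paths are placeholders and the post's actual code may differ.

```python
# Person mask from MediaPipe Selfie Segmentation, then paint the background black.
import cv2
import mediapipe as mp
import numpy as np

image = cv2.imread("selfie.jpg")                       # BGR image from disk (placeholder path)
with mp.solutions.selfie_segmentation.SelfieSegmentation(model_selection=1) as seg:
    results = seg.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

mask = results.segmentation_mask > 0.5                 # True where a person is detected
background = np.zeros_like(image)                      # solid black background
output = np.where(mask[..., None], image, background)  # keep person, replace the rest
cv2.imwrite("selfie_black_bg.jpg", output)
```
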
  • Ace Your Data Science Interview - Top Questions With Answers

    Nov 15, 2022 · 100 min read · deep learning · machine learning · data science

    Can you explain the bias-variance trade-off and how it relates to model performance? The bias-variance trade-off is a fundamental concept in machine learning and statistics: a model's bias must be balanced against its variance. These two types of error can affect a model's performance. Bias, the first type of …


    Read More
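
As a quick, hand-rolled illustration of the bias-variance trade-off (not taken from the post's answers), fitting polynomials of increasing degree shows underfitting and overfitting through train vs. validation error:

```python
# Fit polynomials of increasing degree to noisy data and compare train/validation MSE.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(42)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=200)   # noisy sine wave
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

for degree in (1, 3, 15):                                  # underfit, reasonable fit, overfit
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    train_err = mean_squared_error(y_train, model.predict(X_train))
    val_err = mean_squared_error(y_val, model.predict(X_val))
    print(f"degree={degree:2d}  train MSE={train_err:.3f}  val MSE={val_err:.3f}")
```
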
  • Understanding Top 10 Classical Machine Learning Algorithms

    Oct 26, 2022 · 42 min read · machine learning · data science · algorithms

    Before jumping into deep learning, one should know the classical/traditional machine learning algorithms, because they provide a strong foundation in machine learning concepts. These algorithms often involve simple, intuitive concepts that can be helpful in …


    Read More
  • Machine Learning Model Compression Techniques - Reducing Size and Improving Performance

    Oct 10, 2022 · 6 min read · mlops · optimization · machine learning · data science

    There are four main approaches you can consider for model compression: quantization, pruning, knowledge distillation, and low-rank factorization. Quantization is the most general and commonly used model compression method; it reduces a model's size by using fewer bits to represent its parameters. …


    Read More
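
For a concrete taste of the quantization idea above, here is a minimal sketch using PyTorch's post-training dynamic quantization on a toy model; the post may use different tooling.

```python
# Represent the Linear layers' weights with 8-bit integers instead of 32-bit floats.
import torch
import torch.nn as nn

model = nn.Sequential(               # stand-in for a trained model
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Linear(64, 10),
)

quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(1, 128)
print(quantized(x).shape)            # same interface as before, smaller weights
```
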
  • Multi-class Text Classification Using Apache Spark MLlib

    May 24, 2022 · 7 min read · project-tutorial · NLP · spark · machine learning

    MLlib is Spark's machine learning library, which aims to make machine learning easy to use and scalable for practical applications. It includes tools for common ML tasks, such as classification, regression, clustering, and collaborative filtering, as well as featurization methods for feature extraction, …


    Read More
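
A bare-bones sketch of a Spark MLlib text-classification pipeline in the spirit of the excerpt (Tokenizer, HashingTF/IDF, LogisticRegression); the tiny dataset here is made up for illustration.

```python
# Tokenize text, hash it into term-frequency vectors, weight with IDF, then classify.
from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.feature import Tokenizer, HashingTF, IDF
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("text-clf-sketch").getOrCreate()

train = spark.createDataFrame(
    [("spark is great for big data", 0.0),
     ("i love machine learning", 1.0),
     ("pyspark makes pipelines easy", 0.0)],
    ["text", "label"],
)

tokenizer = Tokenizer(inputCol="text", outputCol="words")
tf = HashingTF(inputCol="words", outputCol="tf")
idf = IDF(inputCol="tf", outputCol="features")
lr = LogisticRegression(maxIter=10)

model = Pipeline(stages=[tokenizer, tf, idf, lr]).fit(train)
model.transform(train).select("text", "prediction").show()
```
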
  • Keyphrase Extraction with BERT Embeddings and Part-Of-Speech Patterns

    May 19, 2022 · 10 min read · project-tutorial · programming · NLP · machine learning · python

    Keyphrases are important pieces of information that can be extracted from text documents. These are words or phrases that summarize the main ideas or topics of a text, and they can be useful for a variety of applications, such as document summarization, text classification, and information retrieval. In this blog post, …


    Read More
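
A condensed sketch of the core idea: embed the document and candidate phrases with a BERT-style encoder and rank candidates by cosine similarity. The post derives candidates from part-of-speech patterns; here they are hard-coded for brevity, and the sentence-transformers model name is an assumption, not necessarily what the post uses.

```python
# Rank candidate phrases by their embedding similarity to the whole document.
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

doc = "Transfer learning with transformer models has improved keyphrase extraction."
candidates = ["transfer learning", "transformer models", "keyphrase extraction", "improved"]

model = SentenceTransformer("all-MiniLM-L6-v2")   # assumed model; any BERT-style encoder works
doc_emb = model.encode([doc])
cand_emb = model.encode(candidates)

scores = cosine_similarity(doc_emb, cand_emb)[0]
ranked = sorted(zip(candidates, scores), key=lambda p: p[1], reverse=True)
print(ranked[:3])                                  # top keyphrase candidates
```
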
  • Understanding the Role of Data Normalization and Standardization in Machine Learning

    Mar 12, 2022 · 2 min read · data science · statistics · machine learning

    Why do we scale features? Not every dataset requires feature scaling; it is only needed when features have very different ranges. For example, consider a dataset containing two features, age (x1) and income (x2), where age ranges from 0 to 100 while income ranges from 0 to 20,000 and higher. Income …


    Read More
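
A tiny illustration of the age/income example above, assuming scikit-learn's StandardScaler and MinMaxScaler (the post's exact code may differ):

```python
# Standardization vs. min-max normalization on an age/income toy matrix.
import numpy as np
from sklearn.preprocessing import StandardScaler, MinMaxScaler

X = np.array([[25, 3_000],
              [40, 12_000],
              [60, 20_000]], dtype=float)   # columns: age (x1), income (x2)

print(StandardScaler().fit_transform(X))    # zero mean, unit variance per feature
print(MinMaxScaler().fit_transform(X))      # each feature rescaled to [0, 1]
```
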
  • One-Stage vs Two-Stage Instance Segmentation

    Mar 4, 2022 · 5 min read · image segmentation · machine learning · computer vision · data science

    In computer vision, image segmentation refers to the process of dividing an image into distinct regions or segments, each corresponding to a different object or background. There are two main approaches to image segmentation: one-stage and two-stage. One-stage image segmentation methods aim to directly predict a …


    Read More
  • Machine Learning Practices - Research vs Production

    Jan 10, 2022 · 5 min read · machine learning · deep learning · data science

    There are several key differences between using machine learning for research and using it for production. One of the main differences is the focus of the work. Machine learning for research typically focuses on exploring new ideas and techniques, and on advancing the state of the art in the field. In contrast, machine …


    Read More
  • Writing Machine Learning Model - PyTorch vs. TF-Keras

    Dec 9, 2021 · 4 min read · machine learning · deep learning · data science

    PyTorch and Keras are both open-source deep learning frameworks, but they have some significant differences. PyTorch is a low-level framework that allows you to define your own computation graphs, while Keras is a high-level framework that provides a pre-defined set of layers and routines for building deep learning …


    Read More
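
As a side-by-side sketch (my own minimal example, not the post's), the same two-layer classifier written with PyTorch modules and with the high-level TF-Keras API:

```python
# The same small classifier in PyTorch (you write the training loop yourself)
# and in Keras (layers plus a built-in compile/fit workflow).
import torch.nn as nn
from tensorflow import keras

torch_model = nn.Sequential(
    nn.Linear(784, 128),
    nn.ReLU(),
    nn.Linear(128, 10),
)

keras_model = keras.Sequential([
    keras.Input(shape=(784,)),
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dense(10),
])
keras_model.compile(optimizer="adam",
                    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True))

print(torch_model)
keras_model.summary()
```
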
  • GPT-3 by OpenAI - The Largest and Most Advanced Language Model Ever Created

    Nov 20, 2021 · 5 min read · machine learning · NLP · data science

    Author: Sadman Kabir Soumik. GPT-3, or Generative Pre-trained Transformer 3, is a state-of-the-art language model developed by OpenAI. It has been trained on a massive amount of text data, including books, articles, and websites, to generate coherent and relevant text based on a given context. GPT-3 is a …


    Read More
  • Vanishing Gradient Problem and How to Fix it

    Oct 24, 2021 · 3 min read · deep learning · machine learning · data science

    What is the vanishing gradient problem? Neural networks are trained using stochastic gradient descent: the prediction error made by the model is calculated first, and that error is used to estimate a gradient, which updates each weight in the network so that less error is made next time. This error gradient is …


    Read More
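
A quick, self-contained way to see the effect described above (not code from the post): compare first-layer gradient magnitudes in a deep sigmoid network versus the same network with ReLU.

```python
# Gradients shrink as they are backpropagated through many sigmoid layers.
import torch
import torch.nn as nn


def deep_net(activation, depth=20, width=32):
    layers = []
    for _ in range(depth):
        layers += [nn.Linear(width, width), activation()]
    return nn.Sequential(*layers)


for name, act in [("sigmoid", nn.Sigmoid), ("relu", nn.ReLU)]:
    torch.manual_seed(0)
    net = deep_net(act)
    x = torch.randn(8, 32)
    net(x).sum().backward()
    first_layer_grad = net[0].weight.grad.abs().mean().item()
    print(f"{name:7s} mean |grad| at first layer: {first_layer_grad:.2e}")
```
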
  • Ensemble Techniques in Machine Learning - A Practical Guide to Bagging, Boosting, Stacking, Blending, and Bayesian Model Averaging

    Jun 10, 2021 · 11 min read · machine learning · data science

    There are several types of ensemble techniques in machine learning, including bagging, boosting, stacking, blending, bootstrapped ensembles, and Bayesian model averaging. Bagging (short for bootstrap aggregating) is an ensemble technique that involves training multiple models on different subsets of the …


    Read More
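
As a minimal illustration of the bagging idea above (my example, assuming scikit-learn), many decision trees are each trained on a bootstrap sample and their votes are aggregated:

```python
# Compare a single decision tree with a bagged ensemble of trees.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

single_tree = DecisionTreeClassifier(random_state=0)
bagged = BaggingClassifier(DecisionTreeClassifier(), n_estimators=100, random_state=0)

print("single tree :", cross_val_score(single_tree, X, y, cv=5).mean())
print("bagged trees:", cross_val_score(bagged, X, y, cv=5).mean())
```
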
  • Understanding the Differences between Decision Tree, Random Forest, and Gradient Boosting

    Mar 27, 2021 · 4 min read · machine learning · algorithm · data science

    Decision Tree, Random Forest (RF), and Gradient Boosting (GB) are three popular algorithms used for supervised learning tasks such as classification and regression. In this blog, we will compare these three algorithms in terms of their features, performance, and usability. Decision Tree is a simple and intuitive …


    Read More
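
A small comparison sketch in the same spirit (my example, not the post's benchmark): the three algorithms on one dataset with scikit-learn defaults.

```python
# Cross-validated accuracy of a decision tree, a random forest, and gradient boosting.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier

X, y = load_breast_cancer(return_X_y=True)

models = {
    "decision tree    ": DecisionTreeClassifier(random_state=0),
    "random forest    ": RandomForestClassifier(n_estimators=200, random_state=0),
    "gradient boosting": GradientBoostingClassifier(random_state=0),
}
for name, model in models.items():
    print(name, cross_val_score(model, X, y, cv=5).mean().round(3))
```
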
  • Different Word Embedding Techniques for Text Analysis

    Dec 11, 2020 · 7 min read · machine learning · NLP · deep learning

    Word embedding is a technique in natural language processing (NLP) where words are represented as vectors of real numbers. This allows words with similar meanings to have similar representations, which can be used in various NLP tasks such as machine translation and text classification. There are several different …


    Read More
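
One of the techniques the post covers, Word2Vec, in a tiny gensim sketch; the toy corpus is made up and the hyperparameters are arbitrary.

```python
# Train Word2Vec on a toy corpus and inspect the learned vectors.
from gensim.models import Word2Vec

sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "rug"],
    ["cats", "and", "dogs", "are", "animals"],
]

model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, epochs=50, seed=1)
print(model.wv["cat"][:5])                   # a word is now a dense vector
print(model.wv.most_similar("cat", topn=3))  # nearby words in embedding space
```
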
  • How A Recurrent Neural Network Works

    Oct 25, 2020 · 8 min read · deep learning · machine learning · NLP · algorithm

    A recurrent neural network (RNN) is a type of neural network that can process sequential data, such as text, audio, or time series data. Here's how it works: first, the RNN takes in some input data, which could be a word in a sentence, a sound wave from an audio recording, or a measurement from a …


    Read More
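
A minimal PyTorch sketch of the loop the excerpt describes: the RNN reads one step at a time and carries a hidden state forward (my illustration, not the post's code).

```python
# Feed a sequence to an RNN one time step at a time, reusing the hidden state.
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=8, hidden_size=16, batch_first=True)

sequence = torch.randn(1, 5, 8)        # batch of 1, 5 time steps, 8 features each
hidden = torch.zeros(1, 1, 16)         # initial hidden state

for t in range(sequence.shape[1]):
    step = sequence[:, t : t + 1, :]
    out, hidden = rnn(step, hidden)    # hidden state carries context to the next step
    print(f"step {t}: hidden mean = {hidden.mean().item():+.3f}")
```
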
    Page 1 of 2


SK Soumik

Data Science | Software Engineering

Categories

MACHINE-LEARNING (25) · SOFTWARE-ENGINEERING (18) · ALGORITHM-DESIGN (10) · SYSTEM-DESIGN (7) · PROJECT (6) · NON-TECHNICAL (2) · OPERATING-SYSTEM (2) · BUSINESS (1) · CLOUD-COMPUTING (1)

Series

ML (19) · DSA (6) · PROJECT (6) · SOFTWARE-ENGINEERING (4) · PROBLEM-SOLVING (3) · AIRFLOW (1) · DJANGO (1) · GENERATIVE-AI (1) · NLP (1)

Tags

MACHINE-LEARNING (23) · DATA-SCIENCE (17) · PROGRAMMING (12) · PYTHON (10) · ALGORITHMS (8) · DEEP-LEARNING (8) · NLP (8) · PROBLEM-SOLVING (6) · PROJECT-TUTORIAL (6) · SOFTWARE-ENGINEERING (6) · ALGORITHM (5) · LEETCODE (5) · SYSTEM-DESIGN (5) · ARTIFICIAL-INTELLIGENCE (3)
All Tags
AIRFLOW (1) · ALGORITHM (5) · ALGORITHMS (8) · ARTIFICIAL-INTELLIGENCE (3) · AUTOMATION (1) · BOT-DEVELOPMENT (1) · BUSINESS (1) · CLOUD (1) · COMPUTER-VISION (2) · DATA-SCIENCE (17) · DEEP-LEARNING (8) · DJANGO (1) · ELASTICSEARCH (1) · GENERATIVE-AI (1) · GENERATIVE-AI-MODELS (1) · GOOGLE-CLOUD (1) · IMAGE-SEGMENTATION (1) · INDEX (1) · LEETCODE (5) · LIFE-HACK (1) · LINUX (2) · MACHINE-LEARNING (23) · MATH (1) · MLOPS (2) · NLP (8) · OPERATING-SYSTEM (1) · OPTIMIZATION (1) · PROBLEM-SOLVING (6) · PRODUCTIVITY (1) · PROGRAMMING (12) · PROJECT-TUTORIAL (6) · PROXY (1) · PYTHON (10) · SOFTWARE-ENGINEERING (6) · SPARK (1) · START-UP (1) · STATISTICS (1) · SYSTEM-DESIGN (5) · WEB-DEVELOPMENT (1) · WEB-SCRAPING (1)
Copyright © 2022, Sadman Kabir Soumik

