Segmentation: definition & types

“Do you know what segmentation is?” Jon asked the magician. The magician replied, “It’s the process of dividing an image into meaningful regions, like giving every pixel a role. So, you know the pixels that… 

YOLO

“The YOLO-scope! It stands for ‘You Only Look Once.’” He aimed the YOLO-scope at a passing dragon. A neat little box appeared on the viewing screen around the creature, with a label that read: ‘Dragon… 

Fast R-CNN

Blaze was fascinated by the tiny details of the world below. He dreamt of a way to instantly recognize every flower, every rock, and every scurrying critter in the meadow as he flew. One day,… 

What’s LangGraph

LangGraph is a powerful, open-source framework for building and managing complex, stateful, and long-running AI agents. It provides a flexible and controllable way to create sophisticated AI workflows by representing them as graphs. At its… 

Perceptual loss

Perceptual loss is a type of loss function used in AI, especially for tasks like creating or changing images. Instead of comparing two images pixel by pixel, it measures the difference between them based on… 

Ollama models that can be run on a laptop

Running large language models locally on a laptop is becoming increasingly feasible, and Ollama makes it accessible. The key to a good experience is choosing a model that matches your laptop’s hardware, primarily its RAM… 

pywin32

pywin32 lets your Python scripts directly control the Windows operating system and its applications. It acts as a bridge, giving you access to the vast Windows Application Programming Interface (API) from within Python. 🤖 Think… 

RAG

RAG stands for Retrieval-Augmented Generation. It’s a powerful technique used in artificial intelligence to make Large Language Models (LLMs) like me more accurate, up-to-date, and trustworthy. The Simple Analogy: An “Open-Book Exam” Think of a… 

RAG with EmbeddingGemma with Python Code using Ollama

Retrieval-Augmented Generation (RAG) is a powerful technique that enhances the capabilities of Large Language Models (LLMs) by connecting them to external knowledge sources. According to the Google Developers website, EmbeddingGemma is a compact, open‑source embedding model… 

AdamW optimization and implementation in PyTorch

The AdamW method was proposed in the paper “Decoupled Weight Decay Regularization” by Ilya Loshchilov and Frank Hutter. While the paper was officially published at the prestigious International Conference on Learning Representations (ICLR) in 2019,… 

Types of Pooling operations

“You said pooling operations in Convolutional Neural Networks (CNNs) are like the magical zoom-out buttons.” “They reduce the size of feature maps while keeping the juicy bits of information. But how?” Peter asked. “There are… 

Why CNNs are so effective

“Professor, why are CNNs so effective?” “CNNs don’t just look at the whole image like a confused tourist; they zoom in on tiny patches (called kernels) and analyze them like Sherlock Holmes inspecting clues.” “Ok. This… 

The CNN Workflow

“What’s the CNN workflow?” Alex asked. Peter replied, “Suppose we have an input image represented as a tensor: a 32×32 pixel image with 3 color channels (Red, Green, Blue) would have a shape of 32×32×3.”… 

ResNet – Residual Network

I’m building a super tall tower out of Lego blocks. Each block is a layer in a neural network. The taller the tower, the more complex patterns it can learn. But the problem is “Tall… 

AlexNet: The CNN That Changed Everything

“Hey Alex, do you know what AlexNet is?” the little spirit asked Alex. “AlexNet is a game changer. Many years ago, everyone was using basic machine learning models to recognize images, and they were… 

VGGNet in the Magic Canvas

“Wow. So this is the canvas that can do image classification and object detection?” Vixel asked. “Yes, I am VGG. VGG stands for Visual Geometry Group,” the Canvas replied. “More exactly, I’m VGG19, which means… 

Object Detection

“I love sorting, especially beautiful mushrooms like this,” Jon thought. “But I heard something about object detection trying to mimic human ability. It combines object localization to create bounding boxes around each object and then… 

Convolution & Filtering

“The world was a dazzling mosaic of colors and shapes, but I wonder if the magical computer sees it differently,” the little fairy thought, and flew to the house of the Great Wizard. “It’s… 

Image Transformations for Data Augmentation

“Professor Elara,” chirped the little kid, “can you show me the magic of Image Transformations?” Professor Elara blinked slowly. “Ok. We begin with Scaling.” With a gentle wave of her wing, Professor Elara conjured a… 

Pixel Operations

Pixels are the smallest units of a digital image — think of them as the individual tiles in a mosaic. Each pixel holds color and intensity information, and by manipulating these values, we can transform… 

Image Formation: Pixels & Color Spaces

In a realm painted with light and shadow, there lived tiny sprites of light called Pixels. They were the weavers of the visual world, each a tiny, glowing dot of energy. The more Pixels that… 

Exploring Computer Vision & The Seeing Machine

“Professor Hoot,” Gizmo chirped, “how does this self-driving car see where it’s going?” Professor Hoot chuckled, his feathers ruffling. “Ah, that’s the magic of Computer Vision, my dear Gizmo! It’s how we teach machines to… 
