AI – Page 10 – Knowledge sparks

s-Permutation

by Kurious Fox
July 18, 2024 | February 15, 2026

Example: password generation. Suppose you are generating a password using the characters A, B, and C. The password must be 3 characters long, and each character can be used only once. Here, the s-permutation will be…
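As a rough illustration of the password example, the sketch below enumerates every 3-character ordering of A, B, and C without repetition. It assumes the post's "s-permutation" corresponds to ordinary permutations without repetition; the variable names are illustrative only.

```python
from itertools import permutations

# Characters available for the password (from the example above)
chars = ["A", "B", "C"]

# Every ordering of length 3 in which no character is repeated
passwords = ["".join(p) for p in permutations(chars, 3)]

print(passwords)
print(len(passwords))  # 3 * 2 * 1 orderings
```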

Quizzes about the product rule

by Kurious Fox
July 15, 2024

Question 1: There are 4 different types of shirts and 3 different types of pants. How many different outfits can you make with one shirt and one pair of pants? A. 7 B. 10 C. 12 D. 15…
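One way to check a product-rule count like the one in Question 1 is to enumerate the pairs directly. This is a minimal sketch; the shirt and pants labels are placeholders, not names from the quiz.

```python
from itertools import product

# Placeholder labels for the 4 shirt types and 3 pants types
shirts = [f"shirt_{i}" for i in range(1, 5)]
pants = [f"pants_{j}" for j in range(1, 4)]

# Each outfit pairs one shirt with one pair of pants
outfits = list(product(shirts, pants))

# The product rule predicts len(shirts) * len(pants) outfits
print(len(outfits), len(shirts) * len(pants))
```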

Quizzes: Complementary rules of probability

by Kurious Fox
July 15, 2024

More quizzes: In a survey, 80% of respondents prefer Brand A over Brand B. If a respondent is selected at random, what is the probability that they prefer Brand B? In a quality control test,…
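The complement rule behind the survey question can be sketched as follows. It assumes every respondent prefers exactly one of the two brands, so "prefers Brand B" is the complement of "prefers Brand A"; the helper name `complement` is illustrative.

```python
def complement(p_event: float) -> float:
    """Return P(not A) = 1 - P(A) for a probability P(A) in [0, 1]."""
    if not 0.0 <= p_event <= 1.0:
        raise ValueError("a probability must lie between 0 and 1")
    return 1.0 - p_event

# Assumption: each respondent prefers exactly one of Brand A or Brand B
p_brand_a = 0.80
p_brand_b = complement(p_brand_a)
print(round(p_brand_b, 2))
```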

Quizzes: Classical definition of probability

by Kurious Fox
July 15, 2024

If you flip a fair coin, what is the probability that it will land on heads? In a class of 30 students, 18 are girls and 12 are boys. What is the probability that a…
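Under the classical definition, a probability is the number of favorable outcomes divided by the number of equally likely outcomes. The sketch below applies that to the coin question only; the function name is an assumption made for illustration.

```python
from fractions import Fraction

def classical_probability(favorable, sample_space):
    """Favorable outcomes / total outcomes, assuming all are equally likely."""
    space = set(sample_space)
    hits = set(favorable) & space  # ignore outcomes outside the sample space
    return Fraction(len(hits), len(space))

# A fair coin has two equally likely outcomes
coin = {"heads", "tails"}
p_heads = classical_probability({"heads"}, coin)
print(p_heads)
```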

Encoding categorical data in python

by Kurious Fox
July 10, 2024 | October 12, 2024

Handling categorical data involves several steps to convert it into a format that machine learning algorithms can process effectively. Here are common methods used to handle categorical data: 1. Label Encoding Label encoding converts categorical…
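To make the idea of label encoding concrete, here is a small standard-library sketch that maps each distinct category to an integer code. It is not the post's own code (which may rely on a library encoder); the function name and the color data are hypothetical.

```python
def label_encode(values):
    """Map each distinct category to an integer code (alphabetical order)."""
    categories = sorted(set(values))
    mapping = {category: code for code, category in enumerate(categories)}
    return [mapping[v] for v in values], mapping

# Hypothetical categorical column
colors = ["red", "green", "blue", "green", "red"]
codes, mapping = label_encode(colors)
print(mapping)
print(codes)
```

Note that the integer codes imply an ordering that the categories may not actually have, which is one reason other encodings are often considered alongside this one.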

Example of stack generalization using K-Nearest Neighbors (KNN) and Random Forest + Python codes

by Kurious Fox
July 9, 2024 | August 18, 2024

This example demonstrates the basic steps of stack generalization with two classifiers (KNN and Random Forest) and a Logistic Regression model as the meta-learner. The predictions of the base models on the training data are…

Kernel tricks, SVM properties & kernel choice

by Kurious Fox
June 30, 2024 | February 15, 2026

Some popular types of kernels in SVM: 1. Linear Kernel 2. Polynomial Kernel 3. Radial Basis Function (RBF) Kernel (Gaussian Kernel) 4. Sigmoid Kernel. Visualizing the decision boundaries: To visualize the decision boundaries, we’ll use…

Exponential distribution

by Kurious Fox
June 26, 2024 | February 16, 2026

This song helps us better remember the properties of the exponential distribution. The exponential distribution models the time between events in a Poisson process, where events occur independently at a constant average rate. Key features include its probability density and cumulative distribution functions, mean, variance, and memoryless property. It has applications in queueing theory, reliability engineering, and survival analysis.
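The properties listed above can be checked numerically. The following is a minimal sketch using the standard rate parameterization (density λ·exp(−λx) for x ≥ 0); the rate value and the times `s` and `t` are arbitrary choices for illustration.

```python
import math

def exp_pdf(x, rate):
    """Density f(x) = rate * exp(-rate * x) for x >= 0, else 0."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0

def exp_cdf(x, rate):
    """Distribution function F(x) = 1 - exp(-rate * x) for x >= 0, else 0."""
    return 1.0 - math.exp(-rate * x) if x >= 0 else 0.0

def exp_survival(x, rate):
    """P(X > x) = 1 - F(x)."""
    return 1.0 - exp_cdf(x, rate)

rate = 2.0                 # arbitrary example rate (events per unit time)
mean = 1.0 / rate          # E[X] = 1 / rate
variance = 1.0 / rate**2   # Var[X] = 1 / rate^2

# Memoryless property: P(X > s + t | X > s) equals P(X > t)
s, t = 0.5, 1.5
conditional = exp_survival(s + t, rate) / exp_survival(s, rate)
unconditional = exp_survival(t, rate)
print(mean, variance, conditional, unconditional)
```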

Phân phối mũ (Exponential distribution)

by Kurious Fox
June 25, 2024 | August 18, 2024

The exponential distribution is an important probability distribution in probability theory and statistics. It is used to describe the time between events that occur…

Logistic Regression: method + Python & R codes

by Kurious Fox
June 23, 2024 | February 15, 2026

Logistic regression & Bernoulli distribution: Logistic regression is a statistical method used for analyzing datasets in which there are one or more independent variables that determine an outcome. The outcome is typically a binary variable,…

AIC and BIC for Feature Selection

by Kurious Fox
June 22, 2024 | February 15, 2026

Akaike Information Criterion (AIC); Bayesian Information Criterion (BIC); Comparison and Use in Feature Selection. By applying AIC and BIC in feature selection, we can make informed decisions about which features to include in our models,…
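For reference, the two criteria can be computed from a model's maximized log-likelihood using the standard definitions AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L. The sketch below uses made-up log-likelihood values for two hypothetical candidate models; lower values are preferred.

```python
import math

def aic(log_likelihood, k):
    """AIC = 2k - 2 ln L, where k is the number of estimated parameters."""
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    """BIC = k ln(n) - 2 ln L, where n is the number of observations."""
    return k * math.log(n) - 2 * log_likelihood

# Hypothetical candidates: (maximized log-likelihood, number of parameters)
n = 100
candidates = {"small_model": (-120.0, 3), "large_model": (-118.5, 6)}

for name, (ll, k) in candidates.items():
    print(name, round(aic(ll, k), 2), round(bic(ll, k, n), 2))
```

With these invented numbers, both criteria favor the smaller model, and BIC penalizes the extra parameters more heavily because ln(100) exceeds 2.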

KNN classification: practical notices & implementation using Python & R

by Kurious Fox
June 22, 2024 | February 15, 2026

We should normalize or standardize data before applying KNN because the algorithm is distance-based, and unscaled features can distort distance calculations, leading to biased results. In this example, we’ll use the Iris dataset, which is…
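A quick way to see why scaling matters is to standardize each feature to zero mean and unit variance before computing distances. This is a standard-library sketch, not the post's Iris code; the two-feature data (height in cm, income in dollars) is hypothetical.

```python
import math
from statistics import mean, pstdev

def standardize(rows):
    """Z-score each column: (value - column mean) / column std deviation."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    stds = [pstdev(c) for c in columns]
    return [
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, stds)]
        for row in rows
    ]

# Hypothetical samples: [height in cm, income in dollars]
rows = [[170.0, 30000.0], [180.0, 32000.0], [160.0, 90000.0]]
scaled = standardize(rows)

# Before scaling, the income column dominates the Euclidean distance
print(math.dist(rows[0], rows[1]), math.dist(scaled[0], scaled[1]))
```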

K-Nearest Neighbors (KNN): an introduction

by Kurious Fox
June 22, 2024 | February 15, 2026

K-Nearest Neighbors (KNN) is a popular algorithm used for both classification and regression tasks. In KNN classification, the output is a class membership, which is assigned by majority vote among the k nearest data points.…
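The majority-vote idea can be sketched in a few lines of plain Python. This is an illustrative toy implementation with Euclidean distance and invented training points, not an optimized or library-backed one.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Invented 2-D training data with two well-separated classes
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_y = ["a", "a", "a", "b", "b", "b"]

pred = knn_predict(train_X, train_y, (1.1, 1.0), k=3)
print(pred)
```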

Linear Discriminant Analysis Implementation in Python & R

by Kurious Fox
June 21, 2024 | August 18, 2024

Linear Discriminant Analysis (LDA) is a classifier that creates a linear decision boundary by fitting class-conditional densities to the data and applying Bayes’ rule. The model assumes that each class follows a Gaussian distribution with…

Stepwise Feature Selection + example

by Kurious Fox
June 20, 2024 | February 15, 2026

Stepwise feature selection is a systematic approach to identifying the most relevant features for a predictive model by combining both forward and backward selection techniques. The process begins with an empty model. Then, we…

Backward feature selection + example

by Kurious Fox
June 20, 2024 | February 15, 2026

Backward feature selection iteratively removes the least significant feature from a model, here judged by adjusted R-squared. In this example, which predicts the number of nuts collected by squirrels, features such as temperature and rainfall are retained as significant predictors. The process ends with a model that contains only the most influential features.
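The adjusted R-squared criterion mentioned above penalizes model size via 1 − (1 − R²)(n − 1)/(n − p − 1). The sketch below shows how dropping a weak feature can raise it; the R² values and sample size are invented for illustration and are not taken from the squirrel example.

```python
def adjusted_r2(r2, n_samples, n_features):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_samples - n_features - 1
    if dof <= 0:
        raise ValueError("need more samples than features + 1")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof

# Invented numbers: removing one weak feature barely lowers R^2
full_model = adjusted_r2(0.800, n_samples=30, n_features=5)
reduced_model = adjusted_r2(0.795, n_samples=30, n_features=4)

# A backward step would keep the removal if adjusted R^2 does not get worse
print(round(full_model, 4), round(reduced_model, 4))
```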

Forward feature selection: a step by step example

by Kurious Fox
June 20, 2024 | February 15, 2026

Forward feature selection starts with an empty model and adds features one by one. At each step, the feature that improves the model performance the most is added to the model. The process continues until…

ElasticNet Regression: Method & Codes

by Kurious Fox
June 18, 2024 | February 15, 2026

ElasticNet regression is a regularized regression method that linearly combines both L1 and L2 penalties of the Lasso and Ridge methods. This allows it to perform both feature selection (like Lasso) and maintain some of…
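The combined penalty can be written as λ₁·Σ|βⱼ| + λ₂·Σβⱼ². The sketch below computes it for an arbitrary coefficient vector; the parameter names `lambda_l1` and `lambda_l2` are illustrative, and specific libraries may parameterize the mix differently.

```python
def elastic_net_penalty(coefs, lambda_l1, lambda_l2):
    """Linear combination of the L1 (Lasso) and L2 (Ridge) penalties."""
    l1 = sum(abs(b) for b in coefs)   # sum of absolute coefficients
    l2 = sum(b * b for b in coefs)    # sum of squared coefficients
    return lambda_l1 * l1 + lambda_l2 * l2

# Arbitrary example coefficients
coefs = [1.5, -2.0, 0.0]

lasso_only = elastic_net_penalty(coefs, lambda_l1=0.1, lambda_l2=0.0)
ridge_only = elastic_net_penalty(coefs, lambda_l1=0.0, lambda_l2=0.1)
combined = elastic_net_penalty(coefs, lambda_l1=0.1, lambda_l2=0.1)
print(lasso_only, ridge_only, combined)
```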

Ridge regression: method & R codes

by Kurious Fox
June 18, 2024 | February 15, 2026

Motivation: Now, recall that for LASSO… Ridge regression: Ridge adds a penalty, the sum of the squares of the coefficients, to the loss function in linear regression. Ridge regression shrinks the…

Lasso Regression and LassoCV: methods & Python codes

by Kurious Fox
June 18, 2024 | April 22, 2025

The Lasso (Least Absolute Shrinkage and Selection Operator) is a regression technique that enhances prediction accuracy and interpretability by applying L1 regularization to shrink coefficients. Unlike traditional regression methods, Lasso forces some coefficients to become…

Simple linear regression using train-test split in Python & R

by Kurious Fox
June 17, 2024 | October 12, 2025

An example of performing simple linear regression using a train-test split. The process is as follows: 1. Generate a synthetic dataset. 2. Split the dataset: we use train_test_split to divide the data into training and…

Combining datasets to increase sample size

by Kurious Fox
June 16, 2024 | December 31, 2024

Detailed information can be found in “Combining datasets to improve model fitting” or its presentation slides. Summary: The key points of the paper titled “Combining Datasets to Improve Model Fitting” are as follows: Problem and…

Expectation Maximization (EM) & implementation

by Kurious Fox
June 14, 2024 | October 12, 2024

Expectation Maximization (EM) is an iterative algorithm used for finding maximum likelihood estimates of parameters in statistical models, particularly when the model involves latent variables (variables that are not directly observed). The algorithm is commonly…

A comic guide to denoising noisy data

by Kurious Fox
June 13, 2024 | October 12, 2024

Handling noisy data is a crucial step in data preprocessing and analysis. In general, here are some common approaches to manage noisy data: 1. Data Cleaning 2. Data Transformation 3. Statistical Techniques 4. Machine Learning…

A comical guide to Missing Not At Random (MNAR)

by Kurious Fox
June 13, 2024 | October 12, 2024

Recall that Missing Not At Random (MNAR) is a type of missing data mechanism where the probability of missingness is related to the unobserved data itself. Here are some more examples of MNAR: In each…

What’s Missing at Random (MAR)?

by Kurious Fox
June 13, 2024 | October 12, 2025

Missing at Random (MAR) is a statistical term indicating that the likelihood of data being missing is related to some of the observed data but not to the missing data itself. This means that the…

Multiple regression analysis: waiting time to log in to Windows

by Kurious Fox
June 11, 2024 | August 18, 2024

Multiple regression analysis can be used to understand the relationship between the waiting time to log in to Windows (dependent variable) and several independent variables. Let’s assume we have the following independent variables: Suppose that…

The success rates of Cupid’s arrows

by Kurious Fox
June 2, 2024 | September 20, 2024

I advised a master’s student to use the binomial probability formula to determine the likelihood of attracting the affection of 15 girls, with Cupid’s success rate at 0.7. The analysis shows that the highest probability of success occurs when 10 girls reciprocate love, with a probability of 0.33.
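The binomial formula referred to here is P(X = k) = C(n, k)·pᵏ·(1 − p)ⁿ⁻ᵏ. The sketch below evaluates it under one possible reading of the summary, namely n = 15 independent attempts with success probability p = 0.7; that reading is an assumption. Under it, the most likely count is 11 (probability about 0.22), so the figures quoted in the summary likely depend on details of the full post that are not shown here.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Assumed setup: 15 independent attempts, each succeeding with probability 0.7
n, p = 15, 0.7
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

# The most likely number of successes and its probability
mode = max(range(n + 1), key=lambda k: pmf[k])
print(mode, round(pmf[mode], 4))
```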

Grazing the maze of probability

by Kurious Fox
May 31, 2024 | September 15, 2024

Supplementary materials for the section “Grazing the maze of probability & A random variable mood” in the KSML app: Basic rules of probability; Mutually exclusive events; Conditional probability for medical testing in a forest. The conditional probability…

Estimating the sparse inverse covariance matrix (precision matrix) by Graphical Lasso (with Python implementation)

by Kurious Fox
May 28, 2024 | May 5, 2025

Graphical Lasso, also known as GLasso, is a statistical technique used for estimating the sparse inverse covariance matrix (precision matrix) of a multivariate Gaussian distribution. Here, sparsity means that many elements of the matrix are…

Generating missing data and evaluating missing data analysis in Python

by Kurious Fox
May 28, 2024 | October 12, 2024

Generating missing values: generating missing values with a given percentage of missingness for a dataframe or numpy array; generating missing values with a given missing rate for a time series list. Calculating MSE ignoring missing…
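The two tasks named above can be sketched with plain Python lists, using None as the missing marker. This is a simplified stand-in for the post's dataframe and numpy versions; the function names, the seed, and the sample series are assumptions for illustration.

```python
import random

def add_missing(values, missing_rate, seed=0):
    """Replace a given fraction of entries with None, chosen at random."""
    rng = random.Random(seed)
    n_missing = round(missing_rate * len(values))
    missing_idx = set(rng.sample(range(len(values)), n_missing))
    return [None if i in missing_idx else v for i, v in enumerate(values)]

def mse_ignoring_missing(y_true, y_pred):
    """Mean squared error over positions where both values are present."""
    pairs = [(t, p) for t, p in zip(y_true, y_pred)
             if t is not None and p is not None]
    if not pairs:
        raise ValueError("no overlapping observed values")
    return sum((t - p) ** 2 for t, p in pairs) / len(pairs)

# Hypothetical time series with 30% of its values removed
series = [float(i) for i in range(10)]
with_gaps = add_missing(series, missing_rate=0.3)
print(with_gaps)
print(mse_ignoring_missing([1.0, None, 3.0], [1.0, 2.0, 5.0]))
```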

The conditional probability of Tom finding Jerry

by Kurious Fox
May 27, 2024 | September 20, 2024

My all-time favourite catch is “JERRY catching TOM!” Little Jerry is so smart, and do you know that he knows probability as well? One day, Jerry was thinking, “Hmm, every time Tom chases…
