Question 1: There are 4 different types of shirts and 3 different types of pants. How many different outfits can you make with one shirt and one pair of pants? A. 7  B. 10  C. 12  D. 15… Quizzes about the product rule
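(Worked answer, sketching the product rule: each of the 4 shirts pairs with each of the 3 pants, so there are 4 × 3 = 12 outfits, choice C.)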
More quizzes: In a survey, 80% of respondents prefer Brand A over Brand B. If a respondent is selected at random, what is the probability that they prefer Brand B? In a quality control test,… Quizzes: Complementary rules of probability
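(Worked answer via the complement rule: P(prefers Brand B) = 1 − P(prefers Brand A) = 1 − 0.8 = 0.2.)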
If you flip a fair coin, what is the probability that it will land on heads? In a class of 30 students, 18 are girls and 12 are boys. What is the probability that a… Quizzes: Classical definition of probability
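(Worked answers via the classical definition: P(heads) = 1/2 for a fair coin, and P(a randomly chosen student is a girl) = 18/30 = 0.6.)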
Handling categorical data involves several steps to convert it into a format that machine learning algorithms can process effectively. Here are common methods used to handle categorical data: 1. Label Encoding: label encoding converts categorical… Encoding categorical data in Python
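As a quick illustration of two common encodings, here is a minimal Python sketch (the 'color' column is a hypothetical example feature, not from the post):

```python
# A minimal sketch of label encoding and one-hot encoding on a toy column.
import pandas as pd
from sklearn.preprocessing import LabelEncoder

df = pd.DataFrame({"color": ["red", "green", "blue", "green"]})

# 1. Label encoding: map each category to an integer (blue=0, green=1, red=2)
df["color_label"] = LabelEncoder().fit_transform(df["color"])

# 2. One-hot encoding: one binary indicator column per category
df_onehot = pd.get_dummies(df[["color"]], columns=["color"])

print(df)
print(df_onehot)
```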
This example demonstrates the basic steps of stack generalization with two classifiers (KNN and Random Forest) and a Logistic Regression model as the meta-learner. The predictions of the base models on the training data are… example of stack generalization using K-Nearest Neighbors (KNN) and Random Forest + Python codes
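A minimal sketch of that setup, assuming the Iris data and scikit-learn's StackingClassifier stand in for the post's exact code:

```python
# Stacked generalization: KNN + Random Forest base models, Logistic Regression meta-learner.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=42)

stack = StackingClassifier(
    estimators=[("knn", KNeighborsClassifier(n_neighbors=5)),
                ("rf", RandomForestClassifier(random_state=42))],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,  # the meta-learner trains on cross-validated base-model predictions
)
stack.fit(X_tr, y_tr)
print("test accuracy:", stack.score(X_te, y_te))
```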
Some popular types of kernels in SVM: 1. Linear Kernel 2. Polynomial Kernel 3. Radial Basis Function (RBF) Kernel (Gaussian Kernel) 4. Sigmoid Kernel Visualizing the decision boundaries To visualize the decision boundaries, we’ll use… Kernel tricks, SVM properties & kernel choice
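Before the visualization, a minimal sketch comparing the four kernels on a toy dataset (make_moons and the hyperparameters are assumptions for the demo, not the post's figure code):

```python
# Fit an SVM with each of the four kernel types and compare training accuracy.
from sklearn.datasets import make_moons
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

for kernel in ["linear", "poly", "rbf", "sigmoid"]:
    clf = SVC(kernel=kernel, gamma="scale").fit(X, y)
    print(f"{kernel:8s} training accuracy: {clf.score(X, y):.2f}")
```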
This song helps us better remember the properties of the exponential distribution. The exponential distribution models time between events in a Poisson process, where occurrences are independent at a constant rate. Key features include its probability density and cumulative distribution functions, mean, variance, and memoryless property. It has applications in queueing theory, reliability engineering, and survival analysis.
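For reference, the properties the song covers can be written out as follows (standard facts, with rate parameter \(\lambda > 0\)):

```latex
% PDF and CDF of the exponential distribution, for x >= 0:
f(x) = \lambda e^{-\lambda x}, \qquad F(x) = 1 - e^{-\lambda x}
% Mean and variance:
\mathbb{E}[X] = \frac{1}{\lambda}, \qquad \operatorname{Var}(X) = \frac{1}{\lambda^{2}}
% Memoryless property:
P(X > s + t \mid X > s) = P(X > t) \quad \text{for all } s, t \ge 0
```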
The exponential distribution is an important probability distribution in probability theory and statistics. It is used to describe the time between events that occur… Phân phối mũ (The exponential distribution)
Logistic regression is a statistical method used for analyzing datasets in which there are one or more independent variables that determine an outcome. The outcome is typically a binary variable, meaning it has two possible… Logistic Regression: method + Python & R codes
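A minimal sketch of fitting a binary logistic regression in Python (the breast-cancer dataset and the split are illustrative assumptions, not the post's exact example):

```python
# Logistic regression on a binary outcome, with a predicted probability.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=5000).fit(X_tr, y_tr)
print("test accuracy:", model.score(X_te, y_te))
print("P(class=1) for first test row:", model.predict_proba(X_te[:1])[0, 1])
```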
Akaike Information Criterion (AIC) Bayesian Information Criterion (BIC) Comparison and Use in Feature Selection By applying AIC and BIC in feature selection, we can make informed decisions about which features to include in our models,… AIC and BIC for Feature Selection
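A minimal sketch of that comparison, assuming statsmodels OLS and synthetic data where y depends on x1 and x2 but not x3:

```python
# Compare feature subsets by AIC and BIC; lower is better for both.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(200, 3)), columns=["x1", "x2", "x3"])
df["y"] = 2 * df.x1 - df.x2 + rng.normal(scale=0.5, size=200)

for cols in [["x1"], ["x1", "x2"], ["x1", "x2", "x3"]]:
    fit = sm.OLS(df["y"], sm.add_constant(df[cols])).fit()
    print(cols, "AIC:", round(fit.aic, 1), "BIC:", round(fit.bic, 1))
# The irrelevant x3 should not meaningfully lower either criterion.
```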
We should normalize or standardize data before applying KNN because the algorithm is distance-based, and unscaled features can distort distance calculations, leading to biased results. In this example, we’ll use the Iris dataset, which is… KNN classification: practical notices & implementation using Python & R
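A minimal sketch of the scaling point on the Iris data the post mentions (the pipeline details are assumptions, not the post's exact code):

```python
# Compare KNN cross-validation accuracy with and without standardization.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

raw = KNeighborsClassifier(n_neighbors=5)
scaled = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))

print("unscaled CV accuracy:", cross_val_score(raw, X, y, cv=5).mean())
print("scaled   CV accuracy:", cross_val_score(scaled, X, y, cv=5).mean())
```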
K-Nearest Neighbors (KNN) is a popular algorithm used for both classification and regression tasks. In KNN classification, the output is a class membership, assigned by majority vote among the k nearest data points.… K-Nearest Neighbors (KNN): an introduction
Linear Discriminant Analysis (LDA) is a classifier that creates a linear decision boundary by fitting class-conditional densities to the data and applying Bayes’ rule. The model assumes that each class follows a Gaussian distribution with… Linear Discriminant Analysis Implementation in Python & R
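A minimal sketch of LDA as a classifier (the wine dataset and split are illustrative assumptions, not the post's exact code):

```python
# LDA fits Gaussian class-conditional densities with a shared covariance,
# yielding a linear decision boundary via Bayes' rule.
from sklearn.datasets import load_wine
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

X, y = load_wine(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

lda = LinearDiscriminantAnalysis().fit(X_tr, y_tr)
print("test accuracy:", lda.score(X_te, y_te))
```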
Stepwise feature selection is a systematic approach to identifying the most relevant features for a predictive model by combining both forward and backward selection techniques. The process begins with either an empty model or a full model. Then, we… Stepwise Feature Selection + example
Backward feature selection involves iteratively removing the least significant feature from a model based on adjusted R-squared. In this example, where we predict the number of nuts collected by squirrels, features like temperature and rainfall are chosen as significant predictors through this method. The process aims to finalize a model with the most influential features.
Forward feature selection starts with an empty model and adds features one by one. At each step, the feature that improves the model performance the most is added to the model. The process continues until… Forward feature selection: a step by step example
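The forward and backward strategies in the three posts above can both be sketched with scikit-learn's SequentialFeatureSelector; note this scores candidates by cross-validation rather than adjusted R-squared, and the dataset and feature count here are illustrative assumptions:

```python
# Greedy feature selection in both directions with one flag flipped.
from sklearn.datasets import load_diabetes
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression

X, y = load_diabetes(return_X_y=True)

for direction in ["forward", "backward"]:
    sfs = SequentialFeatureSelector(
        LinearRegression(), n_features_to_select=4, direction=direction
    ).fit(X, y)
    print(direction, "selected feature indices:", sfs.get_support(indices=True))
```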
ElasticNet regression is a regularized regression method that linearly combines both L1 and L2 penalties of the Lasso and Ridge methods. This allows it to perform both feature selection (like Lasso) and maintain some of… ElasticNet Regression: Method & Codes
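A minimal sketch of ElasticNet's combined penalty (the synthetic data and l1_ratio value are illustrative assumptions):

```python
# ElasticNet mixes L1 and L2 penalties; l1_ratio=0.5 weights them equally.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet

X, y = make_regression(n_samples=200, n_features=10, n_informative=4,
                       noise=5.0, random_state=0)

enet = ElasticNet(alpha=1.0, l1_ratio=0.5).fit(X, y)
print("nonzero coefficients:", np.sum(enet.coef_ != 0), "of", X.shape[1])
```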
Motivation: recall the LASSO penalty, and compare it with Ridge regression. Ridge adds a penalty, the sum of the squares of the coefficients, to the loss function in linear regression. Ridge regression shrinks the… Ridge regression: method & R codes
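Written out, the penalized least-squares objective the excerpt describes is (standard notation, with tuning parameter \(\lambda \ge 0\)):

```latex
\hat{\beta}^{\text{ridge}}
= \arg\min_{\beta}
\sum_{i=1}^{n} \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Bigr)^{2}
+ \lambda \sum_{j=1}^{p} \beta_j^{2}
```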
The Lasso (Least Absolute Shrinkage and Selection Operator) is a regression technique that enhances prediction accuracy and interpretability by applying L1 regularization to shrink coefficients. Unlike traditional regression methods, Lasso forces some coefficients to become… Lasso Regression and LassoCV: methods & Python codes
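A minimal sketch of Lasso with the penalty strength chosen by cross-validation via LassoCV (the synthetic data is an illustrative assumption):

```python
# LassoCV picks alpha by CV; L1 regularization zeroes out weak coefficients.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV

X, y = make_regression(n_samples=200, n_features=20, n_informative=5,
                       noise=10.0, random_state=0)

lasso = LassoCV(cv=5).fit(X, y)
print("chosen alpha:", lasso.alpha_)
print("coefficients driven exactly to zero:", np.sum(lasso.coef_ == 0))
```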
An example of performing simple linear regression using a train-test split, where the process is as follows: 1. Generate a synthetic dataset. 2. Split the dataset: we use train_test_split to divide the data into training and… Simple linear regression using train-test split in Python & R
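A minimal sketch of those steps (the synthetic line y = 3x + 2 and the split sizes are illustrative assumptions):

```python
# 1. Generate synthetic data, 2. split, 3. fit, 4. evaluate on held-out data.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(100, 1))
y = 3 * X.ravel() + 2 + rng.normal(scale=1.0, size=100)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)
model = LinearRegression().fit(X_tr, y_tr)
print("slope:", model.coef_[0], "intercept:", model.intercept_)
print("test MSE:", mean_squared_error(y_te, model.predict(X_te)))
```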
Expectation Maximization (EM) is an iterative algorithm used for finding maximum likelihood estimates of parameters in statistical models, particularly when the model involves latent variables (variables that are not directly observed). The algorithm is commonly… Expectation Maximization (EM) & implementation
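A minimal sketch using scikit-learn's GaussianMixture, which fits mixture parameters by EM with component membership as the latent variable (the two-component 1-D data is an illustrative assumption, not the post's implementation):

```python
# Recover the means and weights of a two-component Gaussian mixture by EM.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 0.5, 200)])

gmm = GaussianMixture(n_components=2, random_state=0).fit(data.reshape(-1, 1))
print("estimated means:", gmm.means_.ravel())
print("estimated weights:", gmm.weights_)
```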
Handling noisy data is a crucial step in data preprocessing and analysis. In general, here are some common approaches to manage noisy data: 1. Data Cleaning 2. Data Transformation 3. Statistical Techniques 4. Machine Learning… A comic guide to denoising noisy data
Recall that Missing Not At Random (MNAR) is a type of missing data mechanism where the probability of missingness is related to the unobserved data itself. Here are some more examples of MNAR: In each… A comical guide to Missing Not At Random (MNAR)
Missing at Random (MAR) is a statistical term indicating that the likelihood of data being missing is related to some of the observed data but not to the missing data itself. This means that the… What’s Missing at Random (MAR)?
Multiple regression analysis can be used to understand the relationship between the waiting time to log in to Windows (dependent variable) and several independent variables. Let’s assume we have the following independent variables: Suppose that… Multiple regression analysis: waiting time to log in to Windows
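A minimal sketch of such a model with statsmodels; since the post's variable list is truncated above, the predictor names here (num_startup_apps, cpu_load, free_ram_gb) are hypothetical stand-ins:

```python
# Multiple regression: waiting time as a function of several predictors.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "num_startup_apps": rng.integers(1, 20, 100),  # hypothetical predictor
    "cpu_load": rng.uniform(0, 1, 100),            # hypothetical predictor
    "free_ram_gb": rng.uniform(1, 16, 100),        # hypothetical predictor
})
df["wait_seconds"] = (2 * df.num_startup_apps + 30 * df.cpu_load
                      - 1.5 * df.free_ram_gb + rng.normal(0, 3, 100))

fit = sm.OLS(df["wait_seconds"],
             sm.add_constant(df.drop(columns="wait_seconds"))).fit()
print(fit.summary())
```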
I advised a master’s student to use the binomial probability formula to determine the likelihood of attracting the affection of 15 girls, with Cupid’s success rate at 0.7. The analysis shows that the most likely outcome is 11 girls reciprocating love, with a probability of about 0.22.
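A quick check of that computation with scipy (n = 15 trials, p = 0.7):

```python
# Find the mode of Binomial(n=15, p=0.7) and its probability.
from scipy.stats import binom

n, p = 15, 0.7
k_star = max(range(n + 1), key=lambda k: binom.pmf(k, n, p))
print("most likely count:", k_star)                           # 11
print("its probability:", round(binom.pmf(k_star, n, p), 3))  # ~0.219
```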
Supplementary materials for the sections Grazing the maze of probability & A random variable mood in the KSML app: Basic rules of probability: mutually exclusive events. Conditional probability for medical testing in a forest. The conditional probability… Grazing the maze of probability
Graphical Lasso, also known as GLasso, is a statistical technique used for estimating the sparse inverse covariance matrix (precision matrix) of a multivariate Gaussian distribution. Here, sparsity means that many elements of the matrix are… Estimating the sparse inverse covariance matrix (precision matrix) by Graphical Lasso (with Python implementation)
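A minimal sketch with scikit-learn's GraphicalLasso (the synthetic Gaussian sample and penalty alpha are illustrative assumptions):

```python
# Estimate a sparse precision matrix; the L1 penalty zeroes weak partial correlations.
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)
X = rng.multivariate_normal(np.zeros(4), np.eye(4), size=500)

model = GraphicalLasso(alpha=0.2).fit(X)
precision = model.precision_
print("zeros in the estimated precision matrix:", np.sum(precision == 0))
```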
Generating missing values: (1) with a given percentage of missingness for a dataframe or numpy array; (2) with a given missing rate for a time series list. Calculating MSE ignoring missing… Generating missing data and evaluating missing data analysis in Python
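A minimal sketch of the two utilities the excerpt describes; the function names and signatures here are assumptions, not the post's actual code:

```python
# Inject NaNs at a given rate, then score an imputation only where data was removed.
import numpy as np

def add_missing(arr, rate, seed=0):
    """Set roughly a given fraction of entries in a float array to NaN."""
    rng = np.random.default_rng(seed)
    out = arr.astype(float).copy()
    out[rng.random(out.shape) < rate] = np.nan
    return out

def mse_ignore_nan(y_true, y_imputed, y_missing):
    """MSE computed only over the positions that were made missing."""
    mask = np.isnan(y_missing)
    return float(np.mean((y_true[mask] - y_imputed[mask]) ** 2))

y = np.arange(10, dtype=float)
y_miss = add_missing(y, rate=0.3)
y_imp = np.where(np.isnan(y_miss), np.nanmean(y_miss), y_miss)  # mean imputation
print("MSE on the missing entries:", mse_ignore_nan(y, y_imp, y_miss))
```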
My all-time favourite catch is “JERRY catching TOM!” Little Jerry is so smart, and do you know that he knows probability as well? One day, Jerry was thinking, “Hmm, every time Tom chases… The conditional probability of Tom finding Jerry
Principal Component Analysis (PCA) is a statistical technique used for dimensionality reduction, which simplifies the complexity in high-dimensional data while retaining important information. The basic idea of this method is to transform a large set… Introduction to Principal Component Analysis (PCA) and implementation in R and Python
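A minimal sketch of PCA in Python (the Iris data and the two-component choice are illustrative assumptions):

```python
# Project standardized data onto its first two principal components.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = load_iris(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)  # PCA is scale-sensitive

pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_scaled)
print("explained variance ratio:", pca.explained_variance_ratio_)
```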
Here, we denote by $\bar{A}$ the event NOT $A$. Example 1 (Magical Investment Returns): In the magical forest, gnomes invest in enchanted acorns, which sometimes turn into golden trees. A gnome named Glim invests in an… Bayes theorem in finance of a magical forest
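For reference, the rule the example applies is Bayes' theorem, with the complement $\bar{A}$ entering through the law of total probability:

```latex
P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}, \qquad
P(B) = P(B \mid A)\,P(A) + P(B \mid \bar{A})\,P(\bar{A})
```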