
Linear Discriminant Analysis Implementation in Python & R

Linear Discriminant Analysis (LDA) is a classifier that creates a linear decision boundary by fitting class-conditional densities to the data and applying Bayes’ rule. The model assumes that each class follows a Gaussian distribution with… 

Stepwise Feature Selection + example

Stepwise feature selection is a systematic approach to identifying the most relevant features for a predictive model by combining both forward and backward selection techniques. The process begins with either an empty model. Then, we… 

Backward feature selection + example

Backward feature selection iteratively removes the least significant feature from a model, judged by adjusted R-squared. In this example, which predicts the number of nuts collected by squirrels, temperature and rainfall are retained as significant predictors. The process aims to finalize a model with the most influential features.
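
The removal loop described above can be sketched as follows. This is a minimal illustration, not the post's own code: it assumes NumPy is available, reads "least significant" as "the feature whose removal raises adjusted R-squared the most", and uses synthetic data. The helper names (`adjusted_r2`, `backward_select`) and the `noise` column are hypothetical.

```python
import numpy as np

def adjusted_r2(X: np.ndarray, y: np.ndarray) -> float:
    """Adjusted R-squared of an OLS fit (with intercept) on the columns of X."""
    n, k = X.shape
    design = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = np.sum((y - design @ beta) ** 2)
    tss = np.sum((y - y.mean()) ** 2)
    return 1 - (rss / (n - k - 1)) / (tss / (n - 1))

def backward_select(X, y, names):
    """Drop one feature at a time while doing so raises adjusted R-squared."""
    kept = list(range(X.shape[1]))
    best = adjusted_r2(X[:, kept], y)
    while len(kept) > 1:
        # Score every model obtained by removing a single remaining feature.
        scores = {j: adjusted_r2(X[:, [c for c in kept if c != j]], y) for j in kept}
        drop = max(scores, key=scores.get)
        if scores[drop] <= best:
            break  # no removal improves the model, so stop
        kept.remove(drop)
        best = scores[drop]
    return [names[j] for j in kept], best

# Synthetic stand-in for the squirrel data: only the first two columns drive y.
rng = np.random.default_rng(0)
names = ["temperature", "rainfall", "noise"]
X = rng.normal(size=(200, 3))
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.5, size=200)

selected, score = backward_select(X, y, names)
print(selected, round(float(score), 3))
```

Whether the `noise` column is dropped depends on the random draw; the two informative columns should survive, because removing either one lowers adjusted R-squared sharply.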

Forward feature selection: a step by step example

Forward feature selection starts with an empty model and adds features one by one. At each step, the feature that improves the model performance the most is added to the model. The process continues until… 

ElasticNet Regression: Method & Codes

ElasticNet regression is a regularized regression method that linearly combines both L1 and L2 penalties of the Lasso and Ridge methods. This allows it to perform both feature selection (like Lasso) and maintain some of… 

Ridge regression: method & R codes

Motivation. Ridge regression adds a penalty, the sum of the squares of the coefficients, to the loss function in linear regression. Ridge regression shrinks the…

Lasso Regression and LassoCV: methods & Python codes

The Lasso (Least Absolute Shrinkage and Selection Operator) is a regression technique that enhances prediction accuracy and interpretability by applying L1 regularization to shrink coefficients. Unlike traditional regression methods, Lasso forces some coefficients to become… 

How to Write a Proof in a paper

Reviewers are not required to read supplementary materials, but many do. Therefore, making your proof easy to read is important. General Guidelines: Example: Theorem 1: Let and be real numbers. If , then . The… 

Example of using derivatives to find optimal drug dosage

Dosage Optimization: Pharmacologists use derivatives to find the optimal drug dosage that maximizes therapeutic effects while minimizing side effects. The concentration of the drug in the bloodstream is modeled as a function of time, and… 

Example of using derivatives to optimize the material cost

Optimization of Material Usage: Engineers use derivatives to minimize the cost of materials while maintaining structural integrity. For example, determining the optimal dimensions of a container to minimize surface area for a given volume. Example:… 
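
As a hedged illustration of the container problem mentioned above (not the post's own example), consider a closed cylinder of fixed volume V. Substituting h = V/(πr²) into the surface area gives S(r) = 2πr² + 2V/r, and setting S'(r) = 4πr − 2V/r² = 0 yields r = (V/(2π))^(1/3) with h = 2r. The cylinder shape, the 1000 cm³ volume, and the function names are assumptions chosen for this sketch.

```python
import math

def surface_area(r: float, volume: float) -> float:
    """Closed cylinder: S(r) = 2*pi*r^2 + 2*V/r after substituting h = V/(pi*r^2)."""
    return 2 * math.pi * r**2 + 2 * volume / r

def optimal_cylinder(volume: float) -> tuple[float, float]:
    """Solve S'(r) = 4*pi*r - 2*V/r^2 = 0 for r, then recover h from the volume."""
    r = (volume / (2 * math.pi)) ** (1 / 3)
    h = volume / (math.pi * r**2)
    return r, h

volume = 1000.0  # hypothetical volume in cm^3
r_opt, h_opt = optimal_cylinder(volume)
print(f"r = {r_opt:.3f} cm, h = {h_opt:.3f} cm, "
      f"S = {surface_area(r_opt, volume):.2f} cm^2")
```

The optimal height equals the diameter, which is a quick sanity check on the derivative calculation.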

Combining datasets to increase sample size

Detailed information can be found in Combining datasets to improve model fitting or its presentation slide. Summary: The key points of the paper titled “Combining Datasets to Improve Model Fitting” are as follows: Problem and… 

Calculus can hurt

Supplementary contents for section “Calculus can hurt” in the KSML app: Limit, Derivatives, Integral

Expectation Maximization (EM) & implementation

Expectation Maximization (EM) is an iterative algorithm used for finding maximum likelihood estimates of parameters in statistical models, particularly when the model involves latent variables (variables that are not directly observed). The algorithm is commonly… 

A comic guide to denoising noisy data

Handling noisy data is a crucial step in data preprocessing and analysis. In general, here are some common approaches to manage noisy data: 1. Data Cleaning 2. Data Transformation 3. Statistical Techniques 4. Machine Learning… 

A comical guide to Missing Not At Random (MNAR)

Recall that Missing Not At Random (MNAR) is a type of missing data mechanism where the probability of missingness is related to the unobserved data itself. Here are some more examples of MNAR: In each… 

What’s Missing at Random (MAR)?

Missing at Random (MAR) is a statistical term indicating that the likelihood of data being missing is related to some of the observed data but not to the missing data itself. This means that the… 

Tips & Tricks for research newbies

Coding and managing projects: Using AI to better code, debug and manage projects; How to export an R dataframe to LaTeX. Paper writing: Analyze experiment results faster with ChatGPT; How to write in LaTeX faster…

Check Plagiarism with Grammarly

Plagiarism is the act of using someone else’s work, ideas, words, or intellectual property without proper acknowledgment or permission, and presenting it as your own. Even reusing your own previously published work or parts of… 

Using AI to better code, debug and manage projects

1. Code Completion and Suggestions AI-powered code completion tools can predict and suggest the next lines of code based on the context of your current coding. Examples include: Copilot: Uses OpenAI’s Codex to provide code… 

The success rates of Cupid’s arrows

I advised a master’s student to use the binomial probability formula to determine the likelihood of attracting the affection of 15 girls, with Cupid’s success rate at 0.7. With 15 independent attempts and a success rate of 0.7, the most likely outcome is that 11 girls reciprocate love, with a probability of about 0.22.
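
The binomial calculation can be checked with a short sketch. It assumes the setup reads as 15 independent attempts, each succeeding with probability 0.7; the function name `binomial_pmf` is chosen here for illustration and is not from the post.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 15, 0.7  # values quoted in the summary above
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

# The most likely number of successes is the k with the largest PMF value.
most_likely = max(range(n + 1), key=lambda k: pmf[k])

for k, prob in enumerate(pmf):
    print(f"{k:2d} successes: {prob:.4f}")
print("most likely count:", most_likely)
```

Tabulating the whole distribution, rather than a single value, makes it easy to read off the peak and compare neighbouring outcomes.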

Grazing the maze of probability

Supplementary materials for the sections “Grazing the maze of probability” and “A random variable mood” in the KSML app: Basic rules of probability: Mutually exclusive events. Conditional probability for medical testing in a forest: The conditional probability…

Tricks for remembering elementary matrix operations

Connecting matrices to systems of linear equations can indeed help in better understanding and remembering the properties of matrices. By visualizing how matrix operations correspond to operations on systems of equations, abstract matrix properties become… 
