Forward feature selection starts with an empty model and adds features one by one. At each step, the feature that improves the model performance the most is added to the model. The process continues until adding new features does not improve the model significantly.
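To make the loop concrete, here is a minimal sketch of the procedure in Python. It is only an illustration: the function name `forward_select`, the `score` callback, and the `min_gain` threshold (our stand-in for "improves the model significantly") are assumptions made for this example and do not come from any library.

```python
from typing import Callable, List, Sequence, Tuple


def forward_select(
    features: Sequence[str],
    score: Callable[[Tuple[str, ...]], float],
    min_gain: float = 0.01,
) -> List[str]:
    """Greedy forward selection.

    `score` maps a tuple of feature names to a quality number where higher
    is better (for example adjusted R-squared). `min_gain` is a hypothetical
    cutoff for what counts as a worthwhile improvement.
    """
    selected: List[str] = []
    remaining = list(features)
    best_score = float("-inf")  # the empty model has no score yet

    while remaining:
        # Score every candidate model: the current features plus one new one.
        trial_scores = {f: score(tuple(selected + [f])) for f in remaining}
        best_feature = max(trial_scores, key=trial_scores.get)
        gain = trial_scores[best_feature] - best_score

        if gain < min_gain:
            break  # no remaining feature helps enough, so we stop

        selected.append(best_feature)
        remaining.remove(best_feature)
        best_score = trial_scores[best_feature]

    return selected
```

Passing the scoring rule in as a function keeps the loop independent of the model: any "higher is better" metric can be plugged in, and the first feature is always accepted because the empty model starts at negative infinity.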
Now, let’s step into an intuitive example of forward feature selection using a dataset where we aim to predict the weight of fish based on four potential input features: temperature, food, water cleanliness, and wind.
Step-by-Step Example of Forward Feature Selection using Adjusted R-Squared
Step 1: Start with No Features: We begin with an empty model and no features.
Step 2: Evaluate Each Feature Individually: We fit a separate simple linear regression model for each feature and evaluate its performance using adjusted R-squared. Let’s assume we have the following performance results:
- Temperature: Adjusted
- Food: Adjusted
- Water Cleanliness: Adjusted
- Wind: Adjusted
Since “Food” has the highest adjusted R-squared value, it is the strongest single predictor. We add “Food” to our model.
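Step 2 can be sketched in a few lines of plain Python. Everything about the data here is an assumption: the eight observations are made up purely for illustration (each feature is coded +1 for "high" and -1 for "low"), and they are not the numbers behind the results above. The helper names `adjusted_r2` and `simple_regression_r2` are also ours.

```python
from statistics import mean


def adjusted_r2(r2, n, p):
    """Adjusted R-squared for a model with p features fit on n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


def simple_regression_r2(x, y):
    """R-squared of the least-squares line y = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy ** 2 / (sxx * syy)  # squared correlation between x and y


# Made-up illustrative data: +1 = high, -1 = low for each feature.
features = {
    "food":              [1, 1, 1, 1, -1, -1, -1, -1],
    "temperature":       [1, 1, -1, -1, 1, 1, -1, -1],
    "water_cleanliness": [1, -1, 1, -1, 1, -1, 1, -1],
    "wind":              [1, -1, -1, 1, 1, -1, -1, 1],
}
weight = [16.5, 14.5, 11.5, 9.5, 9.5, 7.5, 6.5, 4.5]  # made-up fish weights

n = len(weight)
scores = {
    name: adjusted_r2(simple_regression_r2(x, weight), n, p=1)
    for name, x in features.items()
}
best = max(scores, key=scores.get)

for name, value in scores.items():
    print(f"{name}: adjusted R^2 = {value:.3f}")
print("best single feature:", best)
```

With this toy data, food comes out on top as well, and wind ends up with a negative adjusted R-squared, which is what the penalty for a useless feature looks like.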
Step 3: Add the Best Feature: Now our model includes the feature “Food”: $latex \text{Weight} = \beta_0 + \beta_1 \cdot \text{Food}$
Step 4: Evaluate Adding Each Remaining Feature: Next, we consider adding each of the remaining features to the current model one by one:
Adding Temperature gives and a combined Adjusted
Adding Water Cleanliness gives and combined Adjusted
Adding Wind gives and a combined Adjusted $latex R^2 = 0.60$
“Temperature” adds the most value to our model when combined with “Food” (highest increase in adjusted R-squared), so we add “Temperature” to the model.
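Comparing two-feature models needs a multiple regression fit. The sketch below does this with the normal equations in plain Python so it runs without extra packages; in practice a library routine would be the usual choice. The helpers `solve` and `adjusted_r2_of` are hypothetical names, the data is made up for illustration (features coded +1 for high, -1 for low), and the code assumes no feature is an exact combination of the others.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def adjusted_r2_of(columns, y):
    """Fit y = b0 + b1*x1 + ... by least squares and return adjusted R-squared."""
    n, p = len(y), len(columns)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]  # intercept first
    k = p + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)  # normal equations: (X^T X) beta = X^T y
    pred = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    y_bar = sum(y) / n
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# Made-up illustrative data: +1 = high, -1 = low for each feature.
features = {
    "food":              [1, 1, 1, 1, -1, -1, -1, -1],
    "temperature":       [1, 1, -1, -1, 1, 1, -1, -1],
    "water_cleanliness": [1, -1, 1, -1, 1, -1, 1, -1],
    "wind":              [1, -1, -1, 1, 1, -1, -1, 1],
}
weight = [16.5, 14.5, 11.5, 9.5, 9.5, 7.5, 6.5, 4.5]  # made-up fish weights

current = ["food"]  # the feature chosen in the previous step
candidates = [name for name in features if name not in current]
scores = {
    name: adjusted_r2_of([features[f] for f in current + [name]], weight)
    for name in candidates
}
best = max(scores, key=scores.get)
print(scores)
print("best feature to add:", best)
```

On this toy data the winner is again temperature. The same call with `current = ["food", "temperature"]` would reproduce the next round of comparisons.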
Step 5: Add the Best Feature: Now our model includes “Food” and “Temperature”: $latex \text{Weight} = \beta_0 + \beta_1 \cdot \text{Food} + \beta_2 \cdot \text{Temperature}$
Step 6: Evaluate Adding Each Remaining Feature: Next, we consider adding each of the remaining features to the current model:
Adding Water Cleanliness gives and a combined Adjusted
Adding Wind gives and a combined Adjusted
“Water Cleanliness” adds the most value to our model, so we add “Water Cleanliness” to the model.
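Step 7: Add the Best Feature: Now our model includes “Food”, “Temperature”, and “Water Cleanliness”: $latex \text{Weight} = \beta_0 + \beta_1 \cdot \text{Food} + \beta_2 \cdot \text{Temperature} + \beta_3 \cdot \text{Water Cleanliness}$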
Step 8: Evaluate Adding the Last Remaining Feature: Finally, we consider adding “Wind” to the model: gives a combined Adjusted
Adding “Wind” does not significantly improve the adjusted R-squared, so we stop here.
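Why can an extra feature fail to help even though plain R-squared never goes down when a feature is added? The standard definition of adjusted R-squared makes this visible. In the sketch below, the symbols are our notation rather than the post's: n is the number of observations, p the number of features already in the model, and R²ₚ the ordinary R-squared of that model. It assumes n > p + 2 and R²ₚ < 1.

```latex
% Adjusted R-squared of a model with p features
\bar{R}^2_p = 1 - \left(1 - R^2_p\right)\frac{n - 1}{n - p - 1}

% Adding one more feature raises the adjusted value only if
\bar{R}^2_{p+1} > \bar{R}^2_p
\iff
\frac{1 - R^2_{p+1}}{1 - R^2_p} < \frac{n - p - 2}{n - p - 1}
```

In words, the new feature has to shrink the unexplained share of the variance by more than the factor on the right. For a hypothetical n = 30 and p = 3, that factor is 25/26, so a fourth feature would need to remove more than about 3.8% of the remaining unexplained variance to raise adjusted R-squared at all.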
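So, the final model includes the features “Food”, “Temperature”, and “Water Cleanliness”: $latex \text{Weight} = \beta_0 + \beta_1 \cdot \text{Food} + \beta_2 \cdot \text{Temperature} + \beta_3 \cdot \text{Water Cleanliness}$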