This guide covers:
- Implementing word embeddings in PyTorch
- Training word embeddings
- Saving the trained embeddings
- Loading the saved embeddings for reuse
1. Implementing Word Embeddings in PyTorch
PyTorch provides nn.Embedding, a lookup table that maps word indices to dense vectors, for creating word embeddings.
import torch
import torch.nn as nn
# Define the vocabulary size and embedding dimension
vocab_size = 10 # Example vocabulary size
embedding_dim = 5 # Dimension of word vectors
# Create an embedding layer
embedding_layer = nn.Embedding(num_embeddings=vocab_size, embedding_dim=embedding_dim)
# Example input (word indices)
word_indices = torch.tensor([1, 3, 5, 7]) # Example words
# Get embeddings for input words
word_embeddings = embedding_layer(word_indices)
print(word_embeddings)
- nn.Embedding(num_embeddings, embedding_dim): creates an embedding matrix of size [vocab_size, embedding_dim].
- word_indices: index values for the input words.
- The output will be a tensor of shape [num_words, embedding_dim].
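As a quick shape check (a minimal sketch; the batch of indices below is only an illustration and not part of the original example), passing a 2-D tensor of indices returns a 3-D tensor of embeddings:
# Quick shape check with a batch of index sequences (illustrative values)
batch_indices = torch.tensor([[1, 3, 5], [2, 4, 6]])  # shape [2, 3]
batch_embeddings = embedding_layer(batch_indices)
print(batch_embeddings.shape)  # torch.Size([2, 3, 5]) -> [batch, seq_len, embedding_dim]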
2. Training Word Embeddings in PyTorch
Training embeddings typically involves a shallow neural network objective such as Skip-gram (predict context words from a center word) or CBOW (predict a center word from its context).
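For context, here is a minimal sketch of how Skip-gram (center, context) pairs could be generated from a tokenized sentence with a fixed window size. The toy sentence, the window size, and the word_to_idx mapping are assumptions for illustration; they are not part of the original example.
# Hypothetical helper: build Skip-gram (center, context) index pairs from tokens
tokens = ["the", "cat", "sat", "on", "the", "mat"]  # toy corpus (assumption)
word_to_idx = {w: i for i, w in enumerate(dict.fromkeys(tokens))}
window = 2  # context window size (assumption)

pairs = []
for i, center in enumerate(tokens):
    for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
        if j != i:
            pairs.append((word_to_idx[center], word_to_idx[tokens[j]]))

print(pairs[:5])  # [(0, 1), (0, 2), (1, 0), (1, 2), (1, 3)]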
Dataset Preparation
import torch
import torch.nn as nn
import torch.optim as optim
# Sample dataset (word pairs)
data = [(0, 1), (1, 2), (2, 3), (3, 4)] # (center word, context word)
vocab_size = 5 # Number of unique words
embedding_dim = 3
# Create model
class WordEmbeddingModel(nn.Module):
    def __init__(self, vocab_size, embedding_dim):
        super(WordEmbeddingModel, self).__init__()
        self.embeddings = nn.Embedding(vocab_size, embedding_dim)
        self.linear = nn.Linear(embedding_dim, vocab_size)

    def forward(self, center_word):
        embed = self.embeddings(center_word)
        output = self.linear(embed)
        return output

# Initialize model
model = WordEmbeddingModel(vocab_size, embedding_dim)
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

# Training loop
for epoch in range(100):
    total_loss = 0
    for center, context in data:
        center_tensor = torch.tensor([center], dtype=torch.long)
        target_tensor = torch.tensor([context], dtype=torch.long)

        optimizer.zero_grad()
        output = model(center_tensor)
        loss = criterion(output, target_tensor)
        loss.backward()
        optimizer.step()
        total_loss += loss.item()

    if epoch % 20 == 0:
        print(f"Epoch {epoch}, Loss: {total_loss}")
- The model learns embeddings by predicting the context word given a center word.
- nn.CrossEntropyLoss() is used as the training objective.
- The SGD optimizer updates the embedding weights based on the loss.
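After training, the learned vectors live in model.embeddings.weight. As a rough sanity check (a sketch only; with just four training pairs the similarities will be noisy), you could compare two word vectors with cosine similarity:
import torch.nn.functional as F

# Inspect the learned embedding matrix: shape [vocab_size, embedding_dim]
learned = model.embeddings.weight.data
print(learned.shape)

# Cosine similarity between the vectors for word 1 and word 2 (illustrative only)
similarity = F.cosine_similarity(learned[1].unsqueeze(0), learned[2].unsqueeze(0))
print(similarity.item())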
3. Saving the Trained Embedding Model
Once trained, we save the embedding layer.
# Save only the embedding layer weights
torch.save(model.embeddings.state_dict(), "word_embeddings.pth")
# Save the entire model (optional)
torch.save(model.state_dict(), "embedding_model.pth")
- torch.save(model.embeddings.state_dict(), "word_embeddings.pth") saves just the embedding weights.
- torch.save(model.state_dict(), "embedding_model.pth") saves the weights of the full model (embedding layer plus the linear layer).
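If the embeddings are meant to be reused outside PyTorch, the raw weight matrix can also be exported. This is a sketch: the NumPy export and the word_to_idx vocabulary file are assumptions for illustration, since the original snippet does not define a vocabulary mapping.
import numpy as np

# Export the raw [vocab_size, embedding_dim] matrix (optional, for non-PyTorch tools)
weights = model.embeddings.weight.detach().cpu().numpy()
np.save("word_embeddings.npy", weights)

# If you keep a word-to-index mapping (hypothetical here), save it alongside the weights
# import json
# with open("word_to_idx.json", "w") as f:
#     json.dump(word_to_idx, f)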
4. Loading the Saved Embeddings
To reuse the trained embeddings:
# Load model structure
loaded_model = WordEmbeddingModel(vocab_size, embedding_dim)
# Load saved embeddings
loaded_model.embeddings.load_state_dict(torch.load("word_embeddings.pth"))
# Example usage
word_idx = torch.tensor([2]) # Example word index
print(loaded_model.embeddings(word_idx))
- WordEmbeddingModel(vocab_size, embedding_dim): recreates the model structure.
- load_state_dict(torch.load("word_embeddings.pth")): loads the saved weights into the embedding layer.
- The embeddings can now be used for further training or inference.
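If you want to plug the saved weights into a new model without further updates, nn.Embedding.from_pretrained accepts a weight tensor directly and can freeze it. This is a sketch building on the snippet above, not part of the original guide:
# Build an embedding layer directly from the loaded weights and freeze it
pretrained_weights = loaded_model.embeddings.weight.detach()
frozen_embedding = nn.Embedding.from_pretrained(pretrained_weights, freeze=True)

# Frozen embeddings still produce vectors but receive no gradient updates
print(frozen_embedding(torch.tensor([2])))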