
Learn the core principles and techniques of language models while building a small Bangla language model that generates financial articles. I teach this course in person at the Institute of Information Technology, University of Dhaka. Below are some of the modules covered in the course. To enroll in the latest offering, contact BARTA Lab (barta-research-lab.github.io).
History of NLP • Transformer Architecture • Tokenization
Text Preprocessing • Underfitting, Overfitting & Just-Right Fitting • Tokenization Fundamentals • Bangla Tokenization Challenges (see the tokenization sketch below)
Token and Positional Embeddings • Self-Attention with Causal Masking • Multi-Head Attention • Cross-Attention (see the attention sketch below)
Feed-Forward Networks • Multi-Layer Perceptrons (MLPs) • Forward Pass • Exploding & Vanishing Gradients (see the gradient sketch below)
AdamW Optimizer Configuration • Learning Rate Scheduling • Gradient Clipping • Training Loop Implementation • Autoregressive Decoding • Temperature Tuning (see the training-loop sketch below)
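
A few minimal sketches of these topics follow. They are illustrative only, not the course materials: they assume Python with PyTorch, and every name, size, and hyperparameter in them is a placeholder.

For the tokenization module, a toy look at one Bangla tokenization challenge: each Bengali-script code point takes three bytes in UTF-8, so byte-level vocabularies see far more symbols per Bangla word than per English word.

```python
# Toy illustration (not course code): byte-level tokenizers fragment Bangla
# words much more than English ones, because every Bengali-script code point
# occupies 3 bytes in UTF-8.
word_en = "bank"
word_bn = "ব্যাংক"  # "bank" in Bangla

print(len(word_en), len(word_en.encode("utf-8")))  # 4 code points, 4 bytes
print(len(word_bn), len(word_bn.encode("utf-8")))  # 6 code points, 18 bytes
```

Subword tokenizers trained on Bangla text itself typically split such words far less aggressively than an English-centric vocabulary does.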
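
For the embeddings-and-attention module, a minimal single-head causal self-attention layer; the class name and dimensions are made up for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    """Single-head self-attention with a causal mask (illustrative sizes)."""
    def __init__(self, d_model: int, max_len: int = 256):
        super().__init__()
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(d_model, d_model)
        # Lower-triangular mask: position t may only attend to positions <= t.
        self.register_buffer("mask", torch.tril(torch.ones(max_len, max_len)).bool())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, D = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        scores = q @ k.transpose(-2, -1) / D**0.5                   # (B, T, T)
        scores = scores.masked_fill(~self.mask[:T, :T], float("-inf"))
        weights = F.softmax(scores, dim=-1)
        return self.out(weights @ v)

x = torch.randn(2, 16, 64)                                          # (batch, tokens, d_model)
print(CausalSelfAttention(64)(x).shape)                             # torch.Size([2, 16, 64])
```

Multi-head attention runs several such maps in parallel on split channels; cross-attention uses the same computation but takes keys and values from a different sequence.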
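
For the feed-forward module, a small experiment (again only a sketch, with arbitrary depth and width) on why deep MLP stacks without residual connections suffer from vanishing or exploding gradients: backpropagation multiplies one Jacobian per layer, so gradient norms tend to shrink or grow roughly geometrically with depth.

```python
import torch
import torch.nn as nn

# Toy experiment: a deep tanh MLP with no residual connections. Depth and
# width are arbitrary; with default initialization the gradient reaching the
# first layer is usually orders of magnitude smaller than at the last layer.
depth, width = 30, 64
layers = []
for _ in range(depth):
    layers += [nn.Linear(width, width), nn.Tanh()]
mlp = nn.Sequential(*layers)

loss = mlp(torch.randn(8, width)).pow(2).mean()
loss.backward()

print(f"first-layer grad norm: {mlp[0].weight.grad.norm().item():.2e}")
print(f"last-layer grad norm:  {mlp[-2].weight.grad.norm().item():.2e}")
```

Residual connections and normalization inside the transformer block, together with the gradient clipping covered in the training module, are the standard countermeasures.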
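
Finally, for the training and decoding module, a compact training loop with AdamW, a cosine learning-rate schedule, gradient clipping, and temperature-controlled autoregressive sampling. The model and data are stand-ins (an embedding plus a linear head over random tokens); every hyperparameter shown is a placeholder, not a course-recommended value.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in "model" and random token data; in the course these would be the
# Bangla transformer and the financial-article corpus.
vocab_size, d_model, seq_len, steps = 1000, 64, 32, 100
model = nn.Sequential(nn.Embedding(vocab_size, d_model), nn.Linear(d_model, vocab_size))

optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4, betas=(0.9, 0.95), weight_decay=0.1)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=steps)

for step in range(steps):
    batch = torch.randint(0, vocab_size, (8, seq_len + 1))              # fake batch
    inputs, targets = batch[:, :-1], batch[:, 1:]                       # next-token targets
    loss = F.cross_entropy(model(inputs).reshape(-1, vocab_size), targets.reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)    # gradient clipping
    optimizer.step()
    scheduler.step()                                                     # learning-rate schedule

@torch.no_grad()
def generate(model, prompt, n_new, temperature=0.8):
    """Autoregressive decoding: sample one token at a time from softmax(logits / T)."""
    tokens = prompt
    for _ in range(n_new):
        logits = model(tokens)[:, -1, :] / temperature
        next_token = torch.multinomial(F.softmax(logits, dim=-1), num_samples=1)
        tokens = torch.cat([tokens, next_token], dim=1)
    return tokens

print(generate(model, torch.randint(0, vocab_size, (1, 4)), n_new=8).shape)  # (1, 12)
```

Lower temperature values concentrate probability on the most likely next tokens and give more conservative text; higher values give more varied but less coherent output.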