
Learn the core principles and techniques of language models while building a small Bangla language model that generates financial articles. I teach this course in person at the Institute of Information Technology, University of Dhaka. Below are some of the modules covered in the course. To enroll in the latest offering, contact BARTA Lab (barta-research-lab.github.io).
History of NLP • Transformer Architecture • Tokenization
Text Preprocessing • Underfitting, Overfitting & Just-Right Fitting • Tokenization Fundamentals • Bangla Tokenization Challenges (see the tokenization sketch below)
Token and Positional Embeddings • Self-Attention with Causal Masking • Multi-Head Attention • Cross-Attention (see the attention sketch below)
Feed-Forward Networks • Multi-Layer Perceptrons (MLPs) • Forward Pass • Exploding & Vanishing Gradients (see the gradient sketch below)
AdamW Optimizer Configuration • Learning Rate Scheduling • Gradient Clipping • Training Loop Implementation • Autoregressive Decoding • Temperature Tuning (see the training-loop sketch below)
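
A few minimal sketches of these topics follow. They are illustrative only, not the course materials: they assume Python with PyTorch, and every name, size, and hyperparameter in them is a placeholder.

For the tokenization module, a toy look at one Bangla tokenization challenge: each Bengali-script code point takes three bytes in UTF-8, so byte-level vocabularies see far more symbols per Bangla word than per English word.

```python
# Toy illustration (not course code): byte-level tokenizers fragment Bangla
# words much more than English ones, because every Bengali-script code point
# occupies 3 bytes in UTF-8.
word_en = "bank"
word_bn = "ব্যাংক"  # "bank" in Bangla

print(len(word_en), len(word_en.encode("utf-8")))  # 4 code points, 4 bytes
print(len(word_bn), len(word_bn.encode("utf-8")))  # 6 code points, 18 bytes
```

Subword tokenizers trained on Bangla text itself typically split such words far less aggressively than an English-centric vocabulary does.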
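
For the embeddings-and-attention module, a minimal single-head causal self-attention layer; the class name and dimensions are made up for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    """Single-head self-attention with a causal mask (illustrative sizes)."""
    def __init__(self, d_model: int, max_len: int = 256):
        super().__init__()
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(d_model, d_model)
        # Lower-triangular mask: position t may only attend to positions <= t.
        self.register_buffer("mask", torch.tril(torch.ones(max_len, max_len)).bool())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, D = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        scores = q @ k.transpose(-2, -1) / D**0.5                   # (B, T, T)
        scores = scores.masked_fill(~self.mask[:T, :T], float("-inf"))
        weights = F.softmax(scores, dim=-1)
        return self.out(weights @ v)

x = torch.randn(2, 16, 64)                                          # (batch, tokens, d_model)
print(CausalSelfAttention(64)(x).shape)                             # torch.Size([2, 16, 64])
```

Multi-head attention runs several such maps in parallel on split channels; cross-attention uses the same computation but takes keys and values from a different sequence.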
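
For the feed-forward module, a small experiment (again only a sketch, with arbitrary depth and width) on why deep MLP stacks without residual connections suffer from vanishing or exploding gradients: backpropagation multiplies one Jacobian per layer, so gradient norms tend to shrink or grow roughly geometrically with depth.

```python
import torch
import torch.nn as nn

# Toy experiment: a deep tanh MLP with no residual connections. Depth and
# width are arbitrary; with default initialization the gradient reaching the
# first layer is usually orders of magnitude smaller than at the last layer.
depth, width = 30, 64
layers = []
for _ in range(depth):
    layers += [nn.Linear(width, width), nn.Tanh()]
mlp = nn.Sequential(*layers)

loss = mlp(torch.randn(8, width)).pow(2).mean()
loss.backward()

print(f"first-layer grad norm: {mlp[0].weight.grad.norm().item():.2e}")
print(f"last-layer grad norm:  {mlp[-2].weight.grad.norm().item():.2e}")
```

Residual connections and normalization inside the transformer block, together with the gradient clipping covered in the training module, are the standard countermeasures.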
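
Finally, for the training and decoding module, a compact training loop with AdamW, a cosine learning-rate schedule, gradient clipping, and temperature-controlled autoregressive sampling. The model and data are stand-ins (an embedding plus a linear head over random tokens); every hyperparameter shown is a placeholder, not a course-recommended value.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in "model" and random token data; in the course these would be the
# Bangla transformer and the financial-article corpus.
vocab_size, d_model, seq_len, steps = 1000, 64, 32, 100
model = nn.Sequential(nn.Embedding(vocab_size, d_model), nn.Linear(d_model, vocab_size))

optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4, betas=(0.9, 0.95), weight_decay=0.1)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=steps)

for step in range(steps):
    batch = torch.randint(0, vocab_size, (8, seq_len + 1))              # fake batch
    inputs, targets = batch[:, :-1], batch[:, 1:]                       # next-token targets
    loss = F.cross_entropy(model(inputs).reshape(-1, vocab_size), targets.reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)    # gradient clipping
    optimizer.step()
    scheduler.step()                                                     # learning-rate schedule

@torch.no_grad()
def generate(model, prompt, n_new, temperature=0.8):
    """Autoregressive decoding: sample one token at a time from softmax(logits / T)."""
    tokens = prompt
    for _ in range(n_new):
        logits = model(tokens)[:, -1, :] / temperature
        next_token = torch.multinomial(F.softmax(logits, dim=-1), num_samples=1)
        tokens = torch.cat([tokens, next_token], dim=1)
    return tokens

print(generate(model, torch.randint(0, vocab_size, (1, 4)), n_new=8).shape)  # (1, 12)
```

Lower temperature values concentrate probability on the most likely next tokens and give more conservative text; higher values give more varied but less coherent output.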