Hi, I'm Lia

My work lies in natural language processing, human-centered applications, and secure decentralized systems, with experience in large-scale software and LLM development.

Currently, I am a Research Intern at Aramco-Ithra, collaborating with global institutions including WHO, UN, Stony Brook Medicine, University of Washington, University of Geneva, Western University, University of Tokyo and research institutes from 35 countries. Previously, I worked with the United States Department of Justice - ICITAP, where I designed a platform for secure crowdsourced wildlife crime reporting in low-connectivity areas, leveraging custom NLP pipelines and geospatial and predictive models to analyze environmental and crime data. There, I also worked on a gaming application to educate international youth about wildlife crime and biodiversity conservation.

I am currently a final-year software engineering undergraduate at the University of Dhaka, where I work in BARTA Lab. There, I focus on low-resource and small-language-model development, designing datasets, techniques, and educational resources. I also serve as an instructor at BARTA, where I teach a language model building course. At BanglaLLM, I work with amazing researchers and developers building open-source language models for the low-resource Bangla language. This year, I am also serving as an instructor for the International AI Olympiad, teaching AI Recommender Systems.

Entrepreneurially, I am a founding researcher of Perspectivity - Drishtikon, the first real-time AI news aggregator for Bangla, featuring multi-axis bias detection, news summarization, and interactive bots that empower citizens with nuanced, research-backed insights.

And... I paint. Some like to call me an artist, but painting is just how I express myself.

Get in Touch

Courses & Teaching

Building Small Language Model: From Foundations to Bangla Financial Text Generation

Learn the core principles and techniques of language models while building a small Bangla language model that generates financial articles. I teach this course in person at the Institute of Information Technology, University of Dhaka. Below are some of the modules covered in the course. To enroll in the latest offering of the course, contact BARTA Lab (barta-research-lab.github.io).

Module 1: Introduction to Language Models

History of NLP • Transformer Architecture • Tokenization

Module Slides:

Module 1: Introduction to Language Models

Module 2: Data Preparation Pipeline

Text Preprocessing • Underfitting, Overfitting & Just-Right Fitting • Tokenization Fundamentals • Bangla Tokenization Challenges

Module Slides:

Module 2: Data Preparation Pipeline

Module 3: Transformer Architecture

Token and positional embeddings • Self-attention mechanism with causal masking • Multi-Head Attention • Cross-Attention

Module Slides:

Module 3: Transformer Architecture

Module 4: Model Components

Feed-Forward Networks • Multi-Layer Perceptrons (MLPs) • Forward Pass • Gradient Explosion / Vanishing

Module Slides:

Module 4: Model Components

Module 5: Training, Evaluation, Generation

AdamW Optimizer Configuration • Learning Rate Scheduling • Gradient Clipping • Training Loop Implementation • Autoregressive Decoding • Temperature Tuning

Module Slides:

Module 5: Training, Evaluation, Generation

Research

Read Between the Lines: A Benchmark for Uncovering Political Bias in Bangla News Articles

Nusrat Jahan Lia; Shubhashis Roy Dipta, PhD; Dr. Abdullah Khan Zehady; Naymul Islam; Madhusodan Chakraborty; Abdullah Al Wasif

Associated Organizations:

University of Dhaka

University of Maryland

Perspectivity

Accepted: AACL IJCNLP BLP; Published: ACL Anthology, 2025

Exploring Cross-Lingual Knowledge Transfer via Transliteration-Based MLM Fine-Tuning for Critically Low-resource Chakma Language

Adity Khisa; Nusrat Jahan Lia; Tasnim Mahfuz Nafis; Zarif Masud; Tanzir Pial, PhD; Dr. Shebuti Rayana; Dr. Ahmedul Kabir

Associated Organizations:

University of Dhaka

BARTA

State University of New York, Old Westbury

Stony Brook University

Toronto Metropolitan University

Accepted: AACL IJCNLP BLP; Published: ACL Anthology, 2025

Auditing Reciprocal Sentiment Alignment: Inversion Risk, Dialect Representation and Intent Misalignment in Transformers

Nusrat Jahan Lia; Shubhashis Roy Dipta

Associated Organizations:

University of Dhaka

University of Maryland

Submitted: BiAlign 2026 CHI workshop, 2026

Adult Attitudes about School Smartphone Bans: A Global Survey of 35 Countries

Dimitri A. Christakis, MD, MPH; Nusrat Jahan Lia; Lauren Hale, PhD; Md Mamunur Rashid

Associated Organizations:

Renaissance School of Medicine, Stony Brook University

Seattle Children's Research Institute, Seattle Children's Hospital

University of Dhaka

ITHRA, King Abdulaziz Center for World Culture, Dammam

Accepted, Published: The Journal of the American Medical Association; doi: 10.1001/jamapediatrics.2025.5736, 2025

Does Gaming Disorder Symptom Status Predict Poorer Sleep Quality?

Nusrat Jahan Lia; Lauren Hale, PhD; Justin Thomas, PhD; Dimitri A. Christakis, MD, MPH; Mamunar Rashid, PhD

Associated Organizations:

University of Dhaka

Renaissance School of Medicine, Stony Brook University

Seattle Children's Research Institute, Seattle Children's Hospital

ITHRA, King Abdulaziz Center for World Culture, Dammam

Accepted: World Sleep 2025, Singapore, 2025

Does Spending "Too Much Time Online" Predict Sleep Health and Mental Health?

Lauren Hale, PhD; Nusrat Jahan Lia; Sohailul Islam Alvi; Gina Marie Mathew, PhD; Dimitri A. Christakis, MD, MPH; Mamunar Rashid, PhD; Yasmin Aljedawi; Melisa Valle, PhD

Associated Organizations:

Renaissance School of Medicine, Stony Brook University

Seattle Children's Research Institute, Seattle Children's Hospital

University of Dhaka

ITHRA, King Abdulaziz Center for World Culture, Dammam

Accepted: Associated Professional Sleep Societies, Seattle, Washington, USA, 2025

International Public Opinion on Digital Media Use for Youth and Schools

Lauren Hale, PhD; Nusrat Jahan Lia; Sohailul Islam Alvi; Gina Marie Mathew, PhD; Dimitri A. Christakis, MD, MPH; Mamunar Rashid, PhD; Yasmin Aljedawi; Melisa Valle, PhD

Associated Organizations:

Renaissance School of Medicine, Stony Brook University

Seattle Children's Research Institute, Seattle Children's Hospital

University of Dhaka

ITHRA, King Abdulaziz Center for World Culture, Dammam

Accepted: Digital Media and Developing Minds International Scientific Congress, Washington DC, 2025

Evaluating the inclusivity and accessibility of educational apps (games) on the Google Play Store.

Nusrat Jahan Lia; Nahida Sultana; Sabrina Shajin Alam, PhD; Mamunar Rashid, PhD; Aymaan Islam

Associated Organizations:

University of Dhaka

Western University, CA

ITHRA, King Abdulaziz Center for World Culture, Dammam

Reviewed; In Progress: American Educational Research Association (AERA), 2026

A Comprehensive Evaluation of the Educational Apps in the Google Play Store: An Exploratory Study

Nahida Sultana; Sabrina Shajin Alam, PhD; Nusrat Jahan Lia; Mamunar Rashid, PhD; Aymaan Islam

Associated Organizations:

University of Dhaka

Western University, CA

ITHRA, King Abdulaziz Center for World Culture, Dammam

Reviewed; In Progress: American Educational Research Association (AERA), 2026

Work Experience

Academic

Current

Research Intern

Aramco-Ithra

Mar 2025 - Present

Led projects in collaboration with WHO, Stony Brook Medicine, McGill University, University of Geneva, University of Tokyo and other institutions from around 35 countries.
Engineered Knowledge Graphs integrating worldwide data on Digital Health and Technology Usage to enable semantic analysis and cross-country insights.
Developed coding schemes and agentic LLMs to evaluate educational games on the Google Play Store.
Co-authored 6 research papers on digital well-being, education and technology usage.

Academic

Current

Instructor: AI Recommender Systems

International AI Olympiad

Sep 2025 - Present

Teaching: Covers collaborative and content-based recommendation systems, including similarity metrics, feature engineering, hybrid methods, matrix factorization, and deep learning approaches for personalized recommendations.

Government-Affiliated

WPA Software Engineer Intern

United States Department of Justice - ICITAP

Jul 2024 - Dec 2024

Designed a mobile-first, crowdsourced wildlife crime reporting platform tailored for rural and low-connectivity environments in the Sundarbans.
Handled sparse and noisy community reports by developing custom NLP pipelines and geospatial models optimized for low-resource inputs.
Leveraged machine learning to analyze spatial crime data and forecast environmental degradation hotspots.
Designed and developed a gaming application to educate Bangladeshi and international youth about wildlife and biodiversity conservation and to emphasize long-term stewardship ethics.

Academic

Current

Undergrad Researcher

Bangla Artificial Intelligence Research, Tools and Application (BARTA)

Oct 2024 - Present

Co-authored the first paper on Chakma-language knowledge transfer using MLM. Developing datasets and techniques for models of indigenous languages such as Chakma.
Designed and directed the small-language-model building course as an instructor at BARTA.
Developed an Educational Resource Allocation AI agent for the Government of Bangladesh.

Entrepreneurial

Current

Founding Researcher

Perspectivity - Drishtikon

Sep 2024 - Present

The first news aggregation AI agent for Bangla news, with plans to expand to other low-resource languages.
Provides research-backed multi-axis bias analysis to help citizens make informed decisions.
Includes a built-in news-summarizer agent and an interactive chatbot for exploring news in detail.
Shows local and international news trends in real time.

Academic

Current

Member

BanglaLLM

Jul 2024 - Present

BanglaLLM introduced many of the first open-source Bangla language models.

Industrial

Contractual LLM/AI Engineer

Global MicroLearning Solutions

Aug 2025 - Oct 2025

Designed and deployed large-scale AI agents for field support and turbine-engineering solutions.

Blogs

Just my random thoughts, opinions, curiosities, and questions. My blogs are pretty conversational... I write just how I would speak. To me, it's a rubber-ducking session.

Thinking aloud Federated Code Intelligence: Privacy, Retrieval, and Knowledge Asymmetry

What if our most fundamental assumption about environmental economics, that innovation leads to sustainability, is fundamentally flawed?

A data-driven analysis of development pathways for nations.

Apr 22, 2025

Can Diffusion Models Reshape Privacy Boundaries?

Exploring How Diffusion Models Challenge and Redefine Privacy in AI-Generated Data

Mar 12, 2025

Life Events

A limited subset of spatiotemporal phenomena.

Teaching foundational Bangla Natural Language Processing as an instructor of BARTA at the Institute of Information Technology, University of Dhaka

Nov 13, 2025

Employment award from the US Department of Justice: For technological innovations in the field of conservation and investigation

May 22, 2025

Gold Winner in the National UIU CSE Fest Blockchain Olympiad

Jan 18, 2025

National Winner and Global Finalist in the IEEE IES Generative AI Challenge Hackathon 2025

Feb 18, 2025

Featured by the US Embassy: For contributions to the US Department of Justice's Tech-in-conservation Initiative

Oct 16, 2024

Organized Shorone Deyal: A gathering of 50+ young volunteers to engage in a clean-and-colour event

Aug 12, 2024

IUT National ICT Fest 2024: Third Runner Up

Apr 27, 2024

Organized CARAVAN-OF-BLESSING: Launched during the sudden price hike to help underprivileged families meet daily needs

Mar 30, 2024

Organized ITverse-2023: One of Bangladesh's Largest Tech Events

Nov 7, 2023

Organized and Hosted FlutterFrenzy: The first developers conference in Bangladesh sponsored by Google and Flutter

May 20, 2023

SUST SWE Technovent 2023: Hackathon Finalist

Jan 27, 2023

Organized Hygieia-হাইজিয়া: A hygiene awareness and free healthcare campaign for underprivileged families

Oct 4, 2022

Blogs

The Art of Knowing When to Stop: Early Stopping in AI and Life

The Canvas of Resistance: On Differentiation, Algorithms, and the Mathematics of Justice

Breaking Down Language Barriers: How AI Can Learn to Fix Bangla Grammar

How AI is Learning to Spot Bias (And Why It Matters More Than Ever)