LARGE LANGUAGE MODELS
In this course, students will explore the theoretical foundations of Distributional Semantics and its connection with the Transformer architecture, the workhorse of modern NLP and AI.
Students will gain hands-on experience in building classical distributional semantic models, such as Word2Vec, before diving into the full development life cycle of decoder-only architectures, from pre-training to fine-tuning and evaluation.
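As a taste of the hands-on component, the sketch below trains a toy Word2Vec model with the gensim library. The corpus and hyperparameters are illustrative assumptions, not course material.

```python
# Minimal Word2Vec sketch (illustrative only; corpus and settings are hypothetical).
from gensim.models import Word2Vec

# A toy corpus: each document is a list of tokens.
corpus = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "lay", "on", "the", "rug"],
    ["cats", "and", "dogs", "are", "common", "pets"],
]

# sg=1 selects the skip-gram objective; sg=0 would use CBOW instead.
model = Word2Vec(corpus, vector_size=50, window=2, min_count=1, sg=1, epochs=50)

# Learned embeddings live in model.wv; distributionally similar words
# end up with nearby vectors.
print(model.wv.most_similar("cat", topn=3))
```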
The course concludes with specialized modules on advanced topics, including alignment techniques, parameter-efficient fine-tuning, and key open challenges facing the research community.
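For a flavour of the parameter-efficient fine-tuning module, here is a minimal LoRA sketch using Hugging Face's peft library; the base model (gpt2) and adapter settings are assumptions for illustration, not course materials.

```python
# Minimal LoRA sketch (illustrative only; model name and settings are assumptions).
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# A small decoder-only model keeps the example lightweight.
base = AutoModelForCausalLM.from_pretrained("gpt2")

# Inject low-rank adapters into the attention projections; only the
# adapter weights are trained, while the base model stays frozen.
config = LoraConfig(
    r=8,                        # rank of the low-rank update matrices
    lora_alpha=16,              # scaling factor applied to the update
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # typically well under 1% of total parameters
```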
- In collaboration with: Andrea Pedrotti (ISTI-CNR)
- Prerequisites: Python proficiency, basic linear algebra
- Estimated time: ≈ 6h