Thesis Code: 26004
Thesis Type: M.Sc. in Machine Learning, Data Science, Computer Science, Mathematics, Telecommunications, or similar
Research Area: AI, Data and Space
Requirements
- M.Sc. in Machine Learning, Data Science, Computer Science, Mathematics, Telecommunications, or similar
- Knowledge of Python
- Software development skills
- Knowledge of signals
- Basic knowledge of natural language modelling and semantic embedding
- Basic knowledge of retrieval
Description
Retrieval-Augmented Generation (RAG) is arguably one of the technologies with the most traction at the moment. Most RAG systems, however, struggle to achieve their full potential because they rely on a fixed retrieval granularity, typically retrieving passages or chunks of uniform size. This approach often leads to mismatches between the information need and the retrieved evidence: broad questions like “What are these documents about?” demand high-level summaries, while specific factoid queries like “Where was John born?” require fine-grained snippets. The challenge is to design a retrieval system that can dynamically adapt to the level of detail a query requires. This thesis asks: how can we model and predict query intent to select the appropriate retrieval granularity, and how does such an adaptive system affect answer accuracy, coherence, and efficiency compared to fixed-granularity RAG methods?
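To make the idea concrete, the following is a minimal sketch of granularity routing, not the method the thesis is expected to produce. Everything in it is an assumption made for illustration: the three granularity levels, the cue-word lists, the function names `predict_granularity` and `retrieve`, the toy index, and the word-overlap scoring. A real system would presumably replace the keyword heuristic with a learned intent model and the overlap score with embedding-based retrieval.

```python
import re

# Hypothetical cue words; a real system would learn query intent from data.
BROAD_CUES = ("about", "overview", "summarize", "summary", "main topics")
FACTOID_STARTS = ("who", "where", "when", "which", "how many")


def predict_granularity(query: str) -> str:
    """Map a query to one of three assumed levels: summary, passage, sentence."""
    q = query.lower().strip()
    if any(cue in q for cue in BROAD_CUES):
        return "summary"    # broad question: prefer high-level summaries
    if q.startswith(FACTOID_STARTS):
        return "sentence"   # factoid question: prefer fine-grained snippets
    return "passage"        # default: fixed-size chunks, as in a standard RAG setup


def tokens(text: str) -> set:
    """Lowercased word tokens, used for the toy overlap score."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(query: str, index: dict, top_k: int = 1):
    """Pick a granularity for the query, then rank that level's units by word overlap."""
    level = predict_granularity(query)
    q_tokens = tokens(query)
    ranked = sorted(index[level], key=lambda unit: len(q_tokens & tokens(unit)), reverse=True)
    return level, ranked[:top_k]


# Toy index (made-up text): the same content stored at three granularities.
toy_index = {
    "summary": ["The documents describe the life and career of John Smith."],
    "passage": [
        "John Smith was born in Turin in 1980. He later studied mathematics in Milan.",
        "After graduating, John worked on satellite data.",
    ],
    "sentence": [
        "John Smith was born in Turin in 1980.",
        "He later studied mathematics in Milan.",
        "John worked on satellite data.",
    ],
}

if __name__ == "__main__":
    for question in ("What are these documents about?", "Where was John born?"):
        print(question, "->", retrieve(question, toy_index))
```

With the two example questions from the description, the broad one is routed to the summary level and the factoid one to the sentence level. A fixed-granularity baseline would correspond to always returning `"passage"`.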
Contacts: send a resume with the list of exams attached to lorenzo.bongiovanni@linksfoundation.com, specifying the thesis code and title.
