Thesis code: 20006

Thesis Type: M.Sc. thesis in Machine Learning, Data Science, Computer Science, Mathematics, or equivalent

Research Area: Data Science for Industrial and Societal Application

Requirements:
  • Knowledge of Python
  • Software development skills
  • Basic knowledge of data science, including data analysis, data processing, and machine learning
  • Basic knowledge of Natural Language Processing



The amount of information available today often exceeds what can be inspected manually in the time available. This is especially true in decision-making contexts, which require a broad understanding of the domain involved. Part of this knowledge is hidden in unstructured sources, such as text, which can be processed and transformed into logical structures through Natural Language Processing (NLP).

The objective of this thesis is to study and implement machine learning and/or deep learning algorithms for natural language processing. The proposed algorithms will be trained on open data available through APIs. The candidate will be responsible both for collecting the data and for evaluating which algorithms are best suited to the case study. The goal is to extract concepts related to numerical indicators from the examined documents and to infer new knowledge through statistical analysis.

The work must be carried out using NLP algorithms, including deep learning algorithms, implemented with a popular framework (e.g. TensorFlow, PyTorch, Keras).

Contact: send a resume with the list of exams attached, specifying the thesis code and title.