Person in charge: Raja CHIKY

Prerequisites: II.2413, II.2414

Organization: Lectures + Labs + Presentations

Evaluation: Project (60%) + Presentation (20%) + Labs (20%)

Credits: 5 ECTS


Context

The web has transformed the information landscape by allowing documents to be exchanged and interconnected. Its nature, structure and use have since evolved towards the web of data, also called the Semantic Web.
The Semantic Web is a set of technologies proposed by the World Wide Web Consortium (W3C) for exchanging data. It is a "smart Web", where information is interpreted by computers (browsers) in order to extract what is of interest to users.
In this module, we will introduce existing tools for managing unstructured or semi-structured data. We will also study information retrieval (IR), which involves searching unstructured data sets. Finally, we will learn how to represent and link data to make it easily accessible and reusable.

Objectives

Skills

The aim of the course is to introduce the tools for representing semi-structured documents, information retrieval (to understand how search engines such as Google work), and the Semantic Web. Students will be able to apply them to solve scientific problems related to data interoperability, the exploitation of information by machines, and data querying in various applications.

Knowledge

Concepts
  • XML, XQuery, XPath, XSLT
  • RDF (Resource Description Framework) and RDFS
  • Linked Open Data
  • SPARQL
  • Triple stores
  • OWL
  • TF.IDF (term frequency, inverse document frequency)
  • Elasticsearch, Solr
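
As an illustration of the TF.IDF weighting listed above, here is a minimal sketch in Python. It assumes whitespace tokenization, term frequency normalized by document length, and idf = log(N / df); these are simplifying choices made for the example, and the function name tf_idf and the sample documents are hypothetical. Search engines such as Elasticsearch and Solr use more refined scoring.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumptions for this sketch: whitespace tokenization, term frequency
    divided by document length, and idf = log(N / df).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical three-document collection.
docs = [
    "semantic web data",
    "web search engine",
    "search engine index",
]
scores = tf_idf(docs)
```

In this toy collection, "semantic" appears in a single document and therefore receives a higher weight in the first document than "web", which appears in two. A term present in every document gets a weight of zero.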

 

Know-how
  • Use information retrieval techniques (textual search, textual indexing)
  • Represent documents in XML format and query them with XPath and XQuery
  • Represent raw data as RDF triples
  • Use SPARQL to find relevant information
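
To give a flavour of two of the skills above, the sketch below uses only the Python standard library. The first part queries a small XML document with the limited XPath subset supported by xml.etree.ElementTree. The second part stores data as (subject, predicate, object) triples and matches a single pattern with wildcards, which loosely mimics one SPARQL triple pattern; it is not a SPARQL engine, and real work would rely on a triple store. The catalogue, the element and attribute names, the "ex:" identifiers and the match function are all made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalogue; element and attribute names are invented.
XML_DOC = """
<library>
  <book year="2012"><title>Web Data Management</title></book>
  <book year="2015"><title>Elasticsearch</title></book>
  <book year="2014"><title>Solr in Action</title></book>
</library>
"""

root = ET.fromstring(XML_DOC)

# Path expression: every <title> nested directly in a <book>.
all_titles = [t.text for t in root.findall("./book/title")]

# Attribute predicate: only books whose year attribute is 2015.
titles_2015 = [t.text for t in root.findall("./book[@year='2015']/title")]

# The same data as RDF-like triples (plain tuples, invented "ex:" names).
triples = [
    ("ex:WebDataManagement", "ex:publishedIn", "2012"),
    ("ex:Elasticsearch", "ex:publishedIn", "2015"),
    ("ex:SolrInAction", "ex:publishedIn", "2014"),
    ("ex:SolrInAction", "ex:publisher", "ex:Manning"),
]


def match(pattern, data):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        triple for triple in data
        if all(p is None or p == value for p, value in zip(pattern, triple))
    ]


# "Which resources were published in 2015?"
published_2015 = [s for s, _, _ in match((None, "ex:publishedIn", "2015"), triples)]
```

Both queries answer the same question from two representations of the same data: the XML version navigates a tree, while the triple version filters a flat set of statements.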

Pedagogical approach

The module is organized as follows: one lecture or tutorial per week. Students must complete a final project involving either the implementation of a knowledge management tool or a bibliographic study.

 

References 

Web Data Management, Serge Abiteboul, Ioana Manolescu, Philippe Rigaux, Marie-Christine Rousset and Pierre Senellart, Cambridge University Press, 2012

Elasticsearch: The Definitive Guide, C. Gormley and Z. Tong, O'Reilly, 2015

Solr in Action, Trey Grainger and Timothy Potter, Manning Publications, 2014