Research and Development in Information Technologies
With the growth of sensor networks, web logs, social networks, and interconnected applications, data are generated continuously, at high speed, and in different formats, which exceeds the processing capabilities of traditional technologies.
The RDI team seeks to define, through case studies, new approaches for resource management, data mining and visualization, and machine learning methods.
Detecting communities in social networks
Social networks such as Facebook, Twitter, Flickr, and LinkedIn have grown enormously in recent years, both in size and complexity, making their direct analysis difficult, if not impossible.
The RDI team develops methods for partitioning social networks in order to detect communities and understand their structure.
Identifying the community structure of complex networks makes it possible to extract features useful for visualizing or predicting phenomena such as the diffusion of information, and also to locate the network's nerve centers so that they can be secured.
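As an illustration of what partitioning a network into communities means in practice, the sketch below scores a candidate partition with Newman's modularity, a widely used quality measure for community structure. It is not presented as the RDI team's method; the function name `modularity` and the toy graph (two triangles joined by one edge) are assumptions made for the example.

```python
def modularity(edges, communities):
    """Newman modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    # Map every node to the index of the community it belongs to.
    community_of = {node: i for i, group in enumerate(communities) for node in group}
    internal = [0] * len(communities)  # edges with both ends inside community i
    degree = [0] * len(communities)    # summed degree of the nodes in community i
    for a, b in edges:
        ca, cb = community_of[a], community_of[b]
        degree[ca] += 1
        degree[cb] += 1
        if ca == cb:
            internal[ca] += 1
    # Q = sum over communities of (internal edge share - expected share).
    return sum(internal[i] / m - (degree[i] / (2 * m)) ** 2
               for i in range(len(communities)))


# Hypothetical toy network: two triangles linked by the single edge c-d.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("c", "d"),
         ("d", "e"), ("e", "f"), ("d", "f")]

two_groups = modularity(edges, [{"a", "b", "c"}, {"d", "e", "f"}])
one_group = modularity(edges, [{"a", "b", "c", "d", "e", "f"}])
print(two_groups, one_group)
```

Splitting the toy network into its two triangles scores higher than keeping it as a single group, which is the signal a partitioning method looks for when it searches for communities.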
Machine learning for satellite imagery analysis
With the influx of satellite data, the analysis and automatic interpretation of very-high-resolution images has become a major challenge with many applications, including automatic mapping, automatic object detection, and the study and monitoring of urban or environmental changes.
The RDI team works on the design of machine learning methods to automate the processing of these images.
The aim is to design, adapt and combine unsupervised learning methods and classification methods based on deep neural networks.
Distribution of semantic data flows
Sensor networks for drinking water distribution, social networks, and geolocation information provide inherently heterogeneous data.
To apply the same processing to all of these sources, it becomes essential to translate the data into a common language.
The technologies of Web 3.0, also known as the Web of Data, make it possible to translate massive data flows into semantic flows, in order to homogenize them and facilitate more advanced processing such as reasoning or inference.
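To make the idea of translating raw data into a semantic flow concrete, here is a minimal sketch that serializes one sensor reading as RDF triples in N-Triples syntax. The vocabulary under `http://example.org/`, the field names of the reading, and the function name `reading_to_triples` are all hypothetical; a real deployment would use an agreed ontology.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"


def reading_to_triples(reading, base="http://example.org/"):
    """Serialize one sensor reading (a dict) as a list of N-Triples lines."""
    subject = f"<{base}observation/{reading['id']}>"
    sensor = f"<{base}sensor/{reading['sensor']}>"
    value = reading["value"]
    time = reading["time"]
    return [
        # Link the observation to the sensor that produced it.
        f"{subject} <{base}sensor> {sensor} .",
        # Typed literals carry the measured value and its timestamp.
        f'{subject} <{base}value> "{value}"^^<{XSD}double> .',
        f'{subject} <{base}time> "{time}"^^<{XSD}dateTime> .',
    ]


# Hypothetical reading from a water-pressure sensor.
triples = reading_to_triples(
    {"id": "42", "sensor": "p1", "value": 3.5, "time": "2020-01-01T00:00:00Z"}
)
for line in triples:
    print(line)
```

Once every source is expressed as subject, predicate, object statements of this kind, readings from different origins can be merged and queried with the same tools.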
To process large volumes of semantic flows in real time, the RDI team also works on optimized algorithms for data distribution and processing.
Recommendation systems
Recommendation systems suggest products that users might like, based on their profile, browsing history, or online behavior.
The RDI team is working on recommendation methods that scale to Big Data volumes and address the cold-start problem, that is, what to recommend to an unknown user.
The “Solimobile” project made it possible to recommend services to people in need via personalized directories. The “Fiora” project aims to provide personalized recommendations in e-nutrition and e-tourism by implementing a hybrid system that combines collaborative filtering and case-based reasoning.
Resource management in self-service bike systems
Self-service bike systems are an environmentally friendly means of transport, but the availability of resources (whether the bikes themselves or free docking spaces at the stations) can be difficult to manage.
The RDI team is working to improve the reliability and attractiveness of these systems.
Based on the analysis of massive data from Vélib’ trips (the Paris self-service bicycle system) over several years, the RDI team studies the imbalance of some stations due to their strong attractiveness, the size of the bike fleet, and the influence of the introduction of electric bicycles.
Managing cloud resources
Cloud computing platforms are an integral part of today’s digital economy. These systems aggregate resources (storage, computing) into data centers, concentrating thousands of machines in one place.
The allocation of hardware resources to applications is a major challenge for cloud systems because it affects their power consumption.
Using distributed classification and optimization techniques, the RDI team develops efficient schedulers that manage the allocation of resources by anticipating changes and periods of high or low consumption, in order to minimize the number of machines used in these data centers.
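The goal of using as few machines as possible can be illustrated with a classic bin packing heuristic, first-fit decreasing. This is only a sketch of the underlying optimization problem, not the team's scheduler: it ignores prediction and distribution entirely, and the function name, the single-resource model, and the sample demands are assumptions.

```python
def first_fit_decreasing(demands, capacity):
    """Place demands on machines of equal capacity, largest demand first."""
    machines = []  # demands placed on each machine
    free = []      # remaining capacity of each machine
    for demand in sorted(demands, reverse=True):
        if demand > capacity:
            raise ValueError(f"demand {demand} exceeds machine capacity {capacity}")
        for i, room in enumerate(free):
            if demand <= room:
                # Reuse the first machine that still has enough room.
                machines[i].append(demand)
                free[i] -= demand
                break
        else:
            # No running machine fits: switch on a new one.
            machines.append([demand])
            free.append(capacity - demand)
    return machines


# Hypothetical application demands, in arbitrary resource units.
placement = first_fit_decreasing([4, 8, 1, 4, 2, 1], capacity=10)
print(len(placement), placement)
```

Here six applications fit on two machines, so any additional machines could be powered down. A scheduler that anticipates load changes would rerun this kind of placement ahead of the predicted peaks and troughs.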