This thesis is part of the GUIMUTEIC project, whose aim is to equip museum visitors with an audio guide enhanced by a camera.
It addresses the problem of information access in a mobile environment by automatically providing information about museum artefacts.
To provide this information, we need to know when the visitor desires guidance and what they are looking at, so as to give the correct response.
This raises the issues of identifying points of interest, to determine the context, and of identifying user gestures, to respond to their requests.
As part of our project, the visitor is equipped with an embedded camera.
The goal is to provide a solution that assists the visit by developing vision methods for object identification and gesture detection in first-person videos.
In this thesis, we study the feasibility and value of such visit assistance, as well as the relevance of gestures in the context of interaction with an embedded system.
We propose a new approach to object identification based on Siamese neural networks, which learn image similarity and define regions of interest.
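To make the idea concrete, the sketch below shows a minimal Siamese similarity network in PyTorch, trained with a contrastive loss. The encoder layers, embedding size, and margin are illustrative assumptions, not the exact model developed in this thesis.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseNet(nn.Module):
    """Two inputs pass through one shared encoder; their embedding
    distance measures image similarity."""
    def __init__(self, embedding_dim=128):
        super().__init__()
        # Shared convolutional encoder (weights used for both images).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(64, embedding_dim)

    def embed(self, x):
        h = self.encoder(x).flatten(1)
        return F.normalize(self.fc(h), dim=1)

    def forward(self, x1, x2):
        # Small distance for similar images, large for dissimilar ones.
        return F.pairwise_distance(self.embed(x1), self.embed(x2))

def contrastive_loss(dist, label, margin=1.0):
    # label = 1 for similar pairs, 0 for dissimilar pairs.
    return torch.mean(label * dist.pow(2)
                      + (1 - label) * F.relu(margin - dist).pow(2))
```

Because the same encoder embeds both inputs, a new artefact can be matched against reference images without retraining, which suits an incrementally growing museum collection.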
We also explore the use of small networks for gesture recognition in mobile settings.
For this purpose, we present an architecture using new types of convolution blocks that reduce the number of network parameters and allow deployment on mobile processors.
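The thesis introduces its own block design; as a hedged stand-in, the sketch below shows a MobileNet-style depthwise separable convolution, a standard way to cut parameter counts for mobile processors. The channel counts and layer ordering are assumptions for illustration only.

```python
import torch.nn as nn

class DepthwiseSeparableBlock(nn.Module):
    """Replaces a dense 3x3 convolution with a depthwise 3x3 plus a
    pointwise 1x1, reducing parameters roughly by a factor of the
    kernel area for wide layers."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        # Depthwise: one 3x3 filter per input channel (groups=in_ch).
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3,
                                   stride=stride, padding=1,
                                   groups=in_ch, bias=False)
        # Pointwise: 1x1 convolution mixes information across channels.
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1,
                                   bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))
```

Stacking such blocks keeps the parameter and compute budget low enough for real-time inference on an embedded device, which is the constraint motivating the architecture.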
To evaluate our proposals, we rely on several image-retrieval and gesture corpora, specifically designed to match the constraints of the project.