
Codage Ambisonique pour les Communications Immersives (Ambisonic Coding for Immersive Communications)

Abstract : This thesis is set in the context of the growing spread of immersive content. Over the last few years, immersive audio recording and playback technologies have gained momentum and become increasingly popular. New codecs are needed to handle these spatial audio formats, especially for communication applications. There are several ways to represent spatial audio scenes; this thesis focuses on First-Order Ambisonics. The first part of our research focused on improving multi-mono coding by decorrelating the ambisonic signal components before the multi-mono coding. To guarantee signal continuity between frames, new efficient quantization mechanisms are proposed. In the second part, we propose a new coding concept that uses a power map to recreate the original spatial image. Based on this concept, we propose two compression methods. The first is a post-processing step that limits the spatial distortion of the decoded signal; the spatial correction is based on the difference between the original and the decoded spatial images. This post-processing is then extended into a parametric coding method. The last part of the thesis presents a more exploratory method: audio signal compression with neural networks, inspired by image compression models that use variational autoencoders.
Complete list of metadata
Contributor : Pierre MAHE
Submitted on : Thursday, March 17, 2022 - 4:50:10 PM
Last modification on : Friday, June 3, 2022 - 10:24:20 AM
Long-term archiving on : Saturday, June 18, 2022 - 7:39:05 PM


Files produced by the author(s)


  • HAL Id : tel-03612363, version 1



Pierre Mahe. Codage Ambisonique pour les Communications Immersives. Sound [cs.SD]. Université de La Rochelle, 2022. French. ⟨tel-03612363⟩


