Learning with Privileged Information and Distillation for Multimodal Video Classification
Vittorio Murino, University of Verona, Italy
There's Plenty of Room at the Bottom: Opportunities and Challenges for Microrobotics
Arianna Menciassi, Scuola Superiore Sant'Anna of Pisa, Italy
Physical Models and Machine Learning for Photography and Astronomy
Jean Ponce, Ecole normale supérieure-PSL and New York University, France
Learning with Privileged Information and Distillation for Multimodal Video Classification
Vittorio Murino
University of Verona
Italy
https://www.vittoriomurino.com/
Brief Bio
Vittorio Murino is a full professor at the University of Verona, Italy, and also holds a joint appointment with the University of Genova. He received the Laurea degree in Electronic Engineering in 1989 and the Ph.D. in Electronic Engineering and Computer Science in 1993 from the University of Genova, Italy. From 2009 to 2019, he worked at the Istituto Italiano di Tecnologia in Genova, Italy, as founder and director of the PAVIS (Pattern Analysis and Computer Vision) department, with which he still collaborates as a visiting scientist. From 2019 to 2021, he worked as Senior Video Intelligence Expert at the Ireland Research Centre of Huawei Technologies (Ireland) Co., Ltd. in Dublin. His main research interests include computer vision and machine learning, currently focusing on deep learning approaches, domain adaptation and generalization, and multimodal learning for (human) behavior analysis and related applications, such as video surveillance and biomedical imaging. Prof. Murino is co-author of more than 400 papers published in refereed journals and international conferences, a member of the technical committees of major conferences (CVPR, ICCV, ECCV, ICPR, ICIP, etc.), and guest co-editor of special issues in relevant scientific journals. He is also a member of the editorial boards of the journals Computer Vision and Image Understanding and Machine Vision & Applications. Finally, Prof. Murino is an IEEE Fellow, an IAPR Fellow, and an ELLIS Fellow.
Abstract
Diverse input data modalities can provide complementary cues for several tasks, usually leading to more robust algorithms and better performance. However, while a (training) dataset can be carefully designed to include a variety of sensory inputs, it is often the case that not all modalities are available in real-life (testing) scenarios, where a model has to be deployed. This raises the challenge of how to learn robust representations leveraging multimodal data in the training stage, while considering limitations at test time, such as noisy or missing modalities.
In this talk, I will present a new approach for multimodal video action recognition, developed within generalized distillation, the framework that unifies distillation and privileged information. In particular, we consider the case of learning representations from depth and RGB videos, while relying on RGB data only at test time. We propose a new approach to train a hallucination network that learns to distill depth features through multiplicative connections of spatio-temporal representations, leveraging soft labels and hard labels, as well as the distance between feature maps. Subsequently, we improve the hallucination model to distill depth information via adversarial learning, resulting in a clean approach with no need to balance several losses or tune hyperparameters. We report state-of-the-art results on video action classification on multimodal datasets such as NTU RGB+D, UWA3DII, and Northwestern-UCLA.
There's Plenty of Room at the Bottom: Opportunities and Challenges for Microrobotics
Arianna Menciassi
Scuola Superiore Sant'Anna of Pisa
Italy
Brief Bio
Arianna Menciassi was born in Pisa, Italy, in 1971. She graduated in Physics from the University of Pisa (1995), obtained her PhD (1999) at Scuola Superiore Sant’Anna (SSSA, Pisa, Italy), and has been a visiting professor at several universities in France since 2014 (Pierre and Marie Curie in Paris, Besançon University in Besançon). She is Full Professor of Biomedical Robotics at SSSA and team leader of the “Surgical Robotics & Allied Technologies” Area at The BioRobotics Institute. She has been the Coordinator of the PhD in Biorobotics since 2018, and in 2019 she was appointed Vice-Rector of the Scuola Sant'Anna. Her main research interests involve surgical robotics, microrobotics for biomedical applications, biomechatronic artificial organs, and smart and soft solutions for biomedical devices. She pays special attention to the combination of traditional robotics, targeted therapy, and wireless solutions for therapy (e.g., ultrasound- and magnetic-based). She served on the Editorial Board of the IEEE-ASME Trans. on Mechatronics and was Topic Editor of the International Journal of Advanced Robotic Systems (2013-2020). In 2018 she was appointed Editor of APL Bioengineering and of the IEEE Transactions on Medical Robotics and Bionics. She is Associate Editor for Soft Robotics and has served as Associate Editor of the IEEE Trans. on Robotics since Jan. 2021. She is Co-Chair of the IEEE Technical Committee on Surgical Robotics and serves on the Steering Committee of iSMIT. She received the Well-tech Award (Milan, Italy) for her research on endoscopic capsules, and the Tuscany Region awarded her the Gonfalone D’Argento as one of the 10 best young talents of the region. Recently, she received the KUKA Innovation Award for her work on robot-assisted focused ultrasound.
Abstract
Robotics is becoming more and more pervasive, and people are already familiar with robots moving around them. On the other hand, there is another class of robots operating in areas where traditional robots and human operators cannot work. They are called microrobots and can find multiple applications in healthcare, remote inspection, and environmental remediation. Microrobots require a change of paradigm in design, manufacturing, and control. They are difficult to see, track, and navigate. They have limitations in terms of autonomous powering, and they can even be dangerous if they are lost in delicate environments, such as the vessels of the human body. This talk will present the opportunities and challenges offered by microrobotics and will focus on issues related to navigation, safe control, tracking, and vision of microrobots, based on the research experience of the speaker.
Physical Models and Machine Learning for Photography and Astronomy
Jean Ponce
Ecole normale supérieure-PSL and New York University
France
Brief Bio
Jean Ponce is a Professor at Ecole Normale Supérieure - PSL, where he served as Director of the Computer Science Department from 2011 to 2017, and a Global Distinguished Professor at the Courant Institute of Mathematical Sciences and the Center for Data Science at New York University. He is also the co-founder and CEO of Enhance Lab, a startup that commercializes software for joint demosaicing, denoising, super-resolution and HDR imaging from raw photo bursts. Before joining ENS and NYU, Jean Ponce held positions at Inria, MIT, Stanford, and the University of Illinois at Urbana-Champaign, where he was a Full Professor until 2005.
Jean Ponce is an IEEE and an ELLIS Fellow, a member of the Academia Europaea, and a former Sr. member of the Institut Universitaire de France. He has served as Program and/or General Chair of all three top international Computer Vision Conferences, CVPR (1997 and 2000), ECCV (2008) and ICCV (2023, upcoming). He has also served as Sr. Editor-in-Chief of the International Journal of Computer Vision and Associate Editor for Computer Vision and Image Understanding, Foundations and Trends in Computer Graphics and Vision, the IEEE Transactions on Robotics and Automation, and the SIAM Journal on Imaging Sciences. He currently serves as Scientific Director of the PRAIRIE Interdisciplinary AI Research Institute in Paris.
Jean Ponce is the recipient of two US patents, an ERC advanced grant, the 2016 and 2020 IEEE CVPR Longuet-Higgins prizes, and the 2019 ICML test-of-time award. He is the author of "Computer Vision: A Modern Approach", a textbook translated into Chinese, Japanese, and Russian.
Abstract
We live in an era of data-driven approaches to image analysis, where modeling is sometimes considered obsolete. In this talk, I will propose restoring accurate physical models of image formation to their rightful place next to machine learning in the overall processing and interpretation pipeline, and discuss two applications: super-resolution and high-dynamic range imaging from raw photographic bursts, and exoplanet detection and characterization in direct imaging at high contrast.
This is joint work with Theo Bodrito, Yann Dubois de Mont-Marin, Thomas Eboli, Olivier Flasseur, Anne-Marie Lagrange, Maud Langlois, Bruno Lecouat and Julien Mairal.