ROBOVIS 2021 Abstracts


Area 1 - Computer Vision

Full Papers
Paper Nr: 4
Title:

Towards Fast and Automatic Map Initialization for Monocular SLAM Systems

Authors:

Blake Troutman and Mihran Tuceryan

Abstract: Simultaneous localization and mapping (SLAM) is a widely adopted approach for estimating the pose of a sensor with 6 degrees of freedom. SLAM works by using sensor measurements to initialize and build a virtual map of the environment, while simultaneously matching succeeding sensor measurements to entries in the map to perform robust pose estimation of the sensor on each measurement cycle. Markerless, single-camera systems that utilize SLAM usually involve initializing the map by applying one of a few structure-from-motion approaches to two frames taken by the system at different points in time. However, knowing when the feature matches between two frames will yield enough disparity, parallax, and/or structure for a good initialization to take place remains an open problem. To make this determination, we train a number of logistic regression models on summarized correspondence data for 927 stereo image pairs. Our results show that these models classify with significantly higher precision than the current state-of-the-art approach in addition to remaining computationally inexpensive.

Paper Nr: 10
Title:

Accurate 6D Object Pose Estimation and Refinement in Cluttered Scenes

Authors:

Yixiang Jin, John A. Rossiter and Sandor M. Veres

Abstract: Estimating the 6D pose of objects is an essential part of a robot’s ability to perceive its environment. This paper proposes a method for detecting a known object and estimating its 6D pose from a single RGB image. Unlike most state-of-the-art methods, which deploy PnP algorithms for estimating the 6D pose, the method here outputs the 6D pose in one step. To obtain estimation accuracy comparable to RGB-D based methods, an efficient refinement algorithm, called contour alignment (CA), is presented; this can increase the predicted 6D pose accuracy significantly. We evaluate the new method on two widely used benchmarks, LINEMOD for single-object pose estimation and Occlusion-LINEMOD for multi-object pose estimation. The experiments show that the proposed method surpasses other state-of-the-art prediction approaches.

Paper Nr: 21
Title:

Real-time Detection of 2D Tool Landmarks with Synthetic Training Data

Authors:

Bram Vanherle, Jeroen Put, Nick Michiels and Frank Van Reeth

Abstract: In this paper a deep learning architecture is presented that can, in real time, detect the 2D locations of certain landmarks of physical tools, such as a hammer or screwdriver. To avoid the labor of manual labeling, the network is trained on synthetically generated data. Training computer vision models on computer generated images, while still achieving good accuracy on real images, is a challenge due to the difference in domain. The proposed method uses an advanced rendering method in combination with transfer learning and an intermediate supervision architecture to address this problem. It is shown that the model presented in this paper, named Intermediate Heatmap Model (IHM), generalizes to real images when trained on synthetic data. To avoid the need for an exact textured 3D model of the tool in question, it is shown that the model will generalize to an unseen tool when trained on a set of different 3D models of the same type of tool. IHM is compared to two existing approaches to keypoint detection and it is shown that it outperforms those at detecting tool landmarks, trained on synthetic data.

Short Papers
Paper Nr: 26
Title:

Dual-context Identification based on Geometric Descriptors for 3D Registration Algorithm Selection

Authors:

Polycarpo S. Neto, José M. Soares, Michela Mulas and George P. Thé

Abstract: In 3D reconstruction applications, matching between corresponding point clouds is commonly resolved using variants of the Iterative Closest Point (ICP) algorithm. However, ICP and its variants suffer from some limitations, functioning properly only in contexts with a well-behaved data distribution; outdoor scenes, for example, pose many challenges. Indeed, the literature suggests that the ability of some of these algorithms to find a match is reduced by the presence of geometric disorder in the scene. This article presents a method based on the characterization of the eigentropy and omnivariance properties of clouds to indicate which variant of ICP is best suited to each context considered here, namely object or outdoor scene alignment. In addition to the context selector, we suggest a partitioning step prior to alignment, which in most cases reduces the computational cost. In summary, the proposal as a whole worked satisfactorily as a multipurpose registration technique, correcting the pose of data from different contexts, and is thus useful for computer vision and robotics applications.
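
As a rough illustration of the two geometric descriptors named in this abstract, the sketch below computes eigentropy and omnivariance from the three eigenvalues of a point neighbourhood's covariance matrix. It assumes the commonly used definitions over eigenvalues normalised to sum to 1 (some references use unnormalised eigenvalues for omnivariance); the function name `eigen_features` is hypothetical, and the paper's exact formulation and its selection rule are not given here.

```python
import math


def eigen_features(eigenvalues):
    """Return (eigentropy, omnivariance) for three covariance eigenvalues.

    Assumption: the eigenvalues come from the 3x3 covariance matrix of a
    point neighbourhood and are normalised here so that they sum to 1.
    """
    if len(eigenvalues) != 3 or any(v < 0 for v in eigenvalues):
        raise ValueError("expected three non-negative eigenvalues")
    total = sum(eigenvalues)
    if total <= 0:
        raise ValueError("eigenvalues must not all be zero")
    e = [v / total for v in eigenvalues]
    # Eigentropy: Shannon entropy of the normalised eigenvalues (0 * ln 0 := 0).
    eigentropy = -sum(v * math.log(v) for v in e if v > 0)
    # Omnivariance: geometric mean of the normalised eigenvalues.
    omnivariance = (e[0] * e[1] * e[2]) ** (1.0 / 3.0)
    return eigentropy, omnivariance
```

Under these definitions, an isotropic (disordered) neighbourhood maximises both values, while a perfectly linear one drives both to zero.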

Paper Nr: 28
Title:

Mesoscale Patterns Identification through SST Image Processing

Authors:

Marco Reggiannini, João Janeiro, Flávio Martins, Oscar Papini and Gabriele Pieri

Abstract: Mesoscale marine phenomena are important features to understand and include within predictive models, which provide valuable information for proper environmental policy making. For example, the rearrangement of organic substances, consequent to the dynamics of the water masses affected by these phenomena, significantly modifies the current condition of local habitats. Indeed, it may facilitate the onset of non-resident living species at the expense of resident ones, eventually affecting related human activities, such as commercial fishery. The objective of this work is the detection and identification of mesoscale events in terms of the specific marine surface patterns observed throughout such events, e.g. water filaments, counter-currents, and meanders due to upwelling wind stress. These phenomena can be studied and monitored through the analysis of Sea Surface Temperature images captured by satellite missions such as Metop and MODIS Terra/Aqua. A quantitative description of such events is proposed, based on dedicated algorithms that extract temporal and spatial features from the images and exploit them to provide a signature discriminating between different observed scenarios. Preliminary results of applying the proposed approach to a dataset covering the southwestern region of the Iberian Peninsula are presented.

Paper Nr: 8
Title:

Video-based Car Make, Model and Year Recognition

Authors:

Diana A. George, Omar M. Shehata, Hossam A. El Munim and Sherif Hammad

Abstract: Fine-grained car recognition requires extracting discriminating features and certain car parts which can be used to distinguish between similar cars. This paper presents a full system for car make, model and year recognition in videos. We followed a multi-step approach for automatically detecting, tracking and recognizing cars using a deep Convolutional Neural Network (CNN). We also focused on the recognition stage, where we compared four state-of-the-art Convolutional Neural Networks and adapted them for extracting those features. Moreover, we modified the InceptionResnetv2 network, raising the Top-1 accuracy to 0.8617 and the Top-5 accuracy to 0.9751.
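
For readers unfamiliar with the Top-1 and Top-5 figures quoted in this abstract, the following minimal sketch shows how top-k accuracy is usually computed from per-class scores. The function name `top_k_accuracy` and the toy data in the usage note are illustrative assumptions, not taken from the paper.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes.

    scores: list of per-sample lists of class scores.
    labels: list of true class indices, one per sample.
    """
    if len(scores) != len(labels) or not labels:
        raise ValueError("scores and labels must be non-empty and equally long")
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)
```

With three samples scored `[[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]` and true labels `[1, 1, 0]`, only the first sample is correct at k=1, while the first two are correct at k=2.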

Paper Nr: 13
Title:

Coordinate Attention UNet

Authors:

Quoc An Dang and Duc Dung Nguyen

Abstract: In this paper, we propose an alternative architecture based on the UNet that utilizes an attention module. Our model addresses the context loss and feature dilution caused by the sampling operations of the UNet model, using the enhancement ability of attention. Furthermore, we apply one of the latest attention modules, the Coordinate Attention module, to our model and propose a modification of this module to improve its effectiveness for Magnetic Resonance Imaging (MRI) scans.

Area 2 - Intelligent Systems

Full Papers
Paper Nr: 2
Title:

Generating Synthetic Training Data for Deep Learning-based UAV Trajectory Prediction

Authors:

Stefan Becker, Ronny Hug, Wolfgang Huebner, Michael Arens and Brendan T. Morris

Abstract: Deep learning-based models, such as recurrent neural networks (RNNs), have been applied to various sequence learning tasks with great success. Following this, these models are increasingly replacing classic approaches in object tracking applications for motion prediction. On the one hand, these models can capture complex object dynamics with less modeling required, but on the other hand, they depend on a large amount of training data for parameter tuning. To this end, we present an approach for generating synthetic trajectory data of unmanned aerial vehicles (UAVs) in image space. Since UAVs, or rather quadrotors, are dynamical systems, they cannot follow arbitrary trajectories. With the prerequisite that UAV trajectories fulfill a smoothness criterion corresponding to a minimal change of higher-order motion, methods for planning aggressive quadrotor flights can be utilized to generate optimal trajectories through a sequence of 3D waypoints. By projecting these maneuver trajectories, which are suitable for controlling quadrotors, to image space, a versatile trajectory data set is realized. To demonstrate the applicability of the synthetic trajectory data, we show that an RNN-based prediction model trained solely on the generated data can outperform classic reference models on a real-world UAV tracking dataset. The evaluation is done on the publicly available ANTI-UAV dataset.

Short Papers
Paper Nr: 33
Title:

Intelligent Classification of Different Types of Plastics using Deep Transfer Learning

Authors:

Anthony P. Chazhoor, Manli Zhu, Edmond L. Ho, Bin Gao and Wai Lok Woo

Abstract: Plastic pollution has affected millions globally. Research shows tiny plastics in the food we eat, the water we drink, and even in the air we breathe. An average human ingests 74,000 microplastics every year, which significantly affects the health of living beings. This pollution must be managed before it severely impacts the world. We have thoroughly compared three state-of-the-art models on the WaDaBa dataset, which contains different types of plastics. These models can classify different types of plastic waste that can be reused or recycled, thus limiting their wastage.

Area 3 - Robotics

Full Papers
Paper Nr: 23
Title:

Manipulating Deformable Objects with a Dual-arm Robot

Authors:

Stéphane Caro, Christine Chevallereau and Alberto Remus

Abstract: Competition in all sectors requires companies to be increasingly flexible to market changes and the assembly industry is no exception. The impact of this work concerns aircraft production, as well as other fields. The main focus is on modelling and control techniques to carry out assembly tasks involving deformable parts, by exploiting a multi-robot system. Specifically, two robot arms are used to move a light and deformable part in order to adapt its shape for an assembly operation. A vision system is used, assisted by markers. Furthermore, the stability of the proposed controller is analyzed and experimental results are given.

Paper Nr: 25
Title:

On-orbit Free-floating Manipulation using a Two-arm Robotic System

Authors:

Jose L. Ramon, Jorge Pomares and Leonard Felicetti

Abstract: A direct visual-servoing algorithm is proposed for the control of a space-based two-arm manipulator. The scenario under consideration assumes that one of the arms performs the manipulation task while the second one has an in-hand camera to observe the target zone of manipulation. The algorithm uses both the camera images and the force/torque measurements as inputs to calculate the control action that moves the arms to perform a manipulation task. The algorithm integrates the multibody dynamics of the robotic system in a visual servoing framework that uses de-localized cameras. Impedance control is then used to compensate for possible contact reactions when the end effector touches and operates the target body. Numerical results demonstrate the suitability of the proposed algorithm in specific tasks used in on-orbit servicing operations.

Short Papers
Paper Nr: 6
Title:

Portable Safety System using Radar for Flexible Human-Robot-Collaboration in a Real Semi-automated Production Line

Authors:

Christian Bergner, Ferhat Akan, Ronald Schmidt-Vollus, Peter Heß and Christian Deuerlein

Abstract: The implementation of a reliable vision system for a human-robot environment is a key issue for the collaborative production industry. The core challenge of human-robot collaboration is to ensure safety. Furthermore, a flexible safety system is required for frequently changing applications and work areas. This paper focuses on the development and application of a workspace monitoring system for safeguarding using radar sensors. The human-robot collaboration cell is designed to enable flexible integration regardless of the work location. This results in higher productivity. Since no separating protective devices are provided for the cell, safety-oriented monitoring and control by suitable safety sensors is required. Methods to minimize the necessary safety distance are presented. The experimental validation shows that this safety system with radar sensors provides reliable workspace monitoring. The high robustness, reactivity and flexibility of the safety concept make this system usable for collaborative tasks in a real industrial environment.

Paper Nr: 7
Title:

Evaluating Robot Posture Control and Balance by Comparison to Human Subjects using Human Likeness Measures

Authors:

Vittorio Lippi, Christoph Maurer and Thomas Mergner

Abstract: Posture control and balance are basic requirements for a humanoid robot performing motor tasks like walking and interacting with the environment. For this reason, posture control is one of the elements taken into account when evaluating the performance of humanoids. In this work, we describe and analyze a performance indicator based on the comparison between the body sway of a robot standing on a moving surface and that of healthy subjects performing the same experiment. This approach is oriented here to the evaluation of human likeness. The measure is tested with three human-inspired humanoid posture control systems: the independent channel (IC), the disturbance identification and compensation (DEC), and the eigenmovement (EM) control. The potential and the limitations of such human-inspired humanoid control mechanisms are then discussed.

Paper Nr: 11
Title:

An Unmanned Aerial Carrier and Anchoring Mechanism for Transporting Companion UAVs

Authors:

Yiyong Gou, Lucas Dahl, Jan Krüger, Cavid Karca, Dean Boonen and Rico Möckel

Abstract: This paper demonstrates an unmanned aerial carrier as well as a new anchoring mechanism for connecting and transporting companion unmanned aerial vehicles (UAVs). Establishing this platform presents unique challenges, including the requirements for precise localization of the platform, a real-time environment mapping system, a robust flight control approach, a docking safety mechanism, and a reliable anchor system for the companion UAV. To obtain the positioning information, a tightly-coupled visual-inertial optimization based odometry is implemented with a fisheye camera and an inertial measurement unit. A 3D map is updated in real time using an Octomap framework. A nonlinear position model predictive controller cascaded with a DJI attitude controller is implemented for the flight control. In addition, we designed a lightweight anchoring mechanism for safe landing and reliable transportation of the companion UAV. Real-world experimental results suggest that the transportation system is a viable approach to transporting the companion UAV, and that the proposed anchoring mechanism allows for reliable operation.

Paper Nr: 12
Title:

A Comparative Study of Ego-centric and Cooperative Perception for Lane Change Prediction in Highway Driving Scenarios

Authors:

Sajjad Mozaffari, Eduardo Arnold, Mehrdad Dianati and Saber Fallah

Abstract: Prediction of the manoeuvres of other vehicles can significantly improve the safety of automated driving systems. A manoeuvre prediction algorithm estimates the likelihood of a vehicle’s next manoeuvre using the motion history of the vehicle and its surrounding traffic. Several existing studies assume full observability of the surrounding traffic by utilising trajectory datasets collected by top-down view infrastructure cameras. However, in practice, automated vehicles observe the driving environment using egocentric perception sensors (i.e., onboard lidar or camera) which have limited sensing range and are subject to occlusions. This study firstly analyses the impact of these limitations on the performance of lane change prediction. To overcome these limitations, automated vehicles can cooperate in observing the environment by sharing their perception data through V2V communication. While it is intuitively expected that cooperation among vehicles can improve environment perception by individual vehicles, the other contribution of this work is to quantify the potential impacts of cooperation. To this end, we propose two perception models used to generate egocentric and cooperative perception dataset variants from a set of uniform scenarios in a benchmark dataset. This study can help system designers weigh the costs and benefits of alternative perception solutions for lane change prediction.

Paper Nr: 20
Title:

Implicitly using Human Skeleton in Self-supervised Learning: Influence on Spatio-temporal Puzzle Solving and on Video Action Recognition

Authors:

Mathieu Riand, Laurent Dollé and Patrick Le Callet

Abstract: In this paper, we study the influence of adding skeleton data on top of human action videos when performing self-supervised learning and action recognition. We show that adding this information without additional constraints actually hurts the accuracy of the network; we argue that the added skeleton is not taken into account by the network and is instead seen as noise masking part of the natural image. We present initial results on puzzle solving and video action recognition to support this hypothesis.

Paper Nr: 22
Title:

Touch Detection with Low-cost Visual-based Sensor

Authors:

Julio Castaño-Amoros, Pablo Gil and Santiago Puente

Abstract: Robotic manipulation remains an unsolved problem. It involves many complex aspects, for example, tactile perception of different objects and materials, grasping control to plan the robotic hand pose, etc. Most previous work on this topic has used expensive sensors, which makes application in industry difficult. In this work, we propose a grip detection system using a low-cost visual-based tactile sensor known as DIGIT, mounted on a ROBOTIQ 2F-140 gripper. We show that a Deep Convolutional Network is able to detect contact or no contact. Having captured almost 12000 images with and without contact from different objects, we achieve 99% accuracy on previously unseen samples in the best scenario. As a result, this system will allow us to implement a grasping controller for the gripper.

Paper Nr: 27
Title:

Segmentation of Fish in Realistic Underwater Scenes using Lightweight Deep Learning Models

Authors:

Gordon Böer, Rajesh Veeramalli and Hauke Schramm

Abstract: The semantic segmentation of fish in real underwater scenes is a challenging task and an important prerequisite for various processing steps. With a good segmentation result, it becomes possible to automatically extract the fish contour and derive morphological features, both of which can be used for species identification and fish biomass assessment. In this work, two deep learning models, DeepLabV3 and PSPNet, are investigated for their applicability to fish segmentation in a fish stock monitoring application with low-light cameras. By pruning these networks and employing a different encoder, they become more suitable for systems with limited hardware, such as remotely operated or autonomously operated underwater vehicles. Both segmentation models are trained and evaluated on a novel dataset of underwater images showing Gadus morhua in its natural behavior. On a challenging test set, which includes fish recorded under difficult visibility conditions, the PSPNet performs best, achieving an average pixel accuracy of 96.8% and an intersection-over-union between the predicted and the target mask of 73.8%. It achieves this with a very limited parameter set of 94,393 trainable parameters.
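
The two metrics reported in this abstract can be sketched for a single binary mask as follows. This is a minimal illustration under assumed definitions (overall pixel accuracy, and IoU of the foreground class); the paper's "average pixel accuracy" may be averaged differently, and the function name `pixel_accuracy_and_iou` is hypothetical.

```python
def pixel_accuracy_and_iou(pred, target):
    """Return (pixel accuracy, foreground IoU) for two binary masks.

    pred, target: equally sized 2D lists of 0/1 values (1 = fish pixel).
    """
    tp = fp = fn = tn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1  # predicted fish, is fish
            elif p:
                fp += 1  # predicted fish, is background
            elif t:
                fn += 1  # missed fish pixel
            else:
                tn += 1  # correctly predicted background
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("masks must contain at least one pixel")
    accuracy = (tp + tn) / total
    union = tp + fp + fn
    # Convention assumed here: two empty masks count as a perfect match.
    iou = tp / union if union else 1.0
    return accuracy, iou
```

Because correctly labelled background pixels count toward accuracy but not toward IoU, accuracy is typically the higher of the two when the object covers a small part of the image.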

Paper Nr: 30
Title:

Path Following with Deep Reinforcement Learning for Autonomous Cars

Authors:

Khaled Alomari, Ricardo C. Mendoza, Daniel Goehring and Raúl Rojas

Abstract: Path-following for autonomous vehicles is a challenging task. Choosing the appropriate controller when applying typical linear/nonlinear control theory methods demands intensive investigation of the dynamics and kinematics of the system. Furthermore, the non-linearity of the system’s dynamics, the complexity of its analytical description, disturbances, and the influence of sensor noise raise the need for adaptive control methods to reach optimal performance. In this paper, a Deep Reinforcement Learning (DRL) approach with Deep Deterministic Policy Gradient (DDPG) is employed for path tracking of an autonomous model vehicle. The RL agent is trained in a 3D simulation environment. It interacts with the unknown environment and accumulates experiences to update the Deep Neural Network. The algorithm learns a policy (sequence of control actions) that solves the designed optimization objective. The agent is trained to calculate heading angles to follow a path with minimal cross-track error. In the final evaluation, to assess the dynamics of the trained policy, we analyzed how well the learned steering policy responds to larger and smaller steering values while keeping the cross-track error as small as possible. In conclusion, the agent could drive around the track for several loops without exceeding the maximum tolerated deviation and with reasonable orientation error.

Paper Nr: 36
Title:

Augmented Reality and Affective Computing on the Edge Makes Social Robots Better Companions for Older Adults

Authors:

Taif Anjum, Steven Lawrence and Amir Shabani

Abstract: The global aging population is increasing rapidly, along with a demand for care that is restricted by the decreasing workforce. The World Health Organization (WHO) suggests that the development of smart, physical, social, and age-friendly environments will improve the quality of life for older adults. Social Companion Robots (SCRs) that are integrated with different sensing technologies such as vision, voice, and haptics, and that can communicate with other smart devices in the environment, can allow for the development of advanced AI solutions towards an age-friendly, assistive smart space. Such robots require the ability to recognize and respond to human affect. This can be achieved through applications of affective computing such as emotion recognition through speech and vision. Performing such smart sensing using state-of-the-art technologies (i.e., Deep Learning) at the edge can be challenging for mobile robots due to limited computational power. We propose to address this challenge by off-loading the Deep Learning inference to edge hardware accelerators, which can minimize the network latency and privacy/cybersecurity concerns of alternative cloud-based options. Additionally, to deploy SCRs in care-home facilities we require a platform for remote supervision, assistance, communication, and technical support. We propose the use of Augmented Reality (AR) smart glasses to establish such a central platform that will allow a single caregiver to assist multiple older adults remotely.

Paper Nr: 9
Title:

Evaluation of an Artificial Potential Field Method in Collision-free Path Planning for a Robot Manipulator

Authors:

M. Elahres, A. Fonte and G. Poisson

Abstract: Path planning with obstacle avoidance has been a major challenge for robotic manipulators composed of multiple links, especially in the case of complex-shaped obstacles. This paper proposes an improved collision-free path planning algorithm based on the Artificial Potential Field (APF) method to obtain a collision-free path from an initial to a desired position and orientation. Firstly, the robot is modelled by the Denavit-Hartenberg (DH) parameter method. Secondly, the artificial attractive and repulsive force field equations are derived for both spherical and hollow cylindrical obstacles. Then, a poly-articulated cylindrical model of the robot is used for collision detection between all its links and the obstacle. Finally, a virtual torque is generated based on the forces affecting the robot links to produce a suitable motion to approach the final target without collision with the obstacle. The algorithm is evaluated by building a simulation platform using MATLAB R2020b and Robotic Toolbox. Various simulations on the UR5 robot show that the proposed algorithm can plan a collision-free path in the 6D operational space. The simulations also show that the algorithm has a low computational cost, so it can be used for real-time applications.
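
To make the attractive and repulsive force fields mentioned in this abstract concrete, here is a minimal sketch of the classic textbook APF formulation for a single point and one spherical obstacle. It is not the paper's improved algorithm: the gains `k_att` and `k_rep`, the influence distance `d0`, and both function names are assumptions for illustration, and the link-level collision model and virtual torque are omitted.

```python
import math


def attractive_force(q, goal, k_att=1.0):
    """Force pulling point q towards the goal: F = -k_att * (q - goal)."""
    return tuple(-k_att * (qi - gi) for qi, gi in zip(q, goal))


def repulsive_force(q, centre, radius, d0=0.5, k_rep=1.0):
    """Force pushing point q away from a spherical obstacle.

    d is the distance from q to the sphere surface. The force is zero once
    d >= d0 and grows without bound as d approaches 0.
    """
    diff = [qi - ci for qi, ci in zip(q, centre)]
    dist_centre = math.sqrt(sum(c * c for c in diff))
    d = dist_centre - radius
    if d <= 0:
        raise ValueError("point lies on or inside the obstacle")
    if d >= d0:
        return tuple(0.0 for _ in q)
    # Negative gradient of U_rep = 0.5 * k_rep * (1/d - 1/d0)^2.
    magnitude = k_rep * (1.0 / d - 1.0 / d0) / (d * d)
    # Unit vector pointing from the sphere centre to q.
    return tuple(magnitude * c / dist_centre for c in diff)
```

Summing the two forces at each step and moving a small distance along the result gives the basic gradient-descent planner that APF methods build on.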

Paper Nr: 24
Title:

Honeycomb Layout Inspired Motion of Robots for Topological Mapping

Authors:

Raul F. Santana and George P. Thé

Abstract: Swarm robotics is an important area of mobile robotics inspired by the collective behavior of animals in carrying out activities. Using this premise, the present work presents a swarm robotics algorithm for the topological mapping of environments. A comparison with a similar technique from the literature showed an improvement in performance on the proposed parameters. An assessment of closed-loop control was also carried out using a proposed evaluation metric, which showed a minor impact as the number of agents increased, and in smaller environments.

Paper Nr: 31
Title:

Partitioned Reconstruction of Contact Forces in Tactile Sensor Arrays for Robotic Sensing Systems

Authors:

María-Luisa Pinto-Salamanca and Wilson-Javier Pérez-Holguín

Abstract: The reconstruction of contact forces from the information captured by tactile sensors is essential for the performance of robotic manipulation systems. This work explores the implementation of a model-driven approach for the triaxial reconstruction of contact forces in tactile sensor arrays using a partition algorithm that estimates forces in smaller subarrays on a flat and rigid surface. The validation of the presented approach depends on a prior verification of compliance with the centroids of traction and compression for each analysed subarray. Considering the force estimation errors, the proposed approach behaves better than similar works designed for single contacts when reconstructing forces for multiple contact events and when using large sensor arrays. In addition, the application of the partitioning approach demonstrates a significant decrease in response time by reducing the number of operations needed for the force reconstruction calculation. Although the relative errors are still significant, the results obtained show a clear contribution to the reconstruction of contact events under processing time restrictions for sensor arrays ranging from small to large scale, which favors the development of electronic skin in robotic applications.