Ronald Clark
Ronnie Clark
Research Fellow
Department of Computing
Imperial College London
Email: ronald.clark{at}imperial.ac.uk; ron.clark{at}live.com; Twitter, Google Scholar

About me

I am a postdoctoral fellow at Imperial College London where I hold a Dyson Fellowship. I obtained my PhD from the University of Oxford Department of Computer Science.

My research centers on 3D machine perception, which is needed to enable mobile devices to model, explore and understand their surroundings. I am particularly interested in ways in which deep learning can be used alongside traditional mathematical and geometrical models to unlock a new level of performance in spatial machine perception. My current work focusses on how recurrent, convolutional neural networks can be used to create consistent, dense, semantically annotated reconstructions of the world. I also have a keen interest in computer graphics and animation.

In the past I have worked on machine learning for natural user interaction [8,9] and optimal systems design [7]. I received my MSc degree in Information Engineering from the University of the Witwatersrand, South Africa, at the Centre for Systems and Control in 2014.

New Updates

  • Dec, 2018: Invited Talk at Microsoft Research Cambridge.
  • Dec, 2018: Co-organizing the CVPR'19 Workshop on Deep Learning for Visual Semantic Navigation.
  • Aug, 2018: Invited Talk at the London Machine Learning Meetup.
  • July, 2018: New papers accepted at 3DV, ECCV, BMVC and IEEE TMC.
  • June, 2018: CVPR Best Paper Honourable Mention award!
  • May, 2018: Joining the BMVC technical programme committee.
  • Feb, 2018: One paper accepted at CVPR 2018.
  • Feb, 2018: Excited to be supporting OxFEST at their innovation panel.
  • Jan, 2018: Organizing the 1st Workshop on Deep Learning for Visual SLAM at CVPR'18.


Fusion++: Volumetric Object-Level SLAM
John McCormac*, Ronald Clark*, Michael Bloesch, Stefan Leutenegger, Andrew J. Davison
We propose an online object-level SLAM system which builds a persistent and accurate 3D graph map of arbitrary reconstructed objects. Reconstructed objects are stored in an optimisable 6DoF pose graph which is our only persistent map representation. Each object also carries semantic information which is refined over time and an existence probability to account for spurious instance predictions.
International Conference on 3D Vision (3DV'18) [pdf, video] *equal contribution
Learning to Solve Non-Linear Least-Squares for Monocular Stereo
Ronald Clark, Michael Bloesch, Jan Czarnowski, Stefan Leutenegger, Andrew J. Davison
Sum-of-squares objective functions are very popular in computer vision algorithms. However, these objective functions are not always easy to optimize. We propose LS-Net, a neural nonlinear least squares optimization algorithm which learns to effectively optimize these cost functions even in the presence of adversities. The proposed solver requires no hand-crafted regularizers or priors as these are implicitly learned from the data.
European Conference on Computer Vision (ECCV'18) [pdf, bibtex]
InteriorNet: Mega-scale Multi-sensor Photo-realistic Indoor Scenes Dataset
Wenbin Li, Sajad Saeedi, John McCormac, Ronald Clark, Dimos Tzoumanikas, Qing Ye, Yuzhong Huang, Rui Tang, Stefan Leutenegger
Synthetic imagery bears a vast potential due to scalability in terms of amounts of data obtainable without tedious manual ground truth annotations or measurements. We present a dataset with the aim of providing a higher degree of photo-realism, larger scale, more variability as well as serving a wider range of purposes compared to existing datasets.
British Machine Vision Conference (BMVC'18) [pdf,project]
CodeSLAM - Learning a Compact, Optimisable Representation for Dense Visual SLAM
Michael Bloesch, Jan Czarnowski, Ronald Clark, Stefan Leutenegger, Andrew J. Davison
The representation of geometry in real-time 3D perception systems continues to be a critical research issue. We present a new compact but dense representation of scene geometry which is conditioned on the intensity data from a single image and generated from a code consisting of a small number of parameters. We explain how to learn our code representation, and demonstrate its advantageous properties in monocular SLAM.
IEEE International Conference on Computer Vision and Pattern Recognition (CVPR'18) [pdf,video]
Meta Learning for Instance-Level Data Association
Ronald Clark, John McCormac, Stefan Leutenegger, Andrew J. Davison
In this work we introduce a meta-learning model for segmenting objects in a class-agnostic manner, which allows us to match and track novel objects of interest across multiple frames of video without specific instance models being known a priori.
Neural Information Processing Systems (NIPS'17) Workshop on MetaLearning [pdf,bibtex]
Efficient Indoor Positioning with Visual Experiences via Lifelong Learning
Hongkai Wen, Ronald Clark, Sen Wang, Xiaoxuan Lu, Bowen Du, Wen Hu, Niki Trigoni
Positioning with visual sensors in indoor environments has many advantages. This paper proposes a novel lifelong learning approach to enable efficient and real-time visual positioning. We explore the fact that when following a previous visual experience multiple times, one could gradually discover clues on how to traverse it with much less effort.
IEEE Transactions on Mobile Computing (TMC) [pdf,bibtex]
3D Object Reconstruction from a Single Depth View with Adversarial Learning
Bo Yang, Hongkai Wen, Sen Wang, Ronald Clark, Andrew Markham, Niki Trigoni
In this paper, we propose a novel 3D-RecGAN approach, which reconstructs the complete 3D structure of a given object from a single arbitrary depth view using generative adversarial networks. The proposed 3D-RecGAN only takes the voxel grid representation of a depth view of the object as input, and is able to generate the complete 3D occupancy grid by filling in the occluded/missing regions.
IEEE International Conference on Computer Vision Workshops (ICCV-W) [pdf,bibtex]
A Deep Spatio-Temporal Model for 6-DoF Video-Clip Relocalization
Ronald Clark, Sen Wang, Hongkai Wen, Andrew Markham, Niki Trigoni
Convolutional neural networks (CNNs) and regression forests have recently shown great promise in performing 6-DoF localization of monocular images. However, in most cases image sequences, rather than only single images, are readily available. In this paper we propose a recurrent model for performing 6-DoF localization of video clips. We find that, even by considering only short sequences (20 frames), the pose estimates are smoothed and the localization error can be drastically reduced.
IEEE International Conference on Computer Vision and Pattern Recognition (CVPR'17) [pdf,bibtex]
VINet: Visual-Inertial Odometry as a Sequence-to-Sequence Learning Problem
Ronald Clark, Sen Wang, Hongkai Wen, Andrew Markham, Niki Trigoni
In this paper we present an on-manifold sequence-to-sequence learning approach to motion estimation using visual and inertial sensors. It is, to the best of our knowledge, the first end-to-end trainable method for visual-inertial odometry which performs fusion of the data at an intermediate feature-representation level.
The 31st AAAI Conference on Artificial Intelligence (AAAI) [Project,pdf,bibtex]
End-to-End, Sequence-to-Sequence Probabilistic Visual Odometry through Deep Neural Networks
Sen Wang, Ronald Clark, Hongkai Wen, Niki Trigoni
In this paper, we investigate whether deep neural networks can be effective and beneficial to the VO problem. An end-to-end, sequence-to-sequence probabilistic visual odometry (ESP-VO) framework is proposed for monocular VO based on deep recurrent convolutional neural networks. Uncertainty is also derived along with the VO estimation without introducing much extra computation.
International Journal of Robotics Research (IJRR). [pdf]
DeepVO: Towards End-to-End Visual Odometry with Recurrent Convolutional Networks
Sen Wang, Ronald Clark, Hongkai Wen, Niki Trigoni
IEEE International Conference on Robotics and Automation (ICRA), 2017. [pdf]
Large Scale Indoor Keyframe-based Localization using Geomagnetic Field and Motion Pattern
Sen Wang, Hongkai Wen, Ronald Clark, Andrew Markham, Niki Trigoni
This paper studies the indoor localisation problem using low-cost and pervasive sensors. We present a novel keyframe-based Pose Graph Simultaneous Localisation and Mapping (SLAM) method, which correlates the ambient geomagnetic field with motion patterns and employs low-cost sensors commonly found in mobile devices.
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2016. [pdf,bibtex]
Increasing the Efficiency of 6-DoF Visual Localization Using Multi-Modal Sensory Data
Ronald Clark, Sen Wang, Hongkai Wen, Andrew Markham, Niki Trigoni
Localization is a key requirement for mobile robot autonomy and human-robot interaction. We propose to inter-weave a visual map with rich sensory data. This multi-modal approach achieves two key goals simultaneously. First, it enables us to harness additional sensory data to localise against a map covering a vast area in real-time; and secondly, it also allows us to roughly localise devices which are not equipped with a camera.
IEEE-RAS International Conference on Humanoid Robots (Humanoids), 2016. [pdf,bibtex]
Pushing the Limits of Indoor 3D Modelling using Mobile Sensing
Ronald Clark, Sen Wang, Hongkai Wen, Andrew Markham, Niki Trigoni
submitted to IEEE Trans. on Mobile Computing (TMC), 2016. [pdf]
Robust Vision-based Indoor Localization
Ronald Clark, Andrew Markham, Niki Trigoni
In this work we highlight some of the issues that arise when using vision-based methods for indoor localization. We then propose means of addressing these issues and implement a proof-of-concept visual inertial odometry system for a mobile device.
IEEE International Conference on Information Processing in Sensor Networks (IPSN), 2015. [pdf,bibtex]
Optimization of a Hybrid Energy System through Fast Convex Programming
Ronald Clark, W.A. Cronje, Anton van Wyk
In this paper, a methodology for the optimal economic design of a hybrid energy system is presented. The novelty of the proposed methodology lies in the use of a convex optimization approach, which allows a solution to be found very efficiently compared to existing techniques. The results agree with those obtained using the closed-source HOMER software package, which is based on slow discrete combinatorial optimization.
IEEE International Conference on Intelligent Systems Modelling and Simulation (ISMS), 2014. (Top 15% of accepted papers) [pdf,bibtex]
System for the Recognition of Handwritten Mathematical Expressions
Ronald Clark, Quik Kung, Anton van Wyk
Most communication involving mathematical expressions is carried out over computer systems. Despite this fact, the entry of mathematical expressions for computer processing remains highly time-consuming and unintuitive. In this report the design and implementation of an automatic mathematical expression recognition system is presented with the goal of streamlining the human-computer-interface for entering mathematical expressions.
IEEE International Conference on Computer as a Tool (EUROCON), 2013. (IEEE Best Student Paper Contest Nominee) [pdf,bibtex]
Recognising Handwritten Mathematical expressions using an Ensemble of SVM Classifiers
Ronald Clark, Quik Kung, Anton van Wyk
Support Vector Machines (SVMs) have proven to be highly accurate in classifying handwritten mathematical symbols, especially when a diverse range of features is used. This paper investigates the classification of handwritten mathematical symbols using an SVM method and an ensemble of three different feature sets in order to minimise the number of training samples required and achieve accurate classification rates.
13th Symposium of the Pattern Recognition Association of South Africa (PRASA/IAPR), 2012. [pdf,bibtex]

Workshops organized

Deep Learning for Visual SLAM @ CVPR
Co-organizers: Sudeep Pillai, Alex Kendall, Andrew Davison
This workshop will focus on the intersection of deep learning and real-time visual SLAM. The workshop will explore ways in which data-driven models can be harnessed for creating Visual SLAM algorithms which are less fragile and much more robust than existing state-of-the-art approaches. The workshop will also investigate ways in which we can use deep-learned models alongside traditional approaches in a unified and synergistic fashion.
Workshop on Datasets and Benchmarking for Robotics @ ICRA
Co-organizers: Sajad Saeedi
More info soon...
Deep Learning for Visual Semantic Navigation @ CVPR
Co-organizers: Alexander Toshev, Anelia Angelova, Niko Sunderhauf
The problem of navigation, the ability of an autonomous agent to find its way in a large, visually complex environment, is a fundamental problem in computer vision and robotics. Over the last couple of years we have seen a body of work which tries to marry perception and planning for navigation problems, capitalizing on recent developments in deep and reinforcement learning. This work has happened mostly in machine learning conferences, and as of now has bypassed major vision venues. As such, robot navigation is an area where advances not only in representation and SLAM come together, but also in novel learning models and techniques, visual recognition, language and vision, etc. In this workshop we propose to give a forum to such ideas, many of which are quite promising but can benefit from cross-disciplinary discussions.

Current and past students

Arsalan Zafar (MSc)
Date: 2017-2018
Project Title: Machine Learning for Automatic Camera Calibration

In this project we propose a novel learning-based approach for camera calibration that inherits the flexibility, robustness and ease of use of single-image approaches, but is able to benefit from the accuracy of multiple images. Amongst multi-image approaches, our method is unique in that it eliminates the need for establishing correspondences and therefore does not require overlapping viewpoints.
Qiulin Wang (MSc)
Date: 2018-2019
Project Title: Using Generative Models for Volumetric Object-level SLAM

In this project we are interested in using deep generative models (e.g. VAEs, GANs) to learn a flexible latent space of 3D objects to be used in the reconstruction process. This will allow us to capture priors about the shape of natural objects while still remaining flexible enough to reconstruct novel objects which may not have been seen a priori.
Arthur Wilcke (MSc)
Date: 2018-2019
Project Title: Investigating Uncertainty in Deep Learning for Computer Vision Tasks

Uncertainty, or knowing how much a model doesn't know, is an important component of machine learning systems for reasons ranging from safety to sensor fusion. In this project we investigate the existing approaches to modelling uncertainty in deep learning and evaluate them on a number of computer vision tasks.
Benjamin Obikoya (MSc)
Date: 2018-2019
Project Title: Photorealistic rendering of 3D Volumetric Reconstructions

Rendering realistic novel views from a TSDF representation of a scene is challenging, as the limited scalability of TSDFs restricts the photometric detail that can be stored in the volume. In this project we are interested in rendering photorealistic images from TSDFs and associated RGB images.
Jinsung Ha (MSc)
Date: 2018-2019
Project Title: Self-supervised learning for depth image completion and enhancement

A significant issue encountered when using depth data in perception tasks is that the depth images are often incomplete, and in many cases more than 50% of the pixels are missing due to occlusions, specular surfaces, quantization errors and noise. The goal of this project is to use deep learning to enhance the depth data so that it can be used to its full benefit in robotic perception tasks.


Visual-Inertial Odometry, Mapping and Relocalization through Learning
Ronald Clark
PhD Thesis (DPhil), University of Oxford (June 2017) [pdf,bibtex]


Conditional DCGAN in Keras
Github, 2015
Implementation of Conditional Generative Adversarial Nets
Based on: https://arxiv.org/abs/1411.1784
Github, 2017
Implementation of single image 3D reconstruction using Generative Adversarial Nets.
Based on: https://arxiv.org/abs/1708.07969
MATLAB Support Vector Regression code
Ronald Clark
MATLAB File Exchange, 2013 [link]. A modified version of the code has been

Journal and Conference Review Service

Pattern Recognition Letters


Computer Animation, University of Oxford (2014, 2015) [course page]
- transformation chains, scene description languages, time-varying transformations, interpolation functions, collision detection, physical response models

ELEN4017 Network Fundamentals, University of the Witwatersrand [course page]
- basic routing protocols, dynamic name servers, address resolution, network stack, World Wide Web
