2019 Intelligent Sensing Summer School

A five-day event on advanced intelligent sensing and AI topics, including computer vision, machine listening, natural language processing and tactile sensing. Attendees will learn the most recent methodologies and applications for each of the themes. Posters showcasing projects at the Centre for Intelligent Sensing will also be presented during the school. In a hands-on activity (the CORSMAL Challenge), participants will be divided into groups that compete to solve an assigned task within a limited time span. Solutions will be presented in front of a judging panel that will select the best groups.

The target audience is researchers from industry, postdocs, and MSc and PhD students. QMUL PhD students will receive Skills Points for participating, presenting or helping.

Registration: now closed! Send an [email] for late registrations.

Registration for QMUL students and staff (free but mandatory): send an email to [email] stating (i) your full name, (ii) your supervisor or line manager and (iii) three keywords defining your research interests.

Where: FB 1.13a, Bancroft building (number 31 in [map]).

Accommodation near the event venue: [QMUL campus], [short stay], [hotels].

For any query: [email]

Programme - 2-6 September

Programme at a glance

Monday (2 September)
Afternoon: Welcome and opening; Vision Sensing & AI
- video enhancement
- image categorisation and segmentation
- 3D sensing

Tuesday (3 September)
Morning: Tactile Sensing & AI
- vision and tactile sensing
- force, pressure and touch
- in-hand manipulation
- self-calibration
Afternoon: Natural Language Processing & AI
- conversational agents
- news filtering
- gamifying crowdsourcing
- multimodal data

Wednesday (4 September)
Morning: The CORSMAL Challenge (working in groups)
Afternoon: Sound Sensing & AI
- sound event detection
- situational awareness
- speaker localization
- speech recognition

Thursday (5 September)
Morning and afternoon: The CORSMAL Challenge (working in groups)

Friday (6 September)
Morning: The CORSMAL Challenge (working in groups)
Afternoon: The CORSMAL Challenge (presentation of the results); Closing and awards
List of posters displayed during the summer school: [here]


Detailed programme

2 September
13:00-14:00 Registration
14:00-14:10 Welcome to CIS and Opening of the Summer School
Vision Sensing & AI
14:10-14:20 Opening by the session chair Qianni Zhang
14:20-15:00
Use of machine learning in video enhancement

The BBC is well-known for its stunning visual content and rich archives. We are researching how signal processing and machine learning techniques can add even more value to that content: from enriching both the pixels and the semantics of existing videos, to delivering them more efficiently to our audiences. This talk will cover recent trends and our contributions to this topic, including algorithms that reduce video encoder complexity, colourisation of visual data (still images for now), interpretability of deep neural networks, and super-resolution.

[slides] [video]
Marta Mrak
BBC
15:00-15:30 Coffee break
15:30-15:40
Unsupervised deep learning by neighbourhood discovery

Deep convolutional neural networks (CNNs) have demonstrated remarkable success in computer vision by learning strong visual feature representations in a supervised manner. In this talk, we introduce a generic unsupervised deep learning approach to training deep models without the need for any manual label supervision. Specifically, we progressively discover sample-anchored/centred neighbourhoods to reason about and learn the underlying class decision boundaries iteratively and cumulatively. Experiments on image classification show the performance advantages of the proposed method over state-of-the-art unsupervised learning models on six benchmarks, covering both coarse-grained and fine-grained object image categorisation.

[slides] [video]
Jiabo Huang
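
To make the neighbourhood-discovery idea concrete, below is a minimal Python sketch (not the talk's implementation; the feature source, k and the consistency threshold are illustrative assumptions) that selects sample-anchored neighbourhoods whose members are mutual nearest neighbours — the kind of coherent groups that can serve as pseudo-supervision:

    import numpy as np

    def anchored_neighbourhoods(features, k=5, consistency=0.6):
        """Per sample, find its k nearest neighbours and keep only
        neighbourhoods whose members are mutual nearest neighbours."""
        f = features / np.linalg.norm(features, axis=1, keepdims=True)
        sim = f @ f.T                        # cosine similarity
        np.fill_diagonal(sim, -np.inf)       # exclude self-matches
        knn = np.argsort(-sim, axis=1)[:, :k]

        neighbourhoods = {}
        for anchor, nbrs in enumerate(knn):
            # fraction of neighbours that also list the anchor in their kNN
            mutual = np.mean([anchor in knn[n] for n in nbrs])
            if mutual >= consistency:        # keep the coherent groups
                neighbourhoods[anchor] = nbrs
        return neighbourhoods

    # toy demo: two synthetic clusters standing in for CNN features
    rng = np.random.default_rng(0)
    feats = np.vstack([rng.normal(0, 0.1, (20, 64)),
                       rng.normal(1, 0.1, (20, 64))])
    print(len(anchored_neighbourhoods(feats)), "consistent neighbourhoods")

In the actual method, such neighbourhoods are rediscovered as the features improve, so the class decision boundaries are learned iteratively.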
15:40-15:50
3D sensing and analysis for high-quality point clouds

Multi-view 3D reconstruction techniques enable the digital reconstruction of 3D objects from the real world by fusing different viewpoints of the same object into a single 3D representation. This process is by no means trivial, and the acquisition of high-quality point cloud representations of dynamic scenes is still an open problem. Addressing the increasing demand for real-time reconstruction, this work proposes a low-cost 3D studio environment that enables photo-realistic reconstruction of human avatars while eliminating the background. The proposed approach combines several inpainting and filtering methods, which search local neighbourhoods and share mutual depth data between adjacent sensors, to create a single point cloud representation in real time by fusing 3D data from multiple RGB-D sensors.

[slides] [video]
Andrej Satnik
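
As a flavour of the 3D data fusion step, here is a minimal Python sketch that back-projects calibrated depth maps and merges them into a single point cloud; the intrinsics, poses and toy depth values are illustrative assumptions (not the studio's calibration), and the inpainting/filtering stages are omitted:

    import numpy as np

    def depth_to_points(depth, fx, fy, cx, cy):
        """Back-project a depth map (metres) into camera-frame 3D points."""
        h, w = depth.shape
        u, v = np.meshgrid(np.arange(w), np.arange(h))
        z = depth.ravel()
        x = (u.ravel() - cx) * z / fx
        y = (v.ravel() - cy) * z / fy
        pts = np.stack([x, y, z], axis=1)
        return pts[z > 0]                    # drop holes left by the sensor

    def fuse_clouds(depths, intrinsics, poses):
        """Move each camera's points into the world frame and merge them."""
        clouds = []
        for depth, K, T in zip(depths, intrinsics, poses):
            pts = depth_to_points(depth, *K)
            pts_h = np.hstack([pts, np.ones((len(pts), 1))])
            clouds.append((T @ pts_h.T).T[:, :3])  # T: 4x4 world-from-camera
        return np.vstack(clouds)

    # toy demo: two identical sensors at the world origin
    depth = np.full((4, 4), 1.5)             # 4x4 depth map, 1.5 m everywhere
    K = (525.0, 525.0, 2.0, 2.0)             # fx, fy, cx, cy
    print(fuse_clouds([depth] * 2, [K] * 2, [np.eye(4)] * 2).shape)  # (32, 3)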
15:50-16:00
AI for digital pathology image analysis

Histopathology imaging is a type of microscopy imaging commonly used for the micro-level clinical examination of a patient's pathology. Due to the extremely large size of histopathology images, especially whole slide images (WSIs), it is difficult for pathologists to make a quantitative assessment by inspecting the details of a WSI. Hence, a computer-aided system is necessary to provide an objective and consistent assessment of the WSI for personalised treatment decisions. In this presentation, deep learning frameworks for the automatic analysis of whole slide histopathology images are presented for the first time, aiming to address the challenging task of assessing and grading colorectal liver metastasis (CRLM). Quantitative evaluations of a patient's condition with CRLM are conducted by quantifying different tissue components in resected tumorous specimens. This study mimics the visual examination process of human experts by focusing on three levels of information, the tissue, cell and pixel levels, to achieve a step-by-step segmentation of histopathology images.

[video]
Zhaoyang Xu
16:00-16:30 Presentation of the CORSMAL Challenge
Ricardo Sanchez-Matilla
16:30-17:00 Self-presentations by the summer school participants and presentation of the list of [posters]


3 September
9:00-9:30 Registration
Tactile Sensing & AI
9:30-9:40 Welcome and opening by the session chair Lorenzo Jamone
9:40-10:10
Multimodal and cross-modal robotic perception with vision and tactile sensing

Future robots, as embodied agents, should make the best use of all available sensing modalities to interact with the environment. This talk will introduce research on combining vision and touch sensing, from the perspective of how touch complements vision to achieve better robot perception.

[slides] [video]
Shan Luo
University of Liverpool
10:10-11:00
Applications of tactile sensing in industry

Giving robots a sense of touch is vital to making them perform tasks that currently only humans can do. In this talk, Rich will explain current challenges and opportunities - what works and what doesn't - and give some insights into future research challenges.

[slides] [video]
Rich Walker
Shadow Robot
11:00-11:30 Coffee break with interactive demos
11:30-12:00
One sensor to measure two modalities – force information and tactile information

In this research, we present a novel design for an elastomer-based tactile and force sensing device that senses both modalities within a single elastomer. The proposed sensor has a soft and compliant design employing an opaque elastomer. An optical sensing method measures both modalities simultaneously, based on the deformation of the reflective elastomer structure and a flexure structure.

Wanlin Li

Learning robotic in-hand manipulation tasks from demonstration

In-hand manipulation requires handling two problems simultaneously: controlling the object trajectory and keeping the object in a stable grasp. Multiple fingers should move in coordination while keeping robust contact with the object. We combine learning from demonstration and the virtual spring framework to address both of these problems, and use tactile force sensing to adapt the grasp forces in reaction to the trajectory control forces.

[slides] [video]
Gokhan Solak
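
A minimal sketch of the virtual-spring idea follows, under illustrative gains and dimensions (this is not the talk's controller): a spring-damper pulls each fingertip along the demonstrated trajectory, while the grasp force grows with the trajectory forces so the object stays stably held:

    import numpy as np

    def virtual_spring_force(x, x_des, v, k=200.0, d=5.0):
        """Spring-damper force pulling a fingertip towards its target."""
        return k * (x_des - x) - d * v

    def adapted_grasp_force(f_base, f_trajectory, gain=0.5):
        """Increase the inward grasp force with the trajectory forces,
        so the object stays in a stable grasp while being moved."""
        return f_base + gain * np.linalg.norm(f_trajectory)

    # fingertip state (m, m/s) and a target taken from a demonstration
    x, v = np.array([0.02, 0.0, 0.10]), np.zeros(3)
    x_des = np.array([0.03, 0.0, 0.10])
    f_traj = virtual_spring_force(x, x_des, v)
    print("trajectory force:", f_traj)
    print("grasp force:", adapted_grasp_force(2.0, f_traj))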

Robot self-calibration from touch events

Robots often rely on a numerical representation of their body in order to interact with the environment; notably, such a model needs to be continuously calibrated, for example through some form of parameter estimation, to cope with changes over time. We will present a novel online strategy that allows a robot to self-calibrate its model by touching planar surfaces in its environment. We achieve this using adaptive parameter estimation (an Extended Kalman Filter) that incorporates the planar constraints obtained at each contact detection. Testing this method on simulated and real-world robotic setups, we show that it significantly improves the robot's accuracy for future reaching/grasping tasks.

[slides] [video]
Rodrigo Zenha
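
The sketch below illustrates the core update on a toy 2-link arm (the joint-offset model, link lengths and noise values are illustrative assumptions, not the talk's setup): each touch of a known plane yields a scalar residual — the fingertip's signed distance to the plane — which an Extended Kalman Filter uses to refine the calibration parameters:

    import numpy as np

    L1, L2 = 0.3, 0.25                       # nominal link lengths (m)

    def fingertip(theta, offsets):
        """Toy 2-link forward kinematics; offsets are joint-angle errors."""
        q = theta + offsets
        return np.array([L1 * np.cos(q[0]) + L2 * np.cos(q[0] + q[1]),
                         L1 * np.sin(q[0]) + L2 * np.sin(q[0] + q[1])])

    def ekf_touch_update(offsets, P, theta, n, d, R=1e-4, eps=1e-6):
        """One EKF step: the measured residual is the fingertip's signed
        distance to the touched plane n . p = d, which should be zero."""
        h0 = n @ fingertip(theta, offsets)
        z = h0 - d                                     # innovation (scalar)
        # numerical Jacobian of the residual w.r.t. the offsets
        H = np.array([(n @ fingertip(theta, offsets + eps * e) - h0) / eps
                      for e in np.eye(2)])
        S = H @ P @ H + R                              # innovation variance
        K = P @ H / S                                  # Kalman gain
        return offsets - K * z, P - np.outer(K, H) @ P

    offsets, P = np.zeros(2), np.eye(2) * 0.01         # estimate and covariance
    # a simulated touch: joint angles at contact with the plane y = 0.2
    offsets, P = ekf_touch_update(offsets, P,
                                  theta=np.array([0.7, -0.4]),
                                  n=np.array([0.0, 1.0]), d=0.2)
    print("updated joint-offset estimate:", offsets)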

Smart arse: textile pressure sensing in trousers

Textiles are a material we are very familiar with and that serves as an interface to the world. In the form of clothes, textiles follow our movements and can therefore be explored as an unintrusive modality for body-centric computing. Here, we introduce sensing trousers with embedded fabric pressure sensors to classify sitting postures and, beyond that, social behaviours.

[slides] [video]
Sophie Skach
12:00-12:30
Robots with sense of touch

Robots operating in dynamic and unstructured environments must exhibit advanced forms of interaction with objects and humans. A 'sense of touch' can play a fundamental role in enhancing the perceptual, cognitive and operative capabilities of robots, specifically when they physically interact with objects and humans in the environment. Many solutions for designing, engineering and manufacturing tactile sensors have been presented, because the availability of appropriate sensing technologies is the first and necessary step. However, the effective use of the sense of touch in robots also depends on understanding the tactile perception mechanisms through which the robot builds an appropriate world model. This lecture will present the technological and research challenges in providing robots with a sense of touch.

[slides] [video]
Perla Maiolino
University of Oxford
13:00-13:50 Registration
Natural Language Processing & AI
13:50-14:00 Welcome and opening by the session chair Massimo Poesio
14:00-14:45
Conversational agents in games

Games present both a challenge and an opportunity for conversation modelling. Ed will speak about the combination of AI techniques and design strategies that allows designers to build AI characters that respond to a wide range of player input (including natural language, gestures, and in-game actions) while staying on track to deliver the desired story and play experience.

[video]
Edward Minnett
Spirit AI
14:45-15:30
Applying AI in news filtering

At Signal, we use AI to monitor and analyse news and media, focusing on reputation management and market intelligence analysis. Quantity, speed, customisation and user interaction are fundamental requirements. In this talk, we will explain how AI techniques can be applied to vast amounts of text data. We will look into particular examples of entity processing, which require deriving knowledge about arbitrary named entities from the data. We will introduce state-of-the-art technology from research, highlight how the appropriate use of data can help achieve robust performance to a commercial standard, and discuss some core challenges in applying AI at scale in an industrial setting.

[slides] [video]
Raymond Ng
Signal AI
15:30-16:00 Coffee break
16:00-16:40
Hands-On session: deep learning for Natural Language Processing (NLP)

Deep learning plays an important role in state-of-the-art Natural Language Processing (NLP) applications and is now used in the most recent systems developed for NLP tasks. In this session, we will explore a neural machine translation system that uses the sequence-to-sequence (seq2seq) model together with the attention mechanism (Sutskever et al., 2014; Cho et al., 2014). The model we explore during this session is similar to Google's Neural Machine Translation system (GNMT) (Wu et al., 2016) and has the potential to achieve performance competitive with GNMT by using larger and deeper networks.

[slides] [video]
Juntao Yu
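
For a flavour of what the session covers, here is a minimal PyTorch sketch of a seq2seq model with dot-product attention; the toy vocabulary sizes and single-layer GRUs are illustrative assumptions, and this is not the session's actual notebook:

    import torch
    import torch.nn as nn

    class Seq2SeqAttn(nn.Module):
        def __init__(self, src_vocab, tgt_vocab, dim=64):
            super().__init__()
            self.src_emb = nn.Embedding(src_vocab, dim)
            self.tgt_emb = nn.Embedding(tgt_vocab, dim)
            self.encoder = nn.GRU(dim, dim, batch_first=True)
            self.decoder = nn.GRU(dim, dim, batch_first=True)
            self.out = nn.Linear(2 * dim, tgt_vocab)

        def forward(self, src, tgt):
            enc, h = self.encoder(self.src_emb(src))     # enc: (B, S, D)
            dec, _ = self.decoder(self.tgt_emb(tgt), h)  # dec: (B, T, D)
            # dot-product attention: each decoder step attends over
            # all encoder states
            attn = torch.softmax(dec @ enc.transpose(1, 2), dim=-1)
            context = attn @ enc                         # (B, T, D)
            return self.out(torch.cat([dec, context], dim=-1))

    model = Seq2SeqAttn(src_vocab=100, tgt_vocab=90)
    src = torch.randint(0, 100, (2, 7))   # batch of 2 toy source sentences
    tgt = torch.randint(0, 90, (2, 5))    # teacher-forced target prefixes
    print(model(src, tgt).shape)          # torch.Size([2, 5, 90])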
16:40-17:05
Gamifying crowdsourcing

Crowdsourcing was historically applied to simple labelling tasks, like picking out objects in pictures. However, the tasks we use supervised learning for, and hence require human computation for, are becoming increasingly complex. Harnessing the power of the crowd now requires the design and integration of bespoke interfaces, automated pipelines, training, task assignment, aggregation and various other measures to turn non-experts into a workforce that can match expert workers. This talk will look at how we used gamification in crowdsourcing to address these problems when annotating candidate mentions for coreference resolution.

[video]
Chris Madge
17:05-17:30
Language and Vision tasks: models and what they learn

In the literature, several tasks have been proposed to combine linguistic and visual information, and different models have been developed to solve them. These models implement the bottom-up processing of the "Hub and Spoke" architecture proposed in cognitive science to represent how the brain processes and combines multi-sensory inputs; in particular, the Hub is implemented as a neural network encoder. This talk will provide an overview of these tasks and models, and will show that the linguistic skills of the models differ dramatically despite their comparable task success rates. The later part of the talk will focus on how to systematically investigate the effect of various vision-and-language tasks on the encoder.

[slides] [video]
Ravi Shekhar


4 September
The CORSMAL Challenge
9:00-9:30 Start of the CORSMAL Challenge
10:00-11:30 Working in groups
12:00-12:50 Registration
Sound Sensing & AI
12:50-13:00 Welcome and opening by the session chair Lin Wang
13:00-13:45
The challenges and benefits of sound sensing

In a context where advances in AI have successfully turned the intelligent sensing of images, music, speech, health and people’s identity into practical and commercial realities, sound sensing in a wider sense than speech and music has only recently started to break through as a missing piece of the perceptual AI puzzle. This talk will illustrate the various aspects of the journey taken by Audio Analytic to develop pioneering sound event detection technology from scratch and turn it into commercial products yielding benefits to millions of customers. Topics will cover the research challenges, data challenges and privacy questions which underlie the intelligent sensing of acoustic scenes and events.

[slides] [video]
Sacha Krstulovic
Audio Analytic
13:45-14:30
Localize, track, and interact: machine listening for AI

Audio signals encapsulate vital information required for autonomous agents to make sense of their surrounding environment. This talk focuses on state-of-the-art approaches in machine listening that equip autonomous agents with situational awareness. The first part of the talk will provide an overview of existing approaches for localization and tracking of sound sources. The second part will focus on practical insights gained from the recent LOCATA Challenge. In the third part, we will explore current and future directions, such as the self-localization of moving microphone arrays using acoustic SLAM, and the fusion of data from acoustic sensor networks in smart environments.
Christine Evers
Imperial College London
14:30-14:45
Speaker localization and tracking using multi-modal signals

The talk focuses on exploiting the complementarity of the audio and video modalities to accurately estimate the trajectories of targets under challenging scenarios, such as partial occlusions and environmental noise. We propose the AV3T algorithm, which estimates the 3D mouth position from face detections and models the likelihood in the camera's spherical coordinates, based on the uncertainties derived from the image-to-3D projection. Moreover, AV3T uses video to indicate the most likely speaker-height plane for the acoustic map computation. During misdetections, it switches to a generative model based on colour spatiograms. We will also present a newly collected audio-visual dataset with annotations.

[slides] [video]
Xinyuan Qian
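
One ingredient of AV3T can be sketched compactly: back-projecting a face detection onto an assumed speaker-height plane to obtain a 3D position. In the minimal Python example below, the intrinsics, pixel coordinates and plane height are illustrative assumptions:

    import numpy as np

    def mouth_3d_from_detection(u, v, K, plane_y):
        """Intersect the viewing ray of pixel (u, v) with the horizontal
        plane y = plane_y, both expressed in the camera frame."""
        ray = np.linalg.inv(K) @ np.array([u, v, 1.0])  # ray direction
        if abs(ray[1]) < 1e-9:
            raise ValueError("ray parallel to the speaker-height plane")
        return ray * (plane_y / ray[1])   # scale so that y == plane_y

    K = np.array([[600.0, 0.0, 320.0],    # toy pinhole intrinsics
                  [0.0, 600.0, 240.0],
                  [0.0, 0.0, 1.0]])
    # face detection centred at pixel (400, 300); mouth plane 0.4 m below
    # the camera (image y points down, so plane_y is positive)
    print(mouth_3d_from_detection(400.0, 300.0, K, plane_y=0.4))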
14:45-15:15 Coffee break
15:15-16:00
Distant microphone speech recognition in everyday environments: from CHiME-5 to CHiME-6

The CHiME challenge series aims to advance robust automatic speech recognition technology by promoting research at the interface of speech and language processing, signal processing and machine learning. This talk will present the outcomes of the 5th CHiME Challenge, which considered the task of distant multi-microphone conversational speech recognition in domestic environments. The talk will give an overview of the CHiME-5 dataset, a fully transcribed audio-visual dataset capturing 50 hours of audio from 20 separate dinner parties held in real homes, each recorded with 6 video channels and 32 audio channels. It will discuss the design of the lightweight recording setup that allowed highly natural data to be recorded, and present an analysis of the data, highlighting the major sources of difficulty it presents for recognition systems. The talk will summarise the outcomes of the challenge itself and the recent advances that now represent the state of the art, and will conclude by discussing future directions and introducing the CHiME-6 challenge, due to launch later this year.

[video]
Jon Barker
University of Sheffield
16:00-16:35
Embedded sound processing with Bela

This talk will present Bela (http://bela.io), an embedded computing platform for creating ultra-low-latency interactive audio systems. Bela is based on the BeagleBone Black, a 1GHz ARM single-board computer. It combines the performance of the Xenomai real-time Linux environment, flexible connectivity to a wide variety of sensors and actuators, and an easy-to-use browser-based development environment. Bela is a fully open-source platform for makers, musicians and researchers to create highly responsive interactive systems.

[video]
Andrew McPherson
16:35-16:50
Explainable Machine Learning and its applications to Machine Listening

Explainable Machine Learning (EML) algorithms aim to make Deep Neural Networks (DNNs) transparent through post-hoc analysis. This talk will introduce two key categories of EML algorithms: those that explain a model and those that explain individual model predictions. We will cover recent advances in understanding DNNs, highlight some of the key research challenges that EML methods face, and conclude with a demonstration of our recent work on explaining machine listening models for audio classification.

[slides] [video]
Saumitra Mishra
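
As a small, concrete instance of post-hoc explanation, the sketch below computes input-gradient saliency for a toy spectrogram classifier (the network and the random input are placeholders, not the talk's models), showing which time-frequency bins most influence a prediction:

    import torch
    import torch.nn as nn

    model = nn.Sequential(               # toy tagger for 1x64x100 spectrograms
        nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10))

    spec = torch.randn(1, 1, 64, 100, requires_grad=True)  # mel stand-in
    logits = model(spec)
    pred = logits.argmax(dim=1).item()

    # gradient of the predicted class score w.r.t. the input: which
    # time-frequency bins most influence this prediction?
    logits[0, pred].backward()
    saliency = spec.grad.abs().squeeze()  # (64, 100) importance map
    print("most influential (freq, time) bin:",
          divmod(saliency.argmax().item(), 100))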
16:50-17:30
Panel discussion: Christine Evers, Sacha Krstulovic and Jon Barker. Moderator: Lin Wang

[video]


5 September
The CORSMAL Challenge
9:00-17:30 Working in groups


6 September
The CORSMAL Challenge
9:00-13:00 Working in groups
13:00 Submission of the CORSMAL Challenge results
14:00-15:30 Presentation of the results in front of a panel
15:30 Closing and awards


Posters
Adapting the quality of experience framework for audio archive evaluation [pdf]
A. Ragano, E. Benetos, A. Hines

Analysing the predictions of a CNN-based replay spoofing detection system [pdf]
B. Chettri, S. Mishra, B. L. Sturm, E. Benetos

An elastomer-based flexible optical force and tactile sensor [pdf]
W. Li, J. Konstantinova, Y. Noh, Z. Ma, A. Alomainy, K. Althoefer

Background light estimation for depth-dependent underwater image restoration [pdf]
C.Y. Li, A. Cavallaro

Distributed one-class learning [pdf]
A.S. Shamsabadi, H. Haddadi, A. Cavallaro

Effect of textile properties on a low-profile wearable loop antenna for healthcare applications [pdf]
I.I. Labiano, A. Alomainy, M.M. Bait-Suwailam

End-to-end probabilistic inference for nonstationary audio analysis [pdf]
W. Wilkinson, M.R. Andersen, J.D. Reiss, D. Stowell, A. Solin

Knowledge distillation by on-the-fly native ensemble [pdf]
X. Lan, X. Zhu, S. Gong

Learning action representations for self-supervised visual exploration [pdf]
C. Oh, A. Cavallaro

MORB: A multi-scale binary descriptor [pdf]
A. Xompero, O. Lanz, A. Cavallaro

Multiview 3D sensing and analysis for high quality point cloud capturing and model generation [pdf]
A. Satnik, E. Izquierdo

Real-time quality assessment of videos from body-worn cameras [pdf]
Y.Y. Chang, R. Mazzon, A. Cavallaro

Region based user-generated human body scan registration [pdf]
Z. Xu, Q. Zhang

Scene privacy protection [pdf]
C.Y. Li, A.S. Shamsabadi, R. Sanchez-Matilla, R. Mazzon, A. Cavallaro

Self-referenced deep learning [pdf]
X. Lan, X. Zhu, S. Gong

Sound-based transportation mode recognition with smartphones [pdf]
L. Wang, D. Roggen

Sparse Gaussian process audio source separation using spectrum priors in the time-domain [pdf]
P.A. Alvarado, M.A. Alvarez, D. Stowell

SubSpectralNet – using sub-spectrogram based Convolutional Neural Networks for acoustic scene classification [pdf]
S.S.R. Phaye, E. Benetos, Y. Wang

Tracking a moving sound source from a multi-rotor drone [pdf]
L. Wang, R. Sanchez-Matilla, A. Cavallaro

Unifying probabilistic models for time-frequency analysis [pdf]
W. Wilkinson, M.R. Andersen, J.D. Reiss, D. Stowell, A. Solin

Visual localization in the presence of appearance changes using the partial order kernel [pdf]
M. Abdollahyan, S. Cascianelli, E. Bellocchio, G. Costante, T.A. Ciarfuglia, F. Bianconi, F. Smeraldi, M.L. Fravolini



Logistics and Filming
Muhammad Farrukh Shahid

Ashish Alex

Vandana Rajan

Xinyuan Qian

Chau Yi Li

Ali Shahin Shamsabadi

Sponsors
IET
CORSMAL
EPSRC



Past events

Summer schools

2018 Summer School

2017 Summer School

2016 Summer School

2015 Summer School

2014 Summer School

2013 Summer School


Other events

2018/19 CIS PhD Welcome day

2017/18 CIS PhD Welcome day

2016/17 CIS PhD Welcome day

2015/16 CIS PhD Welcome day

CIS Spring Camp 2016

Sensing and graphs week

Commercialisation bootcamp

Sensing and IoT week

Software workshop