2017 Intelligent Sensing Summer School

A three-day event on advanced Intelligent Sensing topics, including wearable sensing, object tracking, sensing groups and crowds, sound processing, affective computing, machine learning, audio-visual sensing, and sensing people.

Participants are expected to attend all three days.

Accommodation near the event location: [QMUL campus], [short stay], [hotels].


Where: David Sizer Lecture Theatre, Bancroft building (number 31 in [map]).

Tentative programme

7 September
8:30 Registration
9:15 Welcome and opening
9:30-11:15 session chair: Maryam Abdollahyan
9:30 Initial investigations into characterizing DIY e-textile stretch sensors

An evaluation of three electronic textile (e-textile) stretch sensors: two variations of fabric knit with a stainless steel and polyester yarn, and a knit fabric coated with a conductive polymer. Although these materials are accessible to designers and engineers, the properties of each sensor have not previously been formally analysed. We evaluate the sensors' performance as they are stretched and released.
Sophie Skach (Wearable sensing)
Large scale mood and stress self-assessments on a smartwatch

Modern sensing technology is becoming increasingly ubiquitous. We will present an easy-to-use application for logging current emotional states on a widely used smartwatch, which also collects additional body-sensing data to build a basis for new algorithms, interventions and technology-supported therapy promoting emotional and mental well-being.
Katrin Hansel
A long short-term memory convolutional neural network for first-person vision activity recognition

We propose a motion representation that uses stacked temporal spectrograms and a long short-term memory (LSTM) network for the recognition of proprioceptive activities in first-person vision (FPV). Experimental results show that the proposed approach achieves state-of-the-art performance in the largest public dataset for FPV activity recognition.
Girmaw Abebe Tadesse
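The motion representation described above can be pictured, in very simplified form, as per-channel short-time spectrograms stacked into one tensor that a recurrent network would then consume. Below is a minimal numpy sketch of just that stacking step; the signals, window sizes and function names are illustrative, and the LSTM itself is omitted:

```python
import numpy as np

def temporal_spectrogram(signal, win=64, hop=32):
    """Magnitude spectrogram of a 1-D motion signal via a short-time FFT."""
    frames = [signal[i:i + win] * np.hanning(win)
              for i in range(0, len(signal) - win + 1, hop)]
    return np.abs(np.fft.rfft(np.asarray(frames), axis=1))  # (frames, bins)

def stacked_spectrograms(channels, win=64, hop=32):
    """Stack per-channel spectrograms into a (frames, bins, channels) tensor."""
    return np.stack([temporal_spectrogram(c, win, hop) for c in channels], axis=-1)

# Example: two synthetic motion channels (e.g. optical-flow magnitudes).
t = np.arange(1024) / 100.0
channels = [np.sin(2 * np.pi * 5 * t), np.sin(2 * np.pi * 12 * t)]
feat = stacked_spectrograms(channels)
print(feat.shape)  # (frames, frequency bins, channels)
```

Each time step of the resulting tensor could then be fed to a recurrent layer, so the network sees how the frequency content of the motion evolves over time.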
10:15 Active visual tracking in multi-agent scenarios

We propose an active visual tracker with collision avoidance for camera-equipped robots in dense multi-agent scenarios. The objective of each tracking agent (robot) is to maintain visual fixation on its moving target while updating its velocity to avoid other agents. We address the case in which robots lack collision-free paths that keep the target centred in the field of view and at a given size.

Yiming Wang (Object tracking)

Online multi-target tracking with strong and weak detections

An online multi-target tracker that exploits both high- and low-confidence target detections in a Probability Hypothesis Density Particle Filter framework. Results show that our method outperforms alternative online trackers on the Multiple Object Tracking 2016 and 2015 benchmark datasets in terms of tracking accuracy, false negatives and speed.

Ricardo Sanchez-Matilla
10:45 Generic to specific recognition models for membership analysis in group videos

We present an automatic analysis of group membership - i.e., recognising which group an individual belongs to - based on a specific recognition model. The model is implemented by a novel two-phase Support Vector Machine (SVM) trained using an optimised generic recognition model.

Wenxuan Mou (Sensing groups and crowds)

Crowd analysis using visual and non-visual sensors: a survey

A critical survey of crowd analysis techniques using visual and non-visual sensors. The survey identifies different approaches and relevant work on crowd analysis, covering the crowd phenomenon and its dynamics, from social and psychological aspects to computational perspectives.

Muhammad Irfan
11:15 Coffee break
11:30-12:45 session chair: Yiming Wang
11:30 A study on LSTM networks for polyphonic music sequence modelling

We investigate the predictive power of simple long short-term memory (LSTM) networks for polyphonic MIDI sequences, using an empirical approach. Such systems can then be used as a music language model which, combined with an acoustic model, can improve automatic music transcription (AMT) performance. Results are compared in terms of note prediction accuracy.
Adrien Ycart (Sound processing)

Efficient learning of harmonic priors for pitch detection in polyphonic music

We study whether the introduction of physically inspired Gaussian process (GP) priors into audio content analysis models improves the extraction of patterns required for automatic music transcription (AMT). We demonstrate that what matters for improving pitch detection is learning priors that fit the frequency content of the sound events to be detected.

Pablo Alvarado Duran
12:00 Effects of valence and arousal on working memory performance in virtual reality gaming

This work explores how working memory (WM) performance is affected when playing a Virtual Reality (VR) game, and the effects of valence and arousal in this context. Furthermore, a discussion on the application of machine learning to detect affective states based on the player's hand and head motion is presented.
Daniel Gabana Arellano (Affective computing)
12:15 Class rectification hard mining for imbalanced deep learning

Recognising detailed facial or clothing attributes in images of people is a challenging task for computer vision, especially when the training data are both in very large scale and extremely imbalanced among different attribute classes. To address this problem, we formulate a novel scheme for batch incremental hard sample mining of minority attribute classes from imbalanced large scale training data.
Qi Dong (Machine learning)

L1 graph based sparse model for label de-noising

We propose a novel robust graph-based approach for label de-noising by (i) label smoothing via a visual similarity graph, and (ii) explicitly modelling the label noise pattern. An efficient algorithm is formulated to optimise the proposed model, which contains multiple robust L1 terms in its objective function and is thus non-trivial to optimise.

Xiaobin Chang
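The label-smoothing part of such a model can be illustrated with a much simpler cousin: plain label propagation over a similarity graph, with a quadratic anchor to the noisy labels in place of the paper's robust L1 terms. The graph, labels and parameters below are a made-up toy example, not the authors' method:

```python
import numpy as np

def smooth_labels(W, Y0, alpha=0.8, iters=50):
    """Simple label propagation over a similarity graph.

    W  : (n, n) symmetric non-negative similarity matrix
    Y0 : (n, c) noisy one-hot labels
    """
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d))           # symmetric normalisation
    Y = Y0.copy()
    for _ in range(iters):
        Y = alpha * S @ Y + (1 - alpha) * Y0  # propagate, anchored to Y0
    return Y.argmax(axis=1)

# Toy example: two clusters, one mislabelled point in the first cluster.
W = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 0, 0],
              [1, 1, 0, 0, 0],
              [0, 0, 0, 0, 1],
              [0, 0, 0, 1, 0]], float)
Y0 = np.eye(2)[[0, 0, 1, 1, 1]]               # point 2 carries a noisy label
print(smooth_labels(W, Y0))                   # [0 0 0 1 1]
```

On this toy graph the mislabelled point is corrected by its neighbours, while the anchoring term keeps the clean labels in place.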
14:00 Audio-visual multi-speaker tracking with PHD filtering

Detection and tracking of multiple moving speakers in indoor environments is often required in applications such as automatic camera steering in video conferencing, individual speaker discrimination in multi-speaker environments, and surveillance and monitoring for security. In this lecture, we present some recent developments in multi-speaker tracking with audio-visual information under the Bayesian framework. In particular, we present adaptive particle filtering, PHD filtering and sparse-sampling-based PHD filtering algorithms, and provide demos to show the performance of these tracking algorithms.



Wenwu Wang

Invited speaker

Audio-visual sensing
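The sequential Monte Carlo machinery behind these trackers can be illustrated with the simplest member of the family: a single-target bootstrap particle filter (the PHD filter generalises this to an unknown, time-varying number of targets). The 1-D model and noise parameters below are made up purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter(observations, n_particles=2000, q=0.1, r=0.5):
    """Bootstrap particle filter for a 1-D random-walk target.

    State model:       x_t = x_{t-1} + N(0, q^2)
    Observation model: z_t = x_t + N(0, r^2)
    Returns the posterior-mean position estimate at each step.
    """
    particles = rng.normal(0.0, 1.0, n_particles)
    estimates = []
    for z in observations:
        particles += rng.normal(0.0, q, n_particles)       # predict
        w = np.exp(-0.5 * ((z - particles) / r) ** 2)      # weight by likelihood
        w /= w.sum()
        idx = rng.choice(n_particles, n_particles, p=w)    # resample
        particles = particles[idx]
        estimates.append(particles.mean())
    return np.array(estimates)

# Simulate a slowly drifting speaker position observed with noise.
true_pos = np.cumsum(rng.normal(0.0, 0.1, 100))
obs = true_pos + rng.normal(0.0, 0.5, 100)
est = particle_filter(obs)
print(np.abs(est - true_pos).mean())  # typically well below the raw noise level
```

In the audio-visual setting the likelihood would combine acoustic localisation cues with visual detections, which is where the complementarity of the two modalities pays off.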
15:00 Coffee break
15:30 Unsupervised cross-modal adaptation for audio-visual target identification with wearable cameras

The increasing availability of body-worn cameras is facilitating applications such as life-logging and activity detection. In particular, recognising objects or the identity of humans from egocentric data is an important capability. Model adaptation is fundamental for wearable devices because of limited training material and rapidly varying operational conditions and target appearances. This talk will discuss the specific issues of audio-visual target identification with wearable cameras and will present an approach to adapt models in an unsupervised, online way: each mono-modal model is adapted using the unsupervised labelling provided by the other modality, leveraging the complementarity of the information available in the audio and visual streams.

Alessio Brutti

Invited speaker

16:30 CIS demos


8 September
9:15 Introduction to the day
9:30 Spatial perception for mobile robots

Mobile robots need dedicated sensing and processing for localisation and mapping as well as scene understanding. Recent years have brought tremendous advances in vision sensors (e.g. RGB-D cameras) and processing power (e.g. GPUs) that have led us to design new algorithms that will empower the next generation of mobile robots. With the arrival of deep learning, we are now in a position to link its unprecedented performance in scene understanding with 3D mapping. In this talk, I will go through some recent algorithms and software we have developed as well as their application to mobile robots, including drones.



Stefan Leutenegger

Invited speaker

Mobile sensing
10:30 Coffee break
11:00 Mobile sensing for human behaviour monitoring and mobile health: challenges and applications

With the advent of powerful and inexpensive sensing technology, the ability to study human behaviour and activity at large scale and over long periods is becoming a firm reality. Wearables and mobile devices further allow continuous monitoring at unprecedented granularities. This reality generates new challenges but also opens the door to potentially innovative ways of understanding our daily lives. The range of devices and apps released in recent years for both medical and general fitness purposes has fuelled user interest in tracking activity with increased accuracy: this has not only revealed the potential of the domain but also highlighted its challenges and limitations. In this talk I will discuss our experience with large mobile sensor deployments and analytics in the areas of health and well-being.
I will discuss challenges and opportunities at the system, data-analytics and inference levels, as well as potential future directions.

Cecilia Mascolo

Invited speaker

14:00 Deep learning for unconstrained face analysis

Recently, methods based on deep learning have been shown to produce remarkable performance on a variety of difficult computer vision tasks, including recognition, detection and semantic segmentation, outperforming prior work by a large margin. A key feature of these approaches is that they integrate non-linear hierarchical feature extraction with the classification or regression task at hand, while also capitalising on the very large datasets that are now readily available. In this talk, I will review the most recent deep learning methods for a number of important face analysis tasks, including face detection, 2D and 3D facial landmark localisation (i.e. face alignment), facial part segmentation, 3D face reconstruction and face recognition, and show how these methods have significantly advanced the state of the art on the most challenging face datasets to date.


Georgios Tzimiropoulos

Invited speaker

Sensing people
15:00 Introduction to the challenge: Multi-modal analysis of body-sensor data

Hands-on activity: participants will be divided into groups and compete to solve an assigned challenge within a limited time span. Solutions will be presented to a judging panel that will vote for the best groups.

15:30 *Challenge starts*


9 September
13:00 *Challenge submission deadline*
14:00 Groups present the challenge results in front of a judging panel
16:00 Awards and closure


Logistics: Muhammad Irfan, Ricardo Sanchez-Matilla
Filming: Shahnawaz Ahmed
Sponsor: IET


Pictures

[2017 CIS Summer School photo gallery]

[more pictures]



Past events

Summer schools

2016 Summer School

2015 Summer School

2014 Summer School

2013 Summer School


Other events

2016/17 CIS PhD Welcome day

2015/16 CIS PhD Welcome day

CIS Spring Camp 2016

Sensing and graphs week

Commercialisation bootcamp

Sensing and IoT week

Software workshop