Search results

  • CONFERENCE PAPER
    Ceran ET, Gunduz D, Gyorgy A, 2018, Average age of information with hybrid ARQ under a resource constraint, Wireless Communications and Networking Conference (WCNC), Publisher: IEEE, ISSN: 1525-3511

    Scheduling the transmission of status updates over an error-prone communication channel is studied in order to minimize the long-term average age of information (AoI) at the destination under a constraint on the average number of transmissions at the source node. After each transmission, the source receives instantaneous ACK/NACK feedback and decides on the next update without prior knowledge of the success of future transmissions. First, the optimal scheduling policy is studied under different feedback mechanisms when the channel statistics are known; in particular, the standard automatic repeat request (ARQ) and hybrid ARQ (HARQ) protocols are considered. Then, for an unknown environment, an average-cost reinforcement learning (RL) algorithm is proposed that learns the system parameters and the transmission policy in real time. The effectiveness of the proposed methods is verified through numerical simulations.

  • JOURNAL ARTICLE
    Chamberlain B, Levy-Kramer J, Humby C, Deisenroth MP et al., 2018, Real-time community detection in full social networks on a laptop, PLoS ONE, Vol: 13, ISSN: 1932-6203

    For a broad range of research and practical applications it is important to understand the allegiances, communities and structure of key players in society. One promising direction towards extracting this information is to exploit the rich relational data in digital social networks (the social graph). As global social networks (e.g., Facebook and Twitter) are very large, most approaches make use of distributed computing systems for this purpose. Distributing graph processing requires solving many difficult engineering problems, which has led some researchers to look at single-machine solutions that are faster and easier to maintain. In this article, we present an approach for analyzing full social networks on a standard laptop, allowing for interactive exploration of the communities in the locality of a set of user-specified query vertices. The key idea is that the aggregate actions of large numbers of users can be compressed into a data structure that encapsulates the edge weights between vertices in a derived graph. Local communities can be constructed by selecting vertices that are connected to the query vertices with high edge weights in the derived graph. This compression is robust to noise and allows for interactive queries of local communities in real time, which we define to be less than the average human reaction time of 0.25s. We achieve single-machine real-time performance by compressing the neighborhood of each vertex using minhash signatures and facilitate rapid queries through Locality Sensitive Hashing. These techniques reduce query times from hours using industrial desktop machines operating on the full graph to milliseconds on standard laptops. Our method allows exploration of strongly associated regions (i.e., communities) of large graphs in real time on a laptop. It has been deployed in software that is actively used by social network analysts and offers another channel for media owners to monetize their data, helping them to continue to provide
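As a rough illustration of the minhash compression mentioned in the abstract above, the following sketch builds fixed-length signatures for vertex neighborhoods and estimates their Jaccard similarity from the fraction of matching signature entries. It is not the authors' implementation: the function names, the linear hash family, the modulus `PRIME`, and the assumption that vertices are non-negative integer ids below `PRIME` are all hypothetical choices made for this example, and the Locality Sensitive Hashing lookup step is not shown.

```python
import random

# Mersenne prime used as the modulus (an assumption of this sketch);
# vertex ids are assumed to be non-negative integers below this value.
PRIME = 2_147_483_647


def make_hash_params(num_hashes, seed=0):
    """Draw (a, b) pairs for hash functions h(v) = (a * v + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
            for _ in range(num_hashes)]


def minhash_signature(neighbors, params):
    """Compress a non-empty neighbor set into one minimum per hash function."""
    return [min((a * v + b) % PRIME for v in neighbors) for a, b in params]


def estimate_jaccard(sig_a, sig_b):
    """Estimate Jaccard similarity as the fraction of matching entries."""
    matches = sum(x == y for x, y in zip(sig_a, sig_b))
    return matches / len(sig_a)


if __name__ == "__main__":
    params = make_hash_params(num_hashes=128, seed=42)
    sig_u = minhash_signature({1, 2, 3, 4, 5, 6}, params)
    sig_v = minhash_signature({4, 5, 6, 7, 8, 9}, params)
    # The true Jaccard similarity of these two sets is 3/9; the estimate
    # varies with the seed and the number of hash functions.
    print(estimate_jaccard(sig_u, sig_v))
```

The signature length trades memory for accuracy: more hash functions give a tighter similarity estimate at the cost of a larger per-vertex footprint.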

  • JOURNAL ARTICLE
    Creswell A, Bharath AA, 2018, Denoising Adversarial Autoencoders, IEEE Trans Neural Netw Learn Syst

    Unsupervised learning is of growing interest because it unlocks the potential held in vast amounts of unlabeled data to learn useful representations for inference. Autoencoders, a form of generative model, may be trained by learning to reconstruct unlabeled input data from a latent representation space. More robust representations may be produced by an autoencoder if it learns to recover clean input samples from corrupted ones. Representations may be further improved by introducing regularization during training to shape the distribution of the encoded data in the latent space. We suggest denoising adversarial autoencoders (AAEs), which combine denoising and regularization, shaping the distribution of latent space using adversarial training. We introduce a novel analysis that shows how denoising may be incorporated into the training and sampling of AAEs. Experiments are performed to assess the contributions that denoising makes to the learning of representations for classification and sample synthesis. Our results suggest that autoencoders trained using a denoising criterion achieve higher classification performance and can synthesize samples that are more consistent with the input data than those trained without a corruption process.

  • CONFERENCE PAPER
    Dutordoir V, Salimbeni H, Deisenroth M, Hensman J et al., 2018, Gaussian Process Conditional Density Estimation

    Conditional Density Estimation (CDE) models deal with estimating conditional distributions. The conditions imposed on the distribution are the inputs of the model. CDE is a challenging task as there is a fundamental trade-off between model complexity, representational capacity and overfitting. In this work, we propose to extend the model's input with latent variables and use Gaussian processes (GP) to map this augmented input onto samples from the conditional distribution. Our Bayesian approach allows for the modeling of small datasets, but we also provide the machinery for it to be applied to big data using stochastic variational inference. Our approach can be used to model densities even in sparse data regions, and allows for sharing learned structure between conditions. We illustrate the effectiveness and wide-reaching applicability of our model on a variety of real-world problems, such as spatio-temporal density estimation of taxi drop-offs, non-Gaussian noise modeling, and few-shot learning on omniglot images.

  • CONFERENCE PAPER
    Kamthe S, Deisenroth MP, 2018, Data-Efficient Reinforcement Learning with Probabilistic Model Predictive Control, Artificial Intelligence and Statistics, Publisher: PMLR, Pages: 1701-1710
  • JOURNAL ARTICLE
    Kormushev P, Ugurlu B, Caldwell DG, Tsagarakis NG et al., 2018, Learning to exploit passive compliance for energy-efficient gait generation on a compliant humanoid, Autonomous Robots, Pages: 1-17, ISSN: 0929-5593

    © 2018 Springer Science+Business Media, LLC, part of Springer Nature. Modern humanoid robots include not only active compliance but also passive compliance. Apart from improved safety and dependability, the availability of passive elements, such as springs, opens up new possibilities for improving energy efficiency. With this in mind, this paper addresses the challenging open problem of exploiting passive compliance for the purpose of energy-efficient humanoid walking. To this end, we develop a method comprising two parts: an optimization part that finds an optimal vertical center-of-mass trajectory, and a walking pattern generator part that uses this trajectory to produce a dynamically balanced gait. For the optimization part, we propose a reinforcement learning approach that dynamically evolves the policy parametrization during the learning process. By gradually increasing the representational power of the policy parametrization, it manages to find better policies in a faster and computationally efficient way. For the walking generator part, we develop a variable-center-of-mass-height ZMP-based bipedal walking pattern generator. The method is tested in real-world experiments with the bipedal robot COMAN and achieves a significant 18% reduction in the electric energy consumption by learning to efficiently use the passive compliance of the robot.

  • CONFERENCE PAPER
    Olofsson S, Deisenroth MP, Misener R, 2018, Design of Experiments for Model Discrimination Hybridising Analytical and Data-Driven Approaches, Publisher: JMLR.org, Pages: 3905-3914
  • JOURNAL ARTICLE
    Olofsson S, Deisenroth MP, Misener R, 2018, Design of Experiments for Model Discrimination using Gaussian Process Surrogate Models, Vol: 44, Pages: 847-852, ISSN: 1570-7946

    © 2018 Elsevier B.V. Given rival mathematical models and an initial experimental data set, optimal design of experiments for model discrimination discards inaccurate models. Model discrimination is fundamentally about finding out how systems work. Not knowing how a particular system works, or having several rivalling models to predict the behaviour of the system, makes controlling and optimising the system more difficult. The most common way to perform model discrimination is by maximising the pairwise squared difference between model predictions, weighted by measurement noise and model uncertainty resulting from uncertainty in the fitted model parameters. The model uncertainty for analytical model functions is computed using gradient information. We develop a novel method where we replace the black-box models with Gaussian process surrogate models. Using the surrogate models, we are able to approximately marginalise out the model parameters, yielding the model uncertainty. Results show the surrogate model method working for model discrimination for classical test instances.
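The design criterion described in the abstract above (pairwise squared differences between model predictions, weighted by measurement noise and model uncertainty) can be illustrated with a small sketch. The exact weighting used in the paper is not given here, so the form below, which divides each squared difference by the sum of the noise variance and the two models' predictive variances, is an assumption. The names `discrimination_score` and `best_design`, and the convention that each model is a callable returning a `(mean, variance)` pair, are likewise hypothetical.

```python
from itertools import combinations


def discrimination_score(predictions, noise_var):
    """Sum of pairwise squared mean differences, each down-weighted by
    measurement noise plus both models' predictive variances.

    predictions: list of (mean, variance) pairs, one per rival model.
    """
    score = 0.0
    for (mean_i, var_i), (mean_j, var_j) in combinations(predictions, 2):
        score += (mean_i - mean_j) ** 2 / (noise_var + var_i + var_j)
    return score


def best_design(candidates, models, noise_var):
    """Pick the candidate input where the rival models disagree most."""
    return max(
        candidates,
        key=lambda x: discrimination_score([m(x) for m in models], noise_var),
    )


if __name__ == "__main__":
    # Two toy rival models with zero predictive variance (illustrative only).
    def linear(x):
        return (x, 0.0)

    def quadratic(x):
        return (x * x, 0.0)

    print(best_design([0.0, 1.0, 2.0, 3.0], [linear, quadratic], noise_var=1.0))
```

In a surrogate-based setting, the variance returned by each model would come from the fitted surrogate rather than being fixed at zero as in this toy usage.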

  • CONFERENCE PAPER
    Pardo F, Tavakoli A, Levdik V, Kormushev P et al., 2018, Time limits in reinforcement learning, International Conference on Machine Learning, Pages: 4042-4051

    In reinforcement learning, it is common to let an agent interact for a fixed amount of time with its environment before resetting it and repeating the process in a series of episodes. The task that the agent has to learn can either be to maximize its performance over (i) that fixed period, or (ii) an indefinite period where time limits are only used during training to diversify experience. In this paper, we provide a formal account for how time limits could effectively be handled in each of the two cases and explain why not doing so can cause state-aliasing and invalidation of experience replay, leading to suboptimal policies and training instability. In case (i), we argue that the terminations due to time limits are in fact part of the environment, and thus a notion of the remaining time should be included as part of the agent’s input to avoid violation of the Markov property. In case (ii), the time limits are not part of the environment and are only used to facilitate learning. We argue that this insight should be incorporated by bootstrapping from the value of the state at the end of each partial episode. For both cases, we illustrate empirically the significance of our considerations in improving the performance and stability of existing reinforcement learning algorithms, showing state-of-the-art results on several control tasks.
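The two cases in the abstract above can be sketched in a few lines. This is a minimal illustration and not the authors' code: the helper names, the normalised remaining-time feature, and the one-step temporal-difference target are assumptions chosen for the example. For case (i), the remaining time is appended to the observation; for case (ii), a timeout is not treated as a true terminal state, so the target still bootstraps from the value of the next state.

```python
def time_aware_observation(observation, step, time_limit):
    """Case (i): append the fraction of time remaining to the agent's input."""
    remaining = (time_limit - step) / time_limit
    return list(observation) + [remaining]


def td_target(reward, next_value, env_terminated, gamma=0.99):
    """Case (ii): one-step target that bootstraps unless the environment
    itself terminated. A cut-off caused only by the time limit should be
    passed in with env_terminated=False, so the next state's value is kept.
    """
    if env_terminated:
        return reward
    return reward + gamma * next_value


if __name__ == "__main__":
    print(time_aware_observation([0.5, -1.0], step=25, time_limit=100))
    # Episode cut by the time limit: still bootstrap from the next state.
    print(td_target(reward=1.0, next_value=10.0, env_terminated=False, gamma=0.9))
```

Keeping the environment's own termination signal separate from the time-limit signal is the design choice that makes the second function possible; merging the two into a single "done" flag would discard the information needed to bootstrap.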

  • CONFERENCE PAPER
    Sæmundsson S, Hofmann K, Deisenroth MP, 2018, Meta reinforcement learning with latent variable Gaussian processes, Uncertainty in Artificial Intelligence (UAI) 2018, Publisher: Association for Uncertainty in Artificial Intelligence (AUAI)

    Learning from small data sets is critical in many practical applications where data collection is time consuming or expensive, e.g., robotics, animal experiments or drug design. Meta learning is one way to increase the data efficiency of learning algorithms by generalizing learned concepts from a set of training tasks to unseen, but related, tasks. Often, this relationship between tasks is hard coded or relies in some other way on human expertise. In this paper, we frame meta learning as a hierarchical latent variable model and infer the relationship between tasks automatically from data. We apply our framework in a model-based reinforcement learning setting and show that our meta-learning model effectively generalizes to novel tasks by identifying how new tasks relate to prior ones from minimal data. This results in up to a 60% reduction in the average interaction time needed to solve tasks compared to strong baselines.

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
