arXiv Paper Daily: Mon, 19 Jun 2017

栏目: 数据库 · 发布时间: 6年前

内容简介:arXiv Paper Daily: Mon, 19 Jun 2017

Neural and Evolutionary Computing

The Evolution of Neural Network-Based Chart Patterns: A Preliminary Study

Myoung Hoon Ha , Byung-Ro Moon

Comments: 8 pages, In proceedings of Genetic and Evolutionary Computation Conference (GECCO 2017), Berlin, Germany

Subjects

:

Neural and Evolutionary Computing (cs.NE)

A neural network-based chart pattern represents adaptive parametric features,

including non-linear transformations, and a template that can be applied in the

feature space. The search of neural network-based chart patterns has been

unexplored despite its potential expressiveness. In this paper, we formulate a

general chart pattern search problem to enable cross-representational

quantitative comparison of various search schemes. We suggest a HyperNEAT

framework applying state-of-the-art deep neural network techniques to find

attractive neural network-based chart patterns; These techniques enable a fast

evaluation and search of robust patterns, as well as bringing a performance

gain. The proposed framework successfully found attractive patterns on the

Korean stock market. We compared newly found patterns with those found by

different search schemes, showing the proposed approach has potential.

Evaluating Noisy Optimisation Algorithms: First Hitting Time is Problematic

Simon M. Lucas , JIalin Liu , Diego Pérez-Liébana

Comments: 4 pages, 4 figurs, 1 table

Subjects

:

Neural and Evolutionary Computing (cs.NE)

; Artificial Intelligence (cs.AI)

A key part of any evolutionary algorithm is fitness evaluation. When fitness

evaluations are corrupted by noise, as happens in many real-world problems as a

consequence of various types of uncertainty, a strategy is needed in order to

cope with this. Resampling is one of the most common strategies, whereby each

solution is evaluated many times in order to reduce the variance of the fitness

estimates. When evaluating the performance of a noisy optimisation algorithm, a

key consideration is the stopping condition for the algorithm. A frequently

used stopping condition in runtime analysis, known as “First Hitting Time”, is

to stop the algorithm as soon as it encounters the optimal solution. However,

this is unrealistic for real-world problems, as if the optimal solution were

already known, there would be no need to search for it. This paper argues that

the use of First Hitting Time, despite being a commonly used approach, is

significantly flawed and overestimates the quality of many algorithms in

real-world cases, where the optimum is not known in advance and has to be

genuinely searched for. A better alternative is to measure the quality of the

solution an algorithm returns after a fixed evaluation budget, i.e., to focus

on final solution quality. This paper argues that focussing on final solution

quality is more realistic and demonstrates cases where the results produced by

each algorithm evaluation method lead to very different conclusions regarding

the quality of each noisy optimisation algorithm.

Plan, Attend, Generate: Character-level Neural Machine Translation with Planning in the Decoder

Caglar Gulcehre , Francis Dutil , Adam Trischler , Yoshua Bengio

Comments: Accepted to Rep4NLP 2017 Workshop at ACL 2017 Conference

Subjects

:

Computation and Language (cs.CL)

; Neural and Evolutionary Computing (cs.NE)

We investigate the integration of a planning mechanism into an

encoder-decoder architecture with attention for character-level machine

translation. We develop a model that plans ahead when it computes alignments

between the source and target sequences, constructing a matrix of proposed

future alignments and a commitment vector that governs whether to follow or

recompute the plan. This mechanism is inspired by the strategic attentive

reader and writer (STRAW) model. Our proposed model is end-to-end trainable

with fully differentiable operations. We show that it outperforms a strong

baseline on three character-level decoder neural machine translation on WMT’15

corpus. Our analysis demonstrates that our model can compute qualitatively

intuitive alignments and achieves superior performance with fewer parameters.

Gradient Descent for Spiking Neural Networks

Dongsung Huh , Terrence J. Sejnowski Subjects : Neurons and Cognition (q-bio.NC) ; Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

Much of studies on neural computation are based on network models of static

neurons that produce analog output, despite the fact that information

processing in the brain is predominantly carried out by dynamic neurons that

produce discrete pulses called spikes. Research in spike-based computation has

been impeded by the lack of efficient supervised learning algorithm for spiking

networks. Here, we present a gradient descent method for optimizing spiking

network models by introducing a differentiable formulation of spiking networks

and deriving the exact gradient calculation. For demonstration, we trained

recurrent spiking networks on two dynamic tasks: one that requires optimizing

fast (~millisecond) spike-based interactions for efficient encoding of

information, and a delayed memory XOR task over extended duration (~second).

The results show that our method indeed optimizes the spiking network dynamics

on the time scale of individual spikes as well as behavioral time scales. In

conclusion, our result offers a general purpose supervised learning algorithm

for spiking neural networks, thus advancing further investigations on

spike-based computation.

Computer Vision and Pattern Recognition

Perceptual Generative Adversarial Networks for Small Object Detection

Jianan Li , Xiaodan Liang , Yunchao Wei , Tingfa Xu , Jiashi Feng , Shuicheng Yan Subjects : Computer Vision and Pattern Recognition (cs.CV)

Detecting small objects is notoriously challenging due to their low

resolution and noisy representation. Existing object detection pipelines

usually detect small objects through learning representations of all the

objects at multiple scales. However, the performance gain of such ad hoc

architectures is usually limited to pay off the computational cost. In this

work, we address the small object detection problem by developing a single

architecture that internally lifts representations of small objects to

“super-resolved” ones, achieving similar characteristics as large objects and

thus more discriminative for detection. For this purpose, we propose a new

Perceptual Generative Adversarial Network (Perceptual GAN) model that improves

small object detection through narrowing representation difference of small

objects from the large ones. Specifically, its generator learns to transfer

perceived poor representations of the small objects to super-resolved ones that

are similar enough to real large objects to fool a competing discriminator.

Meanwhile its discriminator competes with the generator to identify the

generated representation and imposes an additional perceptual requirement –

generated representations of small objects must be beneficial for detection

purpose – on the generator. Extensive evaluations on the challenging

Tsinghua-Tencent 100K and the Caltech benchmark well demonstrate the

superiority of Perceptual GAN in detecting small objects, including traffic

signs and pedestrians, over well-established state-of-the-arts.

Multispectral and Hyperspectral Image Fusion Using a 3-D-Convolutional Neural Network

Frosti Palsson , Johannes R. Sveinsson , Magnus O. Ulfarsson Subjects : Computer Vision and Pattern Recognition (cs.CV) ; Machine Learning (stat.ML)

In this paper, we propose a method using a three dimensional convolutional

neural network (3-D-CNN) to fuse together multispectral (MS) and hyperspectral

(HS) images to obtain a high resolution hyperspectral image. Dimensionality

reduction of the hyperspectral image is performed prior to fusion in order to

significantly reduce the computational time and make the method more robust to

noise. Experiments are performed on a data set simulated using a real

hyperspectral image. The results obtained show that the proposed approach is

very promising when compared to conventional methods. This is especially true

when the hyperspectral image is corrupted by additive noise.

Self-ensembling for domain adaptation

Geoffrey French , Michal Mackiewicz , Mark Fisher

Comments: 15 pages, 1 figure, submitted to BMVC 2017

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

This paper explores the use of self-ensembling with random image augmentation

— a technique that has achieved impressive results in the area of

semi-supervised learning — for visual domain adaptation problems. We modify

the approach of Laine et al. to improve stability and ease of use. Our approach

demonstrates state of the art results when performing adaptation between the

following pairs of datasets: MNIST and USPS, CIFAR-10 and STL, SVHN and MNIST,

Syn-Digits to SVHN and Syn-Signs to GTSRB. We also explore the use of richer

data augmentation to solve the challenging MNIST to SVHN adaptation path.

Dynamic Filters in Graph Convolutional Networks

Nitika Verma , Edmond Boyer , Jakob Verbeek Subjects : Computer Vision and Pattern Recognition (cs.CV)

Convolutional neural networks (CNNs) have massively impacted visual

recognition in 2D images, and are now ubiquitous in state-of-the-art

approaches. While CNNs naturally extend to other domains, such as audio and

video, where data is also organized in rectangular grids, they do not easily

generalize to other types of data such as 3D shape meshes, social network

graphs or molecular graphs. To handle such data, we propose a novel

graph-convolutional network architecture that builds on a generic formulation

that relaxes the 1-to-1 correspondence between filter weights and data elements

around the center of the convolution. The main novelty of our architecture is

that the shape of the filter is a function of the features in the previous

network layer, which is learned as an integral part of the neural network.

Experimental evaluations on digit recognition, semi-supervised document

classification, and 3D shape correspondence yield state-of-the-art results,

significantly improving over previous work for shape correspondence.

Interactive 3D Modeling with a Generative Adversarial Network

Jerry Liu , Fisher Yu , Thomas Funkhouser Subjects : Computer Vision and Pattern Recognition (cs.CV) ; Graphics (cs.GR)

This paper proposes the idea of using a generative adversarial network (GAN)

to assist a novice user in designing real-world shapes with a simple interface.

The user edits a voxel grid with a painting interface (like Minecraft). Yet, at

any time, he/she can execute a SNAP command, which projects the current voxel

grid onto a latent shape manifold with a learned projection operator and then

generates a similar, but more realistic, shape using a learned generator

network. Then the user can edit the resulting shape and snap again until he/she

is satisfied with the result. The main advantage of this approach is that the

projection and generation operators assist novice users to create 3D models

characteristic of a background distribution of object shapes, but without

having to specify all the details. The core new research idea is to use a GAN

to support this application. 3D GANs have previously been used for shape

generation, interpolation, and completion, but never for interactive modeling.

The new challenge for this application is to learn a projection operator that

takes an arbitrary 3D voxel model and produces a latent vector on the shape

manifold from which a similar and realistic shape can be generated. We develop

algorithms for this and other steps of the SNAP processing pipeline and

integrate them into a simple modeling tool. Experiments with these algorithms

and tool suggest that GANs provide a promising approach to computer-assisted

interactive modeling.

A Fully Trainable Network with RNN-based Pooling

Shuai Li , Wanqing Li , Chris Cook , Ce Zhu , Yanbo Gao

Comments: 17 pages, 5 figures, 4 tables

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

Pooling is an important component in convolutional neural networks (CNNs) for

aggregating features and reducing computational burden. Compared with other

components such as convolutional layers and fully connected layers which are

completely learned from data, the pooling component is still handcrafted such

as max pooling and average pooling. This paper proposes a learnable pooling

function using recurrent neural networks (RNN) so that the pooling can be fully

adapted to data and other components of the network, leading to an improved

performance. Such a network with learnable pooling function is referred to as a

fully trainable network (FTN). Experimental results have demonstrated that the

proposed RNN-based pooling can well approximate the existing pooling functions

and improve the performance of the network. Especially for small networks, the

proposed FTN can improve the performance by seven percentage points in terms of

error rate on the CIFAR-10 dataset compared with the traditional CNN.

The Monkeytyping Solution to the YouTube-8M Video Understanding Challenge

He-Da Wang , Teng Zhang , Ji Wu

Comments: Submitted to the CVPR 2017 Workshop on YouTube-8M Large-Scale Video Understanding

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

This article describes the final solution of team monkeytyping, who finished

in second place in the YouTube-8M video understanding challenge. The dataset

used in this challenge is a large-scale benchmark for multi-label video

classification. We extend the work in [1] and propose several improvements for

frame sequence modeling. We propose a network structure called Chaining that

can better capture the interactions between labels. Also, we report our

approaches in dealing with multi-scale information and attention pooling. In

addition, We find that using the output of model ensemble as a side target in

training can boost single model performance. We report our experiments in

bagging, boosting, cascade, and stacking, and propose a stacking algorithm

called attention weighted stacking. Our final submission is an ensemble that

consists of 74 sub models, all of which are listed in the appendix.

Symplectomorphic registration with phase space regularization by entropy spectrum pathways

Vitaly L. Galinsky , Lawrence R. Frank

Comments: 26 pages, 7 figures

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

The ability to register image data to a common coordinate system is a

critical feature of virtually all imaging studies that require multiple subject

analysis, combining single subject data from multiple modalities, or both.

However, in spite of the abundance of literature on the subject and the

existence of several variants of registration algorithms, their practical

utility remains problematic, as commonly acknowledged even by developers of

these methods because the complexity of the problem has resisted a general,

flexible, and robust theoretical and computational framework.

To address this issue, we present a new registration method that is similar

in spirit to the current state-of-the-art technique of diffeomorphic mapping,

but is more general and flexible. The method utilizes a Hamiltonian formalism

and constructs registration as a sequence of symplectomorphic maps in

conjunction with a novel phase space regularization based on the powerful

entropy spectrum pathways (ESP) framework.

The method is demonstrated on the three different magnetic resonance imaging

(MRI) modalities routinely used for human neuroimaging applications by mapping

between high resolution anatomical (HRA) volumes, medium resolution diffusion

weighted MRI (DW-MRI) and HRA volumes, and low resolution functional MRI (fMRI)

and HRA volumes. The typical processing time for high quality mapping ranges

from less than a minute to several minutes on a modern multi core CPU for

typical high resolution anatomical (~256x256x256 voxels) MRI volumes.

Face Clustering: Representation and Pairwise Constraints

Yichun Shi , Charles Otto , Anil K. Jain

Comments: 13 pages, journal paper

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

Clustering face images according to their identity has two important

applications: (i) grouping a collection of face images when no external labels

are associated with images, and (ii) indexing for efficient large scale face

retrieval. The clustering problem is composed of two key parts: face

representation and choice of similarity for grouping faces. We first propose a

representation based on ResNet, which has been shown to perform very well in

image classification problems. Given this representation, we design a

clustering algorithm, Conditional Pairwise Clustering (ConPaC), which directly

estimates the adjacency matrix only based on the similarity between face

images. This allows a dynamic selection of number of clusters and retains

pairwise similarity between faces. ConPaC formulates the clustering problem as

a Conditional Random Field (CRF) model and uses Loopy Belief Propagation to

find an approximate solution for maximizing the posterior probability of the

adjacency matrix. Experimental results on two benchmark face datasets (LFW and

IJB-B) show that ConPaC outperforms well known clustering algorithms such as

k-means, spectral clustering and approximate rank-order. Additionally, our

algorithm can naturally incorporate pairwise constraints to obtain a

semi-supervised version that leads to improved clustering performance. We also

propose an k-NN variant of ConPaC, which has a linear time complexity given a

k-NN graph, suitable for large datasets.

Hierarchical Label Inference for Video Classification

Nelson Nauata , Jonathan Smith , Greg Mori Subjects : Computer Vision and Pattern Recognition (cs.CV)

Videos are a rich source of high-dimensional structured data, with a wide

range of interacting components at varying levels of granularity. In order to

improve understanding of unconstrained internet videos, it is important to

consider the role of labels at separate levels of abstraction. In this paper,

we consider the use of the Bidirectional Inference Neural Network (BINN) for

performing graph-based inference in label space for the task of video

classification. We take advantage of the inherent hierarchy between labels at

increasing granularity. The BINN is evaluated on the first and second release

of the YouTube-8M large scale multilabel video dataset. Our results demonstrate

the effectiveness of BINN, achieving significant improvements against baseline

models.

Robotic Ironing with 3D Perception and Force/Torque Feedback in Household Environments

David Estevez , Juan G. Victores , Raul Fernandez-Fernandez , Carlos Balaguer

Comments: Accepted and to be published on the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2017) that will be held in Vancouver, Canada, September 24-28, 2017

Subjects

:

Robotics (cs.RO)

; Computer Vision and Pattern Recognition (cs.CV)

As robotic systems become more popular in household environments, the

complexity of required tasks also increases. In this work we focus on a

domestic chore deemed dull by a majority of the population, the task of

ironing. The presented algorithm improves on the limited number of previous

works by joining 3D perception with force/torque sensing, with emphasis on

finding a practical solution with a feasible implementation in a domestic

setting. Our algorithm obtains a point cloud representation of the working

environment. From this point cloud, the garment is segmented and a custom

Wrinkleness Local Descriptor (WiLD) is computed to determine the location of

the present wrinkles. Using this descriptor, the most suitable ironing path is

computed and, based on it, the manipulation algorithm performs the

force-controlled ironing operation. Experiments have been performed with a

humanoid robot platform, proving that our algorithm is able to detect

successfully wrinkles present in garments and iteratively reduce the

wrinkleness using an unmodified iron.

A new look at clustering through the lens of deep convolutional neural networks

Ali Borji , Aysegul Dundar Subjects : Learning (cs.LG) ; Computer Vision and Pattern Recognition (cs.CV)

Classification and clustering have been studied separately in machine

learning and computer vision. Inspired by the recent success of deep learning

models in solving various vision problems (e.g., object recognition, semantic

segmentation) and the fact that humans serve as the gold standard in assessing

clustering algorithms, here, we advocate for a unified treatment of the two

problems and suggest that hierarchical frameworks that progressively build

complex patterns on top of the simpler ones (e.g., convolutional neural

networks) offer a promising solution. We do not dwell much on the learning

mechanisms in these frameworks as they are still a matter of debate, with

respect to biological constraints. Instead, we emphasize on the

compositionality of the real world structures and objects. In particular, we

show that CNNs, trained end to end using back propagation with noisy labels,

are able to cluster data points belonging to several overlapping shapes, and do

so much better than the state of the art algorithms. The main takeaway lesson

from our study is that mechanisms of human vision, particularly the hierarchal

organization of the visual ventral stream should be taken into account in

clustering algorithms (e.g., for learning representations in an unsupervised

manner or with minimum supervision) to reach human level clustering

performance. This, by no means, suggests that other methods do not hold merits.

For example, methods relying on pairwise affinities (e.g., spectral clustering)

have been very successful in many cases but still fail in some cases (e.g.,

overlapping clusters).

Distance weighted discrimination of face images for gender classification

Mónica Benito , Eduardo García-Portugués , J. S. Marron , Daniel Peña

Comments: 9 pages, 4 figures, 1 table

Subjects

:

Applications (stat.AP)

; Computer Vision and Pattern Recognition (cs.CV); Methodology (stat.ME)

We illustrate the advantages of distance weighted discrimination for

classification and feature extraction in a High Dimension Low Sample Size

(HDLSS) situation. The HDLSS context is a gender classification problem of face

images in which the dimension of the data is several orders of magnitude larger

than the sample size. We compare distance weighted discrimination with Fisher’s

linear discriminant, support vector machines, and principal component analysis

by exploring their classification interpretation through insightful

visuanimations and by examining the classifiers’ discriminant errors. This

analysis enables us to make new contributions to the understanding of the

drivers of human discrimination between males and females.

Artificial Intelligence

Value-Decomposition Networks For Cooperative Multi-Agent Learning

Peter Sunehag , Guy Lever , Audrunas Gruslys , Wojciech Marian Czarnecki , Vinicius Zambaldi , Max Jaderberg , Marc Lanctot , Nicolas Sonnerat , Joel Z. Leibo , Karl Tuyls , Thore Graepel Subjects : Artificial Intelligence (cs.AI)

We study the problem of cooperative multi-agent reinforcement learning with a

single joint reward signal. This class of learning problems is difficult

because of the often large combined action and observation spaces. In the fully

centralized and decentralized approaches, we find the problem of spurious

rewards and a phenomenon we call the “lazy agent” problem, which arises due to

partial observability. We address these problems by training individual agents

with a novel value decomposition network architecture, which learns to

decompose the team value function into agent-wise value functions. We perform

an experimental evaluation across a range of partially-observable multi-agent

domains and show that learning such value-decompositions leads to superior

results, in particular when combined with weight sharing, role information and

information channels.

From Propositional Logic to Plausible Reasoning: A Uniqueness Theorem

Kevin S. Van Horn

Comments: Submitted to Int’l Journal of Approximate Reasoning

Subjects

:

Artificial Intelligence (cs.AI)

; Logic in Computer Science (cs.LO)

We consider the question of extending propositional logic to a logic of

plausible reasoning, and posit four requirements that any such extension should

satisfy. Each is a requirement that some property of classical propositional

logic be preserved in the extended logic; as such, the requirements are simpler

and less problematic than those used in Cox’s Theorem and its variants. As with

Cox’s Theorem, our requirements imply that the extended logic must be

isomorphic to (finite-set) probability theory. We also obtain specific

numerical values for the probabilities, recovering the classical definition of

probability as a theorem, with truth assignments that satisfy the premise

playing the role of the “possible cases.”

Improving Scalability of Inductive Logic Programming via Pruning and Best-Effort Optimisation

Mishal Kazmi , Peter Schüller , Yücel Saygın

Comments: 24 pages, preprint of article accepted at Expert Systems With Applications

Subjects

:

Artificial Intelligence (cs.AI)

Inductive Logic Programming (ILP) combines rule-based and statistical

artificial intelligence methods, by learning a hypothesis comprising a set of

rules given background knowledge and constraints for the search space. We focus

on extending the XHAIL algorithm for ILP which is based on Answer Set

Programming and we evaluate our extensions using the Natural Language

Processing application of sentence chunking. With respect to processing natural

language, ILP can cater for the constant change in how we use language on a

daily basis. At the same time, ILP does not require huge amounts of training

examples such as other statistical methods and produces interpretable results,

that means a set of rules, which can be analysed and tweaked if necessary. As

contributions we extend XHAIL with (i) a pruning mechanism within the

hypothesis generalisation algorithm which enables learning from larger

datasets, (ii) a better usage of modern solver technology using recently

developed optimisation methods, and (iii) a time budget that permits the usage

of suboptimal results. We evaluate these improvements on the task of sentence

chunking using three datasets from a recent SemEval competition. Results show

that our improvements allow for learning on bigger datasets with results that

are of similar quality to state-of-the-art systems on the same task. Moreover,

we compare the hypotheses obtained on datasets to gain insights on the

structure of each dataset.

Deal or No Deal? End-to-End Learning for Negotiation Dialogues

Mike Lewis , Denis Yarats , Yann N. Dauphin , Devi Parikh , Dhruv Batra Subjects : Artificial Intelligence (cs.AI) ; Computation and Language (cs.CL)

Much of human dialogue occurs in semi-cooperative settings, where agents with

different goals attempt to agree on common decisions. Negotiations require

complex communication and reasoning skills, but success is easy to measure,

making this an interesting task for AI. We gather a large dataset of

human-human negotiations on a multi-issue bargaining task, where agents who

cannot observe each other’s reward functions must reach an agreement (or a

deal) via natural language dialogue. For the first time, we show it is possible

to train end-to-end models for negotiation, which must learn both linguistic

and reasoning skills with no annotated dialogue states. We also introduce

dialogue rollouts, in which the model plans ahead by simulating possible

complete continuations of the conversation, and find that this technique

dramatically improves performance. Our code and dataset are publicly available

( this https URL ).

Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning

Junhyuk Oh , Satinder Singh , Honglak Lee , Pushmeet Kohli

Comments: ICML 2017

Subjects

:

Artificial Intelligence (cs.AI)

; Learning (cs.LG)

As a step towards developing zero-shot task generalization capabilities in

reinforcement learning (RL), we introduce a new RL problem where the agent

should learn to execute sequences of instructions after learning useful skills

that solve subtasks. In this problem, we consider two types of generalizations:

to previously unseen instructions and to longer sequences of instructions. For

generalization over unseen instructions, we propose a new objective which

encourages learning correspondences between similar subtasks by making

analogies. For generalization over sequential instructions, we present a

hierarchical architecture where a meta controller learns to use the acquired

skills for executing the instructions. To deal with delayed reward, we propose

a new neural architecture in the meta controller that learns when to update the

subtask, which makes learning more efficient. Experimental results on a

stochastic 3D domain show that the proposed ideas are crucial for

generalization to longer instructions as well as unseen instructions.

Conjunctions of Among Constraints

Victor Dalmau

Comments: 15 pages plus appendix

Subjects

:

Artificial Intelligence (cs.AI)

; Logic in Computer Science (cs.LO)

Many existing global constraints can be encoded as a conjunction of among

constraints. An among constraint holds if the number of the variables in its

scope whose value belongs to a prespecified set, which we call its range, is

within some given bounds. It is known that domain filtering algorithms can

benefit from reasoning about the interaction of among constraints so that

values can be filtered out taking into consideration several among constraints

simultaneously. The present pa- per embarks into a systematic investigation on

the circumstances under which it is possible to obtain efficient and complete

domain filtering algorithms for conjunctions of among constraints. We start by

observing that restrictions on both the scope and the range of the among

constraints are necessary to obtain meaningful results. Then, we derive a

domain flow-based filtering algorithm and present several applications. In

particular, it is shown that the algorithm unifies and generalizes several

previous existing results.

Collaborative vehicle routing: a survey

Margaretha Gansterer , Richard F. Hartl Subjects : Multiagent Systems (cs.MA) ; Artificial Intelligence (cs.AI); Computers and Society (cs.CY); Optimization and Control (math.OC); Physics and Society (physics.soc-ph)

In horizontal collaborations, carriers form coalitions in order to perform

parts of their logistics operations jointly. By exchanging transportation

requests among each other, they can operate more efficiently and in a more

sustainable way. Collaborative vehicle routing has been extensively discussed

in the literature. We identify three major streams of research: (i) centralized

collaborative planning, (ii) decentralized planning without auctions, and (ii)

auction-based decentralized planning. For each of them we give a structured

overview on the state of knowledge and discuss future research directions.

Structured Best Arm Identification with Fixed Confidence

Ruitong Huang , Mohammad M. Ajallooeian , Csaba Szepesvári , Martin Müller Subjects : Learning (cs.LG) ; Artificial Intelligence (cs.AI)

We study the problem of identifying the best action among a set of possible

options when the value of each action is given by a mapping from a number of

noisy micro-observables in the so-called fixed confidence setting. Our main

motivation is the application to the minimax game search, which has been a

major topic of interest in artificial intelligence. In this paper we introduce

an abstract setting to clearly describe the essential properties of the

problem. While previous work only considered a two-move game tree search

problem, our abstract setting can be applied to the general minimax games where

the depth can be non-uniform and arbitrary, and transpositions are allowed. We

introduce a new algorithm (LUCB-micro) for the abstract setting, and give its

lower and upper sample complexity results. Our bounds recover some previous

results, which were only available in more limited settings, while they also

shed further light on how the structure of minimax problems influence sample

complexity.

AI-Powered Social Bots

Terrence Adams

Comments: 2 figures

Subjects

:

Social and Information Networks (cs.SI)

; Artificial Intelligence (cs.AI); Computers and Society (cs.CY); Multimedia (cs.MM)

This paper gives an overview of impersonation bots that generate output in

one, or possibly, multiple modalities. We also discuss rapidly advancing areas

of machine learning and artificial intelligence that could lead to

frighteningly powerful new multi-modal social bots. Our main conclusion is that

most commonly known bots are one dimensional (i.e., chatterbot), and far from

deceiving serious interrogators. However, using recent advances in machine

learning, it is possible to unleash incredibly powerful, human-like armies of

social bots, in potentially well coordinated campaigns of deception and

influence.

Bib2vec: An Embedding-based Search System for Bibliographic Information

Takuma Yoneda , Koki Mori , Makoto Miwa , Yutaka Sasaki

Comments: EACL2017 extended version

Journal-ref: Proceedings of the EACL 2017 Software Demonstrations, Valencia,

Spain, April 3-7 2017, pages 112-115

Subjects

:

Computation and Language (cs.CL)

; Artificial Intelligence (cs.AI); Information Retrieval (cs.IR)

We propose a novel embedding model that represents relationships among

several elements in bibliographic information with high representation ability

and flexibility. Based on this model, we present a novel search system that

shows the relationships among the elements in the ACL Anthology Reference

Corpus. The evaluation results show that our model can achieve a high

prediction ability and produce reasonable search results.

An Overview of Multi-Task Learning in Deep Neural Networks

Sebastian Ruder

Comments: 14 pages, 8 figures

Subjects

:

Learning (cs.LG)

; Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

Multi-task learning (MTL) has led to successes in many applications of

machine learning, from natural language processing and speech recognition to

computer vision and drug discovery. This article aims to give a general

overview of MTL, particularly in deep neural networks. It introduces the two

most common methods for MTL in Deep Learning, gives an overview of the

literature, and discusses recent advances. In particular, it seeks to help ML

practitioners apply MTL by shedding light on how MTL works and providing

guidelines for choosing appropriate auxiliary tasks.

Evaluating Noisy Optimisation Algorithms: First Hitting Time is Problematic

Simon M. Lucas , JIalin Liu , Diego Pérez-Liébana

Comments: 4 pages, 4 figurs, 1 table

Subjects

:

Neural and Evolutionary Computing (cs.NE)

; Artificial Intelligence (cs.AI)

A key part of any evolutionary algorithm is fitness evaluation. When fitness

evaluations are corrupted by noise, as happens in many real-world problems as a

consequence of various types of uncertainty, a strategy is needed in order to

cope with this. Resampling is one of the most common strategies, whereby each

solution is evaluated many times in order to reduce the variance of the fitness

estimates. When evaluating the performance of a noisy optimisation algorithm, a

key consideration is the stopping condition for the algorithm. A frequently

used stopping condition in runtime analysis, known as “First Hitting Time”, is

to stop the algorithm as soon as it encounters the optimal solution. However,

this is unrealistic for real-world problems, as if the optimal solution were

already known, there would be no need to search for it. This paper argues that

the use of First Hitting Time, despite being a commonly used approach, is

significantly flawed and overestimates the quality of many algorithms in

real-world cases, where the optimum is not known in advance and has to be

genuinely searched for. A better alternative is to measure the quality of the

solution an algorithm returns after a fixed evaluation budget, i.e., to focus

on final solution quality. This paper argues that focussing on final solution

quality is more realistic and demonstrates cases where the results produced by

each algorithm evaluation method lead to very different conclusions regarding

the quality of each noisy optimisation algorithm.

Joint Extraction of Entities and Relations Based on a Novel Tagging Scheme

Suncong Zheng , Feng Wang , Hongyun Bao , Yuexing Hao , Peng Zhou , Bo Xu Subjects : Computation and Language (cs.CL) ; Artificial Intelligence (cs.AI); Learning (cs.LG)

Joint extraction of entities and relations is an important task in

information extraction. To tackle this problem, we firstly propose a novel

tagging scheme that can convert the joint extraction task to a tagging problem.

Then, based on our tagging scheme, we study different end-to-end models to

extract entities and their relations directly, without identifying entities and

relations separately. We conduct experiments on a public dataset produced by

distant supervision method and the experimental results show that the tagging

based methods are better than most of the existing pipelined and joint learning

methods. What’s more, the end-to-end model proposed in this paper, achieves the

best results on the public dataset.

Information Retrieval

Twigraph: Discovering and Visualizing Influential Words between Twitter Profiles

Dhanasekar S , Sudharshan Srinivasan Subjects : Social and Information Networks (cs.SI) ; Information Retrieval (cs.IR)

The social media craze is on an ever increasing spree, and people are

connected with each other like never before, but these vast connections are

visually unexplored. We propose a methodology Twigraph to explore the

connections between persons using their Twitter profiles. First, we propose a

hybrid approach of recommending social media profiles, articles, and

advertisements to a user.The profiles are recommended based on the similarity

score between the user profile, and profile under evaluation. The similarity

between a set of profiles is investigated by finding the top influential words

thus causing a high similarity through an Influence Term Metric for each word.

Then, we group profiles of various domains such as politics, sports, and

entertainment based on the similarity score through a novel clustering

algorithm. The connectivity between profiles is envisaged using word graphs

that help in finding the words that connect a set of profiles and the profiles

that are connected to a word. Finally, we analyze the top influential words

over a set of profiles through clustering by finding the similarity of that

profiles enabling to break down a Twitter profile with a lot of followers to

fine level word connections using word graphs. The proposed method was

implemented on datasets comprising 1.1 M Tweets obtained from Twitter.

Experimental results show that the resultant influential words were highly

representative of the relationship between two profiles or a set of profiles

Bib2vec: An Embedding-based Search System for Bibliographic Information

Takuma Yoneda , Koki Mori , Makoto Miwa , Yutaka Sasaki

Comments: EACL2017 extended version

Journal-ref: Proceedings of the EACL 2017 Software Demonstrations, Valencia,

Spain, April 3-7 2017, pages 112-115

Subjects

:

Computation and Language (cs.CL)

; Artificial Intelligence (cs.AI); Information Retrieval (cs.IR)

We propose a novel embedding model that represents relationships among

several elements in bibliographic information with high representation ability

and flexibility. Based on this model, we present a novel search system that

shows the relationships among the elements in the ACL Anthology Reference

Corpus. The evaluation results show that our model can achieve a high

prediction ability and produce reasonable search results.

Topic supervised non-negative matrix factorization

Kelsey MacMillan , James D. Wilson

Comments: 20 pages, 7 figures

Subjects

:

Computation and Language (cs.CL)

; Information Retrieval (cs.IR); Learning (cs.LG); Machine Learning (stat.ML)

Topic models have been extensively used to organize and interpret the

contents of large, unstructured corpora of text documents. Although topic

models often perform well on traditional training vs. test set evaluations, it

is often the case that the results of a topic model do not align with human

interpretation. This interpretability fallacy is largely due to the

unsupervised nature of topic models, which prohibits any user guidance on the

results of a model. In this paper, we introduce a semi-supervised method called

topic supervised non-negative matrix factorization (TS-NMF) that enables the

user to provide labeled example documents to promote the discovery of more

meaningful semantic structure of a corpus. In this way, the results of TS-NMF

better match the intuition and desired labeling of the user. The core of TS-NMF

relies on solving a non-convex optimization problem for which we derive an

iterative algorithm that is shown to be monotonic and convergent to a local

optimum. We demonstrate the practical utility of TS-NMF on the Reuters and

PubMed corpora, and find that TS-NMF is especially useful for conceptual or

broad topics, where topic key terms are not well understood. Although

identifying an optimal latent structure for the data is not a primary objective

of the proposed approach, we find that TS-NMF achieves higher weighted Jaccard

similarity scores than the contemporary methods, (unsupervised) NMF and latent

Dirichlet allocation, at supervision rates as low as 10% to 20%.

Measuring Personalization of Web Search

Anikó Hannák , Piotr Sapieżyński , Arash Molavi Khaki , David Lazer , Alan Mislove , Christo Wilson Subjects : Computers and Society (cs.CY) ; Information Retrieval (cs.IR)

Web search is an integral part of our daily lives. Recently, there has been a

trend of personalization in Web search, where different users receive different

results for the same search query. The increasing level of personalization is

leading to concerns about Filter Bubble effects, where certain users are simply

unable to access information that the search engines’ algorithm decides is

irrelevant. Despite these concerns, there has been little quantification of the

extent of personalization in Web search today, or the user attributes that

cause it.

In light of this situation, we make three contributions. First, we develop a

methodology for measuring personalization in Web search results. While

conceptually simple, there are numerous details that our methodology must

handle in order to accurately attribute differences in search results to

personalization. Second, we apply our methodology to 200 users on Google Web

Search and 100 users on Bing. We find that, on average, 11.7% of results show

differences due to personalization on Google, while 15.8% of results are

personalized on Bing, but that this varies widely by search query and by result

ranking. Third, we investigate the user features used to personalize on Google

Web Search and Bing. Surprisingly, we only find measurable personalization as a

result of searching with a logged in account and the IP address of the

searching user. Our results are a first step towards understanding the extent

and effects of personalization on Web search engines today.

Computation and Language

An Automatic Approach for Document-level Topic Model Evaluation

Shraey Bhatia , Jey Han Lau , Timothy Baldwin

Comments: 10 pages; accepted for the Twenty First Conference on Computational Natural Language Learning (CoNLL 2017)

Subjects

:

Computation and Language (cs.CL)

Topic models jointly learn topics and document-level topic distribution.

Extrinsic evaluation of topic models tends to focus exclusively on topic-level

evaluation, e.g. by assessing the coherence of topics. We demonstrate that

there can be large discrepancies between topic- and document-level model

quality, and that basing model evaluation on topic-level analysis can be highly

misleading. We propose a method for automatically predicting topic model

quality based on analysis of document-level topic allocations, and provide

empirical evidence for its robustness.

Bib2vec: An Embedding-based Search System for Bibliographic Information

Takuma Yoneda , Koki Mori , Makoto Miwa , Yutaka Sasaki

Comments: EACL2017 extended version

Journal-ref: Proceedings of the EACL 2017 Software Demonstrations, Valencia,

Spain, April 3-7 2017, pages 112-115

Subjects

:

Computation and Language (cs.CL)

; Artificial Intelligence (cs.AI); Information Retrieval (cs.IR)

We propose a novel embedding model that represents relationships among

several elements in bibliographic information with high representation ability

and flexibility. Based on this model, we present a novel search system that

shows the relationships among the elements in the ACL Anthology Reference

Corpus. The evaluation results show that our model can achieve a high

prediction ability and produce reasonable search results.

A Mixture Model for Learning Multi-Sense Word Embeddings

Dai Quoc Nguyen , Dat Quoc Nguyen , Ashutosh Modi , Stefan Thater , Manfred Pinkal

Comments: *SEM 2017

Subjects

:

Computation and Language (cs.CL)

Word embeddings are now a standard technique for inducing meaning

representations for words. For getting good representations, it is important to

take into account different senses of a word. In this paper, we propose a

mixture model for learning multi-sense word embeddings. Our model generalizes

the previous works in that it allows to induce different weights of different

senses of a word. The experimental results show that our model outperforms

previous models on standard evaluation tasks.

Number game

Go Sugimoto (ACDH-ÖAW) Subjects : Computation and Language (cs.CL) ; Digital Libraries (cs.DL)

CLARIN (Common Language Resources and Technology Infrastructure) is regarded

as one of the most important European research infrastructures, offering and

promoting a wide array of useful services for (digital) research in linguistics

and humanities. However, the assessment of the users for its core technical

development has been highly limited, therefore, it is unclear if the community

is thoroughly aware of the status-quo of the growing infrastructure. In

addition, CLARIN does not seem to be fully materialised marketing and business

plans and strategies despite its strong technical assets. This article analyses

the web traffic of the Virtual Language Observatory, one of the main web

applications of CLARIN and a symbol of pan-European re-search cooperation, to

evaluate the users and performance of the service in a transparent and

scientific way. It is envisaged that the paper can raise awareness of the

pressing issues on objective and transparent operation of the infrastructure

though Open Evaluation, and the synergy between marketing and technical

development. It also investigates the “science of web analytics” in an attempt

to document the research process for the purpose of reusability and

reproducibility, thus to find universal lessons for the use of a web analytics,

rather than to merely produce a statistical report of a particular website

which loses its value outside its context.

Plan, Attend, Generate: Character-level Neural Machine Translation with Planning in the Decoder

Caglar Gulcehre , Francis Dutil , Adam Trischler , Yoshua Bengio

Comments: Accepted to Rep4NLP 2017 Workshop at ACL 2017 Conference

Subjects

:

Computation and Language (cs.CL)

; Neural and Evolutionary Computing (cs.NE)

We investigate the integration of a planning mechanism into an

encoder-decoder architecture with attention for character-level machine

translation. We develop a model that plans ahead when it computes alignments

between the source and target sequences, constructing a matrix of proposed

future alignments and a commitment vector that governs whether to follow or

recompute the plan. This mechanism is inspired by the strategic attentive

reader and writer (STRAW) model. Our proposed model is end-to-end trainable

with fully differentiable operations. We show that it outperforms a strong

baseline on three character-level decoder neural machine translation on WMT’15

corpus. Our analysis demonstrates that our model can compute qualitatively

intuitive alignments and achieves superior performance with fewer parameters.

Topic supervised non-negative matrix factorization

Kelsey MacMillan , James D. Wilson

Comments: 20 pages, 7 figures

Subjects

:

Computation and Language (cs.CL)

; Information Retrieval (cs.IR); Learning (cs.LG); Machine Learning (stat.ML)

Topic models have been extensively used to organize and interpret the

contents of large, unstructured corpora of text documents. Although topic

models often perform well on traditional training vs. test set evaluations, it

is often the case that the results of a topic model do not align with human

interpretation. This interpretability fallacy is largely due to the

unsupervised nature of topic models, which prohibits any user guidance on the

results of a model. In this paper, we introduce a semi-supervised method called

topic supervised non-negative matrix factorization (TS-NMF) that enables the

user to provide labeled example documents to promote the discovery of more

meaningful semantic structure of a corpus. In this way, the results of TS-NMF

better match the intuition and desired labeling of the user. The core of TS-NMF

relies on solving a non-convex optimization problem for which we derive an

iterative algorithm that is shown to be monotonic and convergent to a local

optimum. We demonstrate the practical utility of TS-NMF on the Reuters and

PubMed corpora, and find that TS-NMF is especially useful for conceptual or

broad topics, where topic key terms are not well understood. Although

identifying an optimal latent structure for the data is not a primary objective

of the proposed approach, we find that TS-NMF achieves higher weighted Jaccard

similarity scores than the contemporary methods, (unsupervised) NMF and latent

Dirichlet allocation, at supervision rates as low as 10% to 20%.

Ensembling Factored Neural Machine Translation Models for Automatic Post-Editing and Quality Estimation

Chris Hokamp

Comments: APE/QE System Description Paper for WMT/CMT 2017

Subjects

:

Computation and Language (cs.CL)

This work presents a novel approach to jointly tackling Automatic

Post-Editing (APE) and Word-Level Quality Estimation (QE) using ensembles of

specialized Neural Machine Translation (NMT) systems. Word-level features which

have proven effective for QE are included as input factors, expanding the

representation of the original source and the machine translation hypothesis,

which are used to generate an automatically post-edited hypothesis. We train a

suite of NMT models which use different input representations, but share the

same output space. These models are then ensembled together, and tuned for both

the APE and the QE task. We thus attempt to connect the state-of-the-art

approaches to APE and QE within a single framework. Our models achieve

state-of-the-art results in both tasks, with the only difference in the tuning

step which learns weights for each component of the ensemble.

Joint Extraction of Entities and Relations Based on a Novel Tagging Scheme

Suncong Zheng , Feng Wang , Hongyun Bao , Yuexing Hao , Peng Zhou , Bo Xu Subjects : Computation and Language (cs.CL) ; Artificial Intelligence (cs.AI); Learning (cs.LG)

Joint extraction of entities and relations is an important task in

information extraction. To tackle this problem, we firstly propose a novel

tagging scheme that can convert the joint extraction task to a tagging problem.

Then, based on our tagging scheme, we study different end-to-end models to

extract entities and their relations directly, without identifying entities and

relations separately. We conduct experiments on a public dataset produced by

distant supervision method and the experimental results show that the tagging

based methods are better than most of the existing pipelined and joint learning

methods. What’s more, the end-to-end model proposed in this paper, achieves the

best results on the public dataset.

Extracting Formal Models from Normative Texts

John J. Camilleri , Normunds Grūzītis , Gerardo Schneider

Comments: Extended version of conference paper at the 21st International Conference on Applications of Natural Language to Information Systems (NLDB 2016). arXiv admin note: substantial text overlap with arXiv:1607.01485

Subjects

:

Computation and Language (cs.CL)

We are concerned with the analysis of normative texts – documents based on

the deontic notions of obligation, permission, and prohibition. Our goal is to

make queries about these notions and verify that a text satisfies certain

properties concerning causality of actions and timing constraints. This

requires taking the original text and building a representation (model) of it

in a formal language, in our case the C-O Diagram formalism. We present an

experimental, semi-automatic aid that helps to bridge the gap between a

normative text in natural language and its C-O Diagram representation. Our

approach consists of using dependency structures obtained from the

state-of-the-art Stanford Parser, and applying our own rules and heuristics in

order to extract the relevant components. The result is a tabular data

structure where each sentence is split into suitable fields, which can then be

converted into a C-O Diagram. The process is not fully automatic however, and

some post-editing is generally required of the user. We apply our tool and

perform experiments on documents from different domains, and report an initial

evaluation of the accuracy and feasibility of our approach.

Active learning in annotating micro-blogs dealing with e-reputation

Jean-Valèere Cossu , Alejandro Molina-Villegas , Mariana Tello-Signoret Subjects : Social and Information Networks (cs.SI) ; Computation and Language (cs.CL)

Elections unleash strong political views on Twitter, but what do people

really think about politics? Opinion and trend mining on micro blogs dealing

with politics has recently attracted researchers in several fields including

Information Retrieval and Machine Learning (ML). Since the performance of ML

and Natural Language Processing (NLP) approaches are limited by the amount and

quality of data available, one promising alternative for some tasks is the

automatic propagation of expert annotations. This paper intends to develop a

so-called active learning process for automatically annotating French language

tweets that deal with the image (i.e., representation, web reputation) of

politicians. Our main focus is on the methodology followed to build an original

annotated dataset expressing opinion from two French politicians over time. We

therefore review state of the art NLP-based ML algorithms to automatically

annotate tweets using a manual initiation step as bootstrap. This paper focuses

on key issues about active learning while building a large annotated data set

from noise. This will be introduced by human annotators, abundance of data and

the label distribution across data and entities. In turn, we show that Twitter

characteristics such as the author’s name or hashtags can be considered as the

bearing point to not only improve automatic systems for Opinion Mining (OM) and

Topic Classification but also to reduce noise in human annotations. However, a

later thorough analysis shows that reducing noise might induce the loss of

crucial information.

Deal or No Deal? End-to-End Learning for Negotiation Dialogues

Mike Lewis , Denis Yarats , Yann N. Dauphin , Devi Parikh , Dhruv Batra Subjects : Artificial Intelligence (cs.AI) ; Computation and Language (cs.CL)

Much of human dialogue occurs in semi-cooperative settings, where agents with

different goals attempt to agree on common decisions. Negotiations require

complex communication and reasoning skills, but success is easy to measure,

making this an interesting task for AI. We gather a large dataset of

human-human negotiations on a multi-issue bargaining task, where agents who

cannot observe each other’s reward functions must reach an agreement (or a

deal) via natural language dialogue. For the first time, we show it is possible

to train end-to-end models for negotiation, which must learn both linguistic

and reasoning skills with no annotated dialogue states. We also introduce

dialogue rollouts, in which the model plans ahead by simulating possible

complete continuations of the conversation, and find that this technique

dramatically improves performance. Our code and dataset are publicly available

( this https URL ).

Distributed, Parallel, and Cluster Computing

Declarative Modeling for Building a Cloud Federation and Cloud Applications

Giuseppe Attardi , Alex Barchiesi , Alberto Colla , Fulvio Galeazzi , Giovanni Marzulli , Mario Reale Subjects : Distributed, Parallel, and Cluster Computing (cs.DC)

The paper illustrates how we built a federated cloud computing platform

dedicated to the Italian research community. Building a cloud platform is a

daunting task, that requires coordinating the deployment of many services,

interrelated and dependent on each other. Provisioning, servicing and

maintaining the platform must be automated. For our deployment, we chose a

declarative modeling tool, that allows describing the parts that compose the

system and their relations of supplier/consumer of specific interfaces. The

tool arranges the steps to bring the deployment to convergence by transforming

the state of the system until it reaches a configuration that satisfies all

constraints. We chose a declarative service modeling approach for orchestrating

both the deployment of the platform by the administrators and the deployment of

applications by users. The cloud platform has been designed so that it can be

managed by this kind of automation, facilitating the deployment of federated

regions by anyone wishing to join and to contribute resources to the

federation. Federated resources are integrated into a single cloud platform

available to any user of the federation. The federation can also seamlessly

include public clouds. We describe the architectural choices, how we adapted

the OpenStack basic facilities to the needs of a federation of multiple

independent organizations, how we control resource allocation according to

committed plans and correspondingly how we handle accounting and billing of

resource usage. Besides providing traditional IaaS services, the cloud supports

self-service deployment of cloud applications. The cloud thus addresses the

long tail of science, allowing researchers of any discipline, without expertise

in system or cloud administration, to deploy applications readily available for

their perusal.

Set-Constrained Delivery Broadcast: Definition, Abstraction Power, and Computability Limits

Damien Imbs (LIF), Achour Mostefaoui (GDD), Matthieu Perrin , Michel Raynal (ASAP, IUF)

Comments: arXiv admin note: substantial text overlap with arXiv:1702.08176

Subjects

:

Distributed, Parallel, and Cluster Computing (cs.DC)

This paper introduces a new communication abstraction, called Set-Constrained

Delivery Broadcast (SCD-broadcast), whose aim is to provide its users with an

appropriate abstraction level when they have to implement objects or

distributed tasks in an asynchronous message-passing system prone to process

crash failures. This abstraction allows each process to broadcast messages and

deliver a sequence of sets of messages in such a way that, if a process

delivers a set of messages including a message m and later delivers a set of

messages including a message m ‘ , no process delivers first a set of messages

including m ‘ and later a set of message including m. After having presented an

algorithm implementing SCD-broadcast, the paper investigates its programming

power and its computability limits. On the “power” side it presents

SCD-broadcast-based algorithms, which are both simple and efficient, building

objects (such as snapshot and conflict-free replicated data), and distributed

tasks. On the “computability limits” side it shows that SCD-broadcast and

read/write registers are computationally equivalent.

Concurrent Geometric Multicasting

Jordan Adamek , Mikhail Nesterenko , James Robinson , Sébastien Tixeuil (NPA, IUF, LINCS) Subjects : Distributed, Parallel, and Cluster Computing (cs.DC) ; Computational Geometry (cs.CG); Discrete Mathematics (cs.DM); Networking and Internet Architecture (cs.NI)

We present MCFR, a multicasting concurrent face routing algorithm that uses

geometric routing to deliver a message from source to multiple targets. We

describe the algorithm’s operation, prove it correct, estimate its performance

bounds and evaluate its performance using simulation. Our estimate shows that

MCFR is the first geometric multicast routing algorithm whose message delivery

latency is independent of network size and only proportional to the distance

between the source and the targets. Our simulation indicates that MCFR has

significantly better reliability than existing algorithms.

Parameterized Verification of Algorithms for Oblivious Robots on a Ring

Arnaud Sangnier (IRIF), Nathalie Sznajder (MoVe), Maria Potop-Butucaru (NPA, LINCS), Sébastien Tixeuil (NPA, IUF, LINCS) Subjects : Distributed, Parallel, and Cluster Computing (cs.DC) ; Computational Complexity (cs.CC); Data Structures and Algorithms (cs.DS); Logic in Computer Science (cs.LO); Robotics (cs.RO)

We study verification problems for autonomous swarms of mobile robots that

self-organize and cooperate to solve global objectives. In particular, we focus

in this paper on the model proposed by Suzuki and Yamashita of anonymous robots

evolving in a discrete space with a finite number of locations (here, a ring).

A large number of algorithms have been proposed working for rings whose size is

not a priori fixed and can be hence considered as a parameter. Handmade

correctness proofs of these algorithms have been shown to be error-prone, and

recent attention had been given to the application of formal methods to

automatically prove those. Our work is the first to study the verification

problem of such algorithms in the parameter-ized case. We show that safety and

reachability problems are undecidable for robots evolving asynchronously. On

the positive side, we show that safety properties are decidable in the

synchronous case, as well as in the asynchronous case for a particular class of

algorithms. Several properties on the protocol can be decided as well. Decision

procedures rely on an encoding in Presburger arithmetics formulae that can be

verified by an SMT-solver. Feasibility of our approach is demonstrated by the

encoding of several case studies.

Distributed-Memory Parallel Algorithms for Counting and Listing Triangles in Big Graphs

Shaikh Arifuzzaman , Maleq Khan , Madhav Marathe

Comments: 30 pages

Subjects

:

Distributed, Parallel, and Cluster Computing (cs.DC)

Big graphs (networks) arising in numerous application areas pose significant

challenges for graph analysts as these graphs grow to billions of nodes and

edges and are prohibitively large to fit in the main memory. Finding the number

of triangles in a graph is an important problem in the mining and analysis of

graphs. In this paper, we present two efficient MPI-based distributed memory

parallel algorithms for counting triangles in big graphs. The first algorithm

employs overlapping partitioning and efficient load balancing schemes to

provide a very fast parallel algorithm. The algorithm scales well to networks

with billions of nodes and can compute the exact number of triangles in a

network with 10 billion edges in 16 minutes. The second algorithm divides the

network into non-overlapping partitions leading to a space-efficient algorithm.

Our results on both artificial and real-world networks demonstrate a

significant space saving with this algorithm. We also present a novel approach

that reduces communication cost drastically leading the algorithm to both a

space- and runtime-efficient algorithm. Further, we demonstrate how our

algorithms can be used to list all triangles in a graph and compute clustering

coefficients of nodes. Our algorithm can also be adapted to a parallel

approximation algorithm using an edge sparsification method.

The Price of Anarchy in Hypergraph Coloring Games

Rann Smorodinsky , Shakhar Smorodinsky Subjects : Combinatorics (math.CO) ; Distributed, Parallel, and Cluster Computing (cs.DC); Discrete Mathematics (cs.DM); Computer Science and Game Theory (cs.GT); Social and Information Networks (cs.SI)

The price of anarchy was introduced to measure the loss incurred by a society

of agents who take actions in a decentralized manner instead of through a

central authority. Hypergraph coloring has traditionally been studied in the

context of a central designer who chooses colors. In this paper we study the

price of anarchy when the choice of color is delegated to each of the vertices

which are assumed self-interested.

Distributed Transfer Linear Support Vector Machines

Rui Zhang , Quanyan Zhu Subjects : Learning (cs.LG) ; Distributed, Parallel, and Cluster Computing (cs.DC)

Transfer learning has been developed to improve the performances of different

but related tasks in machine learning. However, such processes become less

efficient with the increase of the size of training data and the number of

tasks. Moreover, privacy can be violated as some tasks may contain sensitive

and private data, which are communicated between nodes and tasks. We propose a

consensus-based distributed transfer learning framework, where several tasks

aim to find the best linear support vector machine (SVM) classifiers in a

distributed network. With alternating direction method of multipliers, tasks

can achieve better classification accuracies more efficiently and privately, as

each node and each task train with their own data, and only decision variables

are transferred between different tasks and nodes. Numerical experiments on

MNIST datasets show that the knowledge transferred from the source tasks can be

used to decrease the risks of the target tasks that lack training data or have

unbalanced training labels. We show that the risks of the target tasks in the

nodes without the data of the source tasks can also be reduced using the

information transferred from the nodes who contain the data of the source

tasks. We also show that the target tasks can enter and leave in real-time

without rerunning the whole algorithm.

Learning

Local Feature Descriptor Learning with Adaptive Siamese Network

Chong Huang , Qiong Liu , Yan-Ying Chen , Kwang-Ting (Tim)

Cheng

Comments: 4 pages, 2 tables, 1 figure

Subjects

:

Learning (cs.LG)

; Machine Learning (stat.ML)

Although the recent progress in the deep neural network has led to the

development of learnable local feature descriptors, there is no explicit answer

for estimation of the necessary size of a neural network. Specifically, the

local feature is represented in a low dimensional space, so the neural network

should have more compact structure. The small networks required for local

feature descriptor learning may be sensitive to initial conditions and learning

parameters and more likely to become trapped in local minima. In order to

address the above problem, we introduce an adaptive pruning Siamese

Architecture based on neuron activation to learn local feature descriptors,

making the network more computationally efficient with an improved recognition

rate over more complex networks. Our experiments demonstrate that our learned

local feature descriptors outperform the state-of-art methods in patch

matching.

L2 Regularization versus Batch and Weight Normalization

Twan van Laarhoven Subjects : Learning (cs.LG) ; Machine Learning (stat.ML)

Batch Normalization is a commonly used trick to improve the training of deep

neural networks. These neural networks use L2 regularization, also called

weight decay, ostensibly to prevent overfitting. However, we show that L2

regularization has no regularizing effect when combined with normalization.

Instead, regularization has an influence on the scale of weights, and thereby

on the effective learning rate. We investigate this dependence, both in theory,

and experimentally. We show that popular optimization methods such as ADAM only

partially eliminate the influence of normalization on the learning rate. This

leads to a discussion on other ways to mitigate this issue.

Learning with Feature Evolvable Streams

Bo-Jian Hou , Lijun Zhang , Zhi-Hua Zhou Subjects : Learning (cs.LG) ; Machine Learning (stat.ML)

Learning with streaming data has attracted much attention during the past few

years. Though most studies consider data stream with fixed features, in real

practice the features may be evolvable. For example, features of data gathered

by limited-lifespan sensors will change when these sensors are substituted by

new ones. In this paper, we propose a novel learning paradigm: Feature

Evolvable Streaming Learning where old features would vanish and new features

will occur. Rather than relying on only the current features, we attempt to

recover the vanished features and exploit it to improve performance.

Specifically, we learn two models from the recovered features and the current

features, respectively. To benefit from the recovered features, we develop two

ensemble methods. In the first method, we combine the predictions from two

models and theoretically show that with assistance of old features, the

performance on new features can be improved. In the second approach, we

dynamically select the best single prediction and establish a better

performance guarantee when the best model switches. Experiments on both

synthetic and real data validate the effectiveness of our proposal.

Structured Best Arm Identification with Fixed Confidence

Ruitong Huang , Mohammad M. Ajallooeian , Csaba Szepesvári , Martin Müller Subjects : Learning (cs.LG) ; Artificial Intelligence (cs.AI)

We study the problem of identifying the best action among a set of possible

options when the value of each action is given by a mapping from a number of

noisy micro-observables in the so-called fixed confidence setting. Our main

motivation is the application to the minimax game search, which has been a

major topic of interest in artificial intelligence. In this paper we introduce

an abstract setting to clearly describe the essential properties of the

problem. While previous work only considered a two-move game tree search

problem, our abstract setting can be applied to the general minimax games where

the depth can be non-uniform and arbitrary, and transpositions are allowed. We

introduce a new algorithm (LUCB-micro) for the abstract setting, and give its

lower and upper sample complexity results. Our bounds recover some previous

results, which were only available in more limited settings, while they also

shed further light on how the structure of minimax problems influence sample

complexity.

Veiled Attributes of the Variational Autoencoder

Bin Dai , Yu Wang , John Aston , Gang Hua , David Wipf Subjects : Learning (cs.LG)

Variational autoencoders (VAE) represent a popular, flexible form of deep

generative model that can be stochastically fit to samples from a given random

process using an information-theoretic variational bound on the true underlying

distribution. Once so-obtained, the model can be putatively used to generate

new samples from this distribution, or to provide a low-dimensional latent

representation of existing samples. While quite effective in numerous

application domains, certain important mechanisms which govern the behavior of

the VAE are obfuscated by the intractable integrals and resulting stochastic

approximations involved. Moreover, as a highly non-convex model, it remains

unclear exactly how minima of the underlying energy relate to original design

purposes. We attempt to better quantify these issues by analyzing a series of

tractable special cases of increasing complexity. In doing so, we unveil

interesting connections with more traditional dimensionality reduction models,

as well as an intrinsic yet underappreciated propensity for robustly dismissing

outliers when estimating latent manifolds. With respect to the latter, we

demonstrate that the VAE can be viewed as the natural evolution of recent

robust PCA models, capable of learning nonlinear manifolds obscured by gross

corruptions. However, this previously unexplored feature comes with the cost of

potential model collapse to a degenerate distribution that may be less suitable

as the basis for generating new samples.

One Model To Learn Them All

Lukasz Kaiser , Aidan N. Gomez , Noam Shazeer , Ashish Vaswani , Niki Parmar , Llion Jones , Jakob Uszkoreit Subjects : Learning (cs.LG) ; Machine Learning (stat.ML)

Deep learning yields great results across many fields, from speech

recognition, image classification, to translation. But for each problem,

getting a deep model to work well involves research into the architecture and a

long period of tuning. We present a single model that yields good results on a

number of problems spanning multiple domains. In particular, this single model

is trained concurrently on ImageNet, multiple translation tasks, image

captioning (COCO dataset), a speech recognition corpus, and an English parsing

task. Our model architecture incorporates building blocks from multiple

domains. It contains convolutional layers, an attention mechanism, and

sparsely-gated layers. Each of these computational blocks is crucial for a

subset of the tasks we train on. Interestingly, even if a block is not crucial

for a task, we observe that adding it never hurts performance and in most cases

improves it on all tasks. We also show that tasks with less data benefit

largely from joint training with other tasks, while performance on large tasks

degrades only slightly if at all.

Deriving Compact Laws Based on Algebraic Formulation of a Data Set

Wenqing Xu , Mark Stalzer

Comments: 16 pages, 2 tables

Subjects

:

Learning (cs.LG)

In various subjects, there exist compact and consistent relationships between

input and output parameters. Discovering the relationships, or namely compact

laws, in a data set is of great interest in many fields, such as physics,

chemistry, and finance. While data discovery has made great progress in

practice thanks to the success of machine learning in recent years, the

development of analytical approaches in finding the theory behind the data is

relatively slow. In this paper, we develop an innovative approach in

discovering compact laws from a data set. By proposing a novel algebraic

equation formulation, we convert the problem of deriving meaning from data into

formulating a linear algebra model and searching for relationships that fit the

data. Rigorous proof is presented in validating the approach. The algebraic

formulation allows the search of equation candidates in an explicit

mathematical manner. Searching algorithms are also proposed for finding the

governing equations with improved efficiency. For a certain type of compact

theory, our approach assures convergence and the discovery is computationally

efficient and mathematically precise.

An Overview of Multi-Task Learning in Deep Neural Networks

Sebastian Ruder

Comments: 14 pages, 8 figures

Subjects

:

Learning (cs.LG)

; Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

Multi-task learning (MTL) has led to successes in many applications of

machine learning, from natural language processing and speech recognition to

computer vision and drug discovery. This article aims to give a general

overview of MTL, particularly in deep neural networks. It introduces the two

most common methods for MTL in Deep Learning, gives an overview of the

literature, and discusses recent advances. In particular, it seeks to help ML

practitioners apply MTL by shedding light on how MTL works and providing

guidelines for choosing appropriate auxiliary tasks.

Learning Disjunctions of Predicates

Nader H. Bshouty , Dana Drachsler-Cohen , Martin Vechev , Eran Yahav Subjects : Learning (cs.LG)

Let (F) be a set of boolean functions. We present an algorithm for learning

(F_vee := {vee_{fin S} f mid S subseteq F}) from membership queries. Our

algorithm asks at most (|F| cdot OPT(F_vee)) membership queries where

(OPT(F_vee)) is the minimum worst case number of membership queries for

learning (F_vee). When (F) is a set of halfspaces over a constant dimension

space or a set of variable inequalities, our algorithm runs in polynomial time.

The problem we address has practical importance in the field of program

synthesis, where the goal is to synthesize a program that meets some

requirements. Program synthesis has become popular especially in settings

aiming to help end users. In such settings, the requirements are not provided

upfront and the synthesizer can only learn them by posing membership queries to

the end user. Our work enables such synthesizers to learn the exact

requirements while bounding the number of membership queries.

Generalization for Adaptively-chosen Estimators via Stable Median

Vitaly Feldman , Thomas Steinke

Comments: To appear in Conference on Learning Theory (COLT) 2017

Subjects

:

Learning (cs.LG)

; Data Structures and Algorithms (cs.DS); Machine Learning (stat.ML)

Datasets are often reused to perform multiple statistical analyses in an

adaptive way, in which each analysis may depend on the outcomes of previous

analyses on the same dataset. Standard statistical guarantees do not account

for these dependencies and little is known about how to provably avoid

overfitting and false discovery in the adaptive setting. We consider a natural

formalization of this problem in which the goal is to design an algorithm that,

given a limited number of i.i.d.~samples from an unknown distribution, can

answer adaptively-chosen queries about that distribution.

We present an algorithm that estimates the expectations of (k) arbitrary

adaptively-chosen real-valued estimators using a number of samples that scales

as (sqrt{k}). The answers given by our algorithm are essentially as accurate

as if fresh samples were used to evaluate each estimator. In contrast, prior

work yields error guarantees that scale with the worst-case sensitivity of each

estimator. We also give a version of our algorithm that can be used to verify

answers to such queries where the sample complexity depends logarithmically on

the number of queries (k) (as in the reusable holdout technique).

Our algorithm is based on a simple approximate median algorithm that

satisfies the strong stability guarantees of differential privacy. Our

techniques provide a new approach for analyzing the generalization guarantees

of differentially private algorithms.

A new look at clustering through the lens of deep convolutional neural networks

Ali Borji , Aysegul Dundar Subjects : Learning (cs.LG) ; Computer Vision and Pattern Recognition (cs.CV)

Classification and clustering have been studied separately in machine

learning and computer vision. Inspired by the recent success of deep learning

models in solving various vision problems (e.g., object recognition, semantic

segmentation) and the fact that humans serve as the gold standard in assessing

clustering algorithms, here, we advocate for a unified treatment of the two

problems and suggest that hierarchical frameworks that progressively build

complex patterns on top of the simpler ones (e.g., convolutional neural

networks) offer a promising solution. We do not dwell much on the learning

mechanisms in these frameworks as they are still a matter of debate, with

respect to biological constraints. Instead, we emphasize on the

compositionality of the real world structures and objects. In particular, we

show that CNNs, trained end to end using back propagation with noisy labels,

are able to cluster data points belonging to several overlapping shapes, and do

so much better than the state of the art algorithms. The main takeaway lesson

from our study is that mechanisms of human vision, particularly the hierarchal

organization of the visual ventral stream should be taken into account in

clustering algorithms (e.g., for learning representations in an unsupervised

manner or with minimum supervision) to reach human level clustering

performance. This, by no means, suggests that other methods do not hold merits.

For example, methods relying on pairwise affinities (e.g., spectral clustering)

have been very successful in many cases but still fail in some cases (e.g.,

overlapping clusters).

Distributed Transfer Linear Support Vector Machines

Rui Zhang , Quanyan Zhu Subjects : Learning (cs.LG) ; Distributed, Parallel, and Cluster Computing (cs.DC)

Transfer learning has been developed to improve the performances of different

but related tasks in machine learning. However, such processes become less

efficient with the increase of the size of training data and the number of

tasks. Moreover, privacy can be violated as some tasks may contain sensitive

and private data, which are communicated between nodes and tasks. We propose a

consensus-based distributed transfer learning framework, where several tasks

aim to find the best linear support vector machine (SVM) classifiers in a

distributed network. With alternating direction method of multipliers, tasks

can achieve better classification accuracies more efficiently and privately, as

each node and each task train with their own data, and only decision variables

are transferred between different tasks and nodes. Numerical experiments on

MNIST datasets show that the knowledge transferred from the source tasks can be

used to decrease the risks of the target tasks that lack training data or have

unbalanced training labels. We show that the risks of the target tasks in the

nodes without the data of the source tasks can also be reduced using the

information transferred from the nodes who contain the data of the source

tasks. We also show that the target tasks can enter and leave in real-time

without rerunning the whole algorithm.

Biased Bagging for Unsupervised Domain Adaptation

Twan van Laarhoven , Elena Marchiori

Comments: Source code to reproduce our experiments is available at this http URL

Subjects

:

Machine Learning (stat.ML)

; Learning (cs.LG)

Unsupervised domain adaptation (DA) is an active research area whose

development and applications have been boosted by the explosion of data without

annotation. Manual labeling is tedious, error prone and expensive, therefore

unsupervised DA is needed more and more to automatize this task: an unlabeled

dataset (target) is annotated using a labeled dataset (source) from a related

domain. The majority of DA methods try to directly match the distributions of

the source and target data. Nevertheless, recent DA methods still suffer from

issues such as the incapability to scale to high dimensional data or the

sensitivity to the (hyper-)parameters of the adaptation procedure.

We propose to overcome these issues by using bagging. The main idea is to

directly embed a given source hypothesis inside a bagging procedure in order to

generate a sequence of good target hypotheses related to the source, which are

then used in a voting method for predicting labels of target data. Our method

is extremely easy to implement and apply. Its effectiveness is demonstrated by

a recent theoretical study on the generalization performance of voting methods

in terms of large mean and low variance of the margin distribution. A

qualitative analysis of our method shows that it tends to increase the mean and

to reduce the variance of the target margin distribution, which is beneficial

for generalization. We report state-of-the-art performance on benchmark

datasets for adaptation tasks using text and image datasets with diverse

characteristics, such as high number of features, large number of classes, and

based on deep input features.

Topic supervised non-negative matrix factorization

Kelsey MacMillan , James D. Wilson

Comments: 20 pages, 7 figures

Subjects

:

Computation and Language (cs.CL)

; Information Retrieval (cs.IR); Learning (cs.LG); Machine Learning (stat.ML)

Topic models have been extensively used to organize and interpret the

contents of large, unstructured corpora of text documents. Although topic

models often perform well on traditional training vs. test set evaluations, it

is often the case that the results of a topic model do not align with human

interpretation. This interpretability fallacy is largely due to the

unsupervised nature of topic models, which prohibits any user guidance on the

results of a model. In this paper, we introduce a semi-supervised method called

topic supervised non-negative matrix factorization (TS-NMF) that enables the

user to provide labeled example documents to promote the discovery of more

meaningful semantic structure of a corpus. In this way, the results of TS-NMF

better match the intuition and desired labeling of the user. The core of TS-NMF

relies on solving a non-convex optimization problem for which we derive an

iterative algorithm that is shown to be monotonic and convergent to a local

optimum. We demonstrate the practical utility of TS-NMF on the Reuters and

PubMed corpora, and find that TS-NMF is especially useful for conceptual or

broad topics, where topic key terms are not well understood. Although

identifying an optimal latent structure for the data is not a primary objective

of the proposed approach, we find that TS-NMF achieves higher weighted Jaccard

similarity scores than the contemporary methods, (unsupervised) NMF and latent

Dirichlet allocation, at supervision rates as low as 10% to 20%.

Joint Extraction of Entities and Relations Based on a Novel Tagging Scheme

Suncong Zheng , Feng Wang , Hongyun Bao , Yuexing Hao , Peng Zhou , Bo Xu Subjects : Computation and Language (cs.CL) ; Artificial Intelligence (cs.AI); Learning (cs.LG)

Joint extraction of entities and relations is an important task in

information extraction. To tackle this problem, we firstly propose a novel

tagging scheme that can convert the joint extraction task to a tagging problem.

Then, based on our tagging scheme, we study different end-to-end models to

extract entities and their relations directly, without identifying entities and

relations separately. We conduct experiments on a public dataset produced by

distant supervision method and the experimental results show that the tagging

based methods are better than most of the existing pipelined and joint learning

methods. What’s more, the end-to-end model proposed in this paper, achieves the

best results on the public dataset.

Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning

Junhyuk Oh , Satinder Singh , Honglak Lee , Pushmeet Kohli

Comments: ICML 2017

Subjects

:

Artificial Intelligence (cs.AI)

; Learning (cs.LG)

As a step towards developing zero-shot task generalization capabilities in

reinforcement learning (RL), we introduce a new RL problem where the agent

should learn to execute sequences of instructions after learning useful skills

that solve subtasks. In this problem, we consider two types of generalizations:

to previously unseen instructions and to longer sequences of instructions. For

generalization over unseen instructions, we propose a new objective which

encourages learning correspondences between similar subtasks by making

analogies. For generalization over sequential instructions, we present a

hierarchical architecture where a meta controller learns to use the acquired

skills for executing the instructions. To deal with delayed reward, we propose

a new neural architecture in the meta controller that learns when to update the

subtask, which makes learning more efficient. Experimental results on a

stochastic 3D domain show that the proposed ideas are crucial for

generalization to longer instructions as well as unseen instructions.

Gradient Descent for Spiking Neural Networks

Dongsung Huh , Terrence J. Sejnowski Subjects : Neurons and Cognition (q-bio.NC) ; Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

Much of studies on neural computation are based on network models of static

neurons that produce analog output, despite the fact that information

processing in the brain is predominantly carried out by dynamic neurons that

produce discrete pulses called spikes. Research in spike-based computation has

been impeded by the lack of efficient supervised learning algorithm for spiking

networks. Here, we present a gradient descent method for optimizing spiking

network models by introducing a differentiable formulation of spiking networks

and deriving the exact gradient calculation. For demonstration, we trained

recurrent spiking networks on two dynamic tasks: one that requires optimizing

fast (~millisecond) spike-based interactions for efficient encoding of

information, and a delayed memory XOR task over extended duration (~second).

The results show that our method indeed optimizes the spiking network dynamics

on the time scale of individual spikes as well as behavioral time scales. In

conclusion, our result offers a general purpose supervised learning algorithm

for spiking neural networks, thus advancing further investigations on

spike-based computation.

Information Theory

A Survey on Non-Orthogonal Multiple Access for 5G Networks: Research Challenges and Future Trends

Zhiguo Ding , Xianfu Lei , George K. Karagiannidis , Robert Schober , Jihong Yuan , Vijay Bhargava

Comments: to appear in IEEE JSAC, 2017

Subjects

:

Information Theory (cs.IT)

Non-orthogonal multiple access (NOMA) is an essential enabling technology for

the fifth generation (5G) wireless networks to meet the heterogeneous demands

on low latency, high reliability, massive connectivity, improved fairness, and

high throughput. The key idea behind NOMA is to serve multiple users in the

same resource block, such as a time slot, subcarrier, or spreading code. The

NOMA principle is a general framework, and several recently proposed 5G

multiple access schemes can be viewed as special cases. This survey provides an

overview of the latest NOMA research and innovations as well as their

applications. Thereby, the papers published in this special issue are put into

the content of the existing literature. Future research challenges regarding

NOMA in 5G and beyond are also discussed.

NOMA in Distributed Antenna System for Max-Min Fairness and Max-Sum-Rate

Dong-Jun Han , Minseok Choi , Jaekyun Moon Subjects : Information Theory (cs.IT)

Distributed antenna system (DAS) has been deployed for over a decade. DAS has

advantages in capacity especially for the cell edge users, in both single-cell

and multi-cell environments. In this paper, non-orthogonal multiple access

(NOMA) is suggested in single-cell DAS to maximize user fairness and sumrate.

Two transmission strategies are considered: NOMA with single selection

transmission and NOMA with blanket transmission. In a two-user scenario, the

center base station (BS) uses NOMA to serve both users and the remote radio

units (RRUs) serve only the weak user based on the transmission scheme. The

signals sent from the RRUs are not considered as interference at both user

sides. At one side, the signals from the RRUs are used to detect the required

data and at the other side, the signals are used to perform successive

interference cancellation (SIC). The max-min performance is studied when

instantaneous channel gain information (CGI) or channel distribution

information (CDI) is known at the transmitter. A closed-form expression of the

upper bound is derived for the data rate of each user with only CDI. In

addition, the sum-rate is studied with a minimum rate constraint when

instantaneous CGI known at the transmitter. The results show that NOMA with

blanket transmission gives the best performance compared to other schemes;

conventional-NOMA, conventional single selection scheme and joint-transmission

(JT) NOMA. It is also shown that with less transmit power, NOMA with single

selection transmission gives better performance than conventional-NOMA.

Precoded Chebyshev-NLMS based pre-distorter for nonlinear LED compensation in NOMA-VLC

Rangeet Mitra , Vimal Bhatia

Comments: R. Mitra and V. Bhatia are with Indian Institute of Technology Indore, Indore-453552, India, Email:phd1301202010@iiti.ac.in, vbhatia@iiti.ac.in. This work was submitted to IEEE Transactions on Communications on October 26, 2016, decisioned on March 3, 2017, and revised on April 25, 2017, and is currently under review in IEEE Transactions on Communications

Subjects

:

Information Theory (cs.IT)

Visible light communication (VLC) is one of the main technologies driving the

future 5G communication systems due to its ability to support high data rates

with low power consumption, thereby facilitating high speed green

communications. To further increase the capacity of VLC systems, a technique

called non-orthogonal multiple access (NOMA) has been suggested to cater to

increasing demand for bandwidth, whereby users’ signals are superimposed prior

to transmission and detected at each user equipment using successive

interference cancellation (SIC). Some recent results on NOMA exist which

greatly enhance the achievable capacity as compared to orthogonal multiple

access techniques. However, one of the performance-limiting factors affecting

VLC systems is the nonlinear characteristics of a light emitting diode (LED).

This paper considers the nonlinear LED characteristics in the design of

pre-distorter for cognitive radio inspired NOMA in VLC, and proposes singular

value decomposition based Chebyshev precoding to improve performance of

nonlinear multiple-input multiple output NOMA-VLC. A novel and generalized

power allocation strategy is also derived in this work, which is valid even in

scenarios when users experience similar channels. Additionally, in this work,

analytical upper bounds for the bit error rate of the proposed detector are

derived for square (M)-quadrature amplitude modulation.

Sparsity Order Estimation from a Single Compressed Observation Vector

Sebastian Semper , Florian Römer , Thomas Hotz , Giovanni DelGaldo Subjects : Information Theory (cs.IT)

We investigate the problem of estimating the unknown degree of sparsity from

compressive measurements without the need to carry out a sparse recovery step.

While the sparsity order can be directly inferred from the effective rank of

the observation matrix in the multiple snapshot case, this appears to be

impossible in the more challenging single snapshot case. We show that specially

designed measurement matrices allow to rearrange the measurement vector into a

matrix such that its effective rank coincides with the effective sparsity

order. In fact, we prove that matrices which are composed of a Khatri-Rao

product of smaller matrices generate measurements that allow to infer the

sparsity order. Moreover, if some samples are used more than once, one of the

matrices needs to be Vandermonde. These structural constraints reduce the

degrees of freedom in choosing the measurement matrix which may incur in a

degradation in the achievable coherence. We thus also address suitable choices

of the measurement matrices. In particular, we analyze Khatri-Rao and

Vandermonde matrices in terms of their coherence and provide a new design for

Vandermonde matrices that achieves a low coherence.

Wireless Link Capacity under Shadowing and Fading

Magnus M. Halldorsson , Tigran Tonoyan

Comments: 20 pages. In Mobihoc’17

Subjects

:

Information Theory (cs.IT)

; Networking and Internet Architecture (cs.NI)

We consider the following basic link capacity (a.k.a., one-shot scheduling)

problem in wireless networks: Given a set of communication links, find a

maximum subset of links that can successfully transmit simultaneously. Good

performance guarantees are known only for deterministic models, such as the

physical model with geometric (log-distance) pathloss. We treat this problem

under stochastic shadowing under general distributions, bound the effects of

shadowing on optimal capacity, and derive constant approximation algorithms. We

also consider temporal fading under Rayleigh distribution, and show that it

affects non-fading solutions only by a constant-factor. These can be combined

into a constant approximation link capacity algorithm under both time-invariant

shadowing and temporal fading.

Successive Cancellation Decoding of Single Parity-Check Product Codes

Mustafa Cemil Coşkun , Gianluigi Liva , Alexandre Graell i Amat , Michael Lentmaier Subjects : Information Theory (cs.IT)

We introduce successive cancellation (SC) decoding of product codes (PCs)

with single parity-check (SPC) component codes. Recursive formulas are derived,

which resemble the SC decoding algorithm of polar codes. We analyze the error

probability of SPC-PCs over the binary erasure channel under SC decoding. A

bridge with the analysis of PCs introduced by Elias in 1954 is also

established. Furthermore, bounds on the block error probability under SC

decoding are provided, and compared to the bounds under the original decoding

algorithm proposed by Elias. It is shown that SC decoding of SPC-PCs achieves a

lower block error probability than Elias’ decoding.

Conditions for Unique Reconstruction of Sparse Signals Using Compressive Sensing Methods

Ljubisa Stankovic , Milos Dakovic , Srdjan Stankovic , Irena Orovic

Comments: 32 pages, 3 figures, 4 MATLAB programs

Subjects

:

Information Theory (cs.IT)

A signal is sparse in one of its representation domain if the number of

nonzero coefficients in that domain is much smaller than the total number of

coefficients. Sparse signals can be reconstructed from a very reduced set of

measurements/observations. The topic of this paper are conditions for the

unique reconstruction of sparse signals from a reduced set of observations.

After the basic definitions are introduced, the unique reconstruction

conditions are reviewed using the spark, restricted isometry, and coherence of

the measurement matrix. Uniqueness of the reconstruction of signals sparse in

the discrete Fourier domain (DFT), as the most important signal transformation

domain, is considered as well.

Base Station Selection for Massive MIMO Networks with Two-stage Precoding

Jianpeng Ma , Shun Zhang , Hongyan Li , Nan Zhao , Victor C.M. Leung

Comments: Submitted to IEEE Wireless Communications Letters

Subjects

:

Information Theory (cs.IT)

The two-stage precoding has been proposed to reduce the overhead of both the

channel training and the channel state information (CSI) feedback for the

massive multiple-input multiple-output (MIMO) system. But the overlap of the

angle-spreading-ranges (ASR) for different user clusters may seriously degrade

the performance of the two-stage precoding. In this letter, we propose one ASR

overlap mitigating scheme through the base station (BS) selection. Firstly, the

BS selection is formulated as a sum signal-to-interference-plus-noise ratio

(SINR) maximization problem. Then, the problem is solved by a low-complex

algorithm through maximizing signal-to-leakage-plus-noise ratio (SLNR). In

addition, we propose one low-overhead algorithm with the lower bound on the

average SLNR as the objective function. Finally, we demonstrate the efficacy of

the proposed schemes through the numerical simulations.

Interference-Alignment and Soft-Space-Reuse Based Cooperative Transmission for Multi-cell Massive MIMO Networks

Jianpeng Ma , Shun Zhang , Hongyan Li , Nan Zhao , Victor C.M. Leung

Comments: Submitted to IEEE Transactions on Wireless Communications

Subjects

:

Information Theory (cs.IT)

As a revolutionary wireless transmission strategy, interference alignment

(IA) can improve the capacity of the cell-edge users. However, the acquisition

of the global channel state information (CSI) for IA leads to unacceptable

overhead in the massive MIMO systems. To tackle this problem, in this paper, we

propose an IA and soft-space-reuse (IA-SSR) based cooperative transmission

scheme under the two-stage precoding framework. Specifically, the cell-center

and the cell-edge users are separately treated to fully exploit the spatial

degrees of freedoms (DoF). Then, the optimal power allocation policy is

developed to maximize the sum-capacity of the network. Next, a low-cost channel

estimator is designed for the proposed IA-SSR framework. Some practical issues

in IA-SSR implementation are also discussed. Finally, plenty of numerical

results are presented to show the efficiency of the proposed algorithm.

Spectral Domain Sampling of Graph Signals

Yuichi Tanaka

Comments: Submitted to IEEE Transactions on Signal Processing

Subjects

:

Information Theory (cs.IT)

Sampling methods for graph signals in the graph spectral domain are proposed.

Conventional sampling of graph signals can be regarded as sampling in the graph

vertex domain, but it does not have desired characteristics in regard to the

graph spectral domain. Down- and upsampled graph signals by using our methods

inherit the frequency domain characteristics of sampled signals defined in the

time/spatial domain. Properties of the sampling effects are studied

theoretically and experimentally in comparison with the conventional sampling

method in the vertex domain. Fractional sampling and Laplacian pyramid of graph

signals are also shown as possible applications.

On M-ary Distributed Detection for Power Constraint Wireless Sensor Networks

Zahra Hajibabaei , Jalil Modares , Azadeh Vosoughi

Comments: arXiv admin note: text overlap with arXiv:cs/0703046 by other authors

Subjects

:

Information Theory (cs.IT)

We consider a wireless sensor network (WSN), consisting of several sensors

and a fusion center (FC), which is tasked with solving an M-ary hypothesis

testing problem. Sensors make M-ary decisions and transmit their digitally

modulated decisions over orthogonal channels, which are subject to Rayleigh

fading and noise, to the FC. Adopting Bayesian optimality criterion, we

consider training and non-training based distributed detection systems and

investigate the effect of imperfect channel state information (CSI) on the

optimal maximum a posteriori probability (MAP) fusion rules and optimal power

allocation between sensors, when the sum of training and data symbol transmit

powers is fixed. We consider J-divergence criteria to do power allocation

between sensors. The theoretical results show that J-divergence for coherent

reception will be maximized if total training power be half of total power,

however for non coherent reception, optimal training power which maximize

J-divergence is zero. The simulated results also show that probability of error

will be minimized if training power be half of total power for coherent

reception and zero for non coherent reception.

Towards a Realistic Assessment of Multiple Antenna HCNs: Residual Additive Transceiver Hardware Impairments and Channel Aging

Anastasios Papazafeiropoulos , Tharm Ratnarajah

Comments: accepted in IEEE TVT

Subjects

:

Information Theory (cs.IT)

Given the critical dependence of broadcast channels by the accuracy of

channel state information at the transmitter (CSIT), we develop a general

downlink model with zero-forcing (ZF) precoding, applied in realistic

heterogeneous cellular systems with multiple antenna base stations (BSs).

Specifically, we take into consideration imperfect CSIT due to pilot

contamination, channel aging due to users relative movement, and unavoidable

residual additive transceiver hardware impairments (RATHIs). Assuming that the

BSs are Poisson distributed, the main contributions focus on the derivations of

the upper bound of the coverage probability and the achievable user rate for

this general model. We show that both the coverage probability and the user

rate are dependent on the imperfect CSIT and RATHIs. More concretely, we

quantify the resultant performance loss of the network due to these effects. We

depict that the uplink RATHIs have equal impact, but the downlink transmit BS

distortion has a greater impact than the receive hardware impairment of the

user. Thus, the transmit BS hardware should be of better quality than user’s

receive hardware. Furthermore, we characterise both the coverage probability

and user rate in terms of the time variation of the channel. It is shown that

both of them decrease with increasing user mobility, but after a specific value

of the normalised Doppler shift, they increase again. Actually, the time

variation, following the Jakes autocorrelation function, mirrors this effect on

coverage probability and user rate. Finally, we consider space division

multiple access (SDMA), single user beamforming (SU-BF), and baseline

single-input single-output (SISO) transmission. A comparison among these

schemes reveals that the coverage by means of SU-BF outperforms SDMA in terms

of coverage.

Phaseless Reconstruction from Space-Time Samples

Akram Aldroubi , llya krishtal , Sui Tang

Comments: 23 pages, 4 figures

Subjects

:

Functional Analysis (math.FA)

; Information Theory (cs.IT)

Phaseless reconstruction from space-time samples is a nonlinear problem of

recovering a function (x) in a Hilbert space (mathcal{H}) from the modulus of

linear measurements ({lvert langle x, phi_i

angle

vert), ( ldots),

(lvert langle A^{L_i}x, phi_i

angle

vert : i inmathscr I}), where

({phi_i; i inmathscr I}subset mathcal{H}) is a set of functionals on

(mathcal{H}), and (A) is a bounded operator on (mathcal{H}) that acts as an

evolution operator. In this paper, we provide various sufficient or necessary

conditions for solving this problem, which has connections to (X)-ray

crystallography, the scattering transform, and deep learning.

Nonbacktracking Bounds on the Influence in Independent Cascade Models

Emmanuel Abbe , Sanjeev Kulkarni , Eun Jee Lee Subjects : Social and Information Networks (cs.SI) ; Computational Complexity (cs.CC); Information Theory (cs.IT); Probability (math.PR); Machine Learning (stat.ML)

This paper develops upper and lower bounds on the influence measure in a

network, more precisely, the expected number of nodes that a seed set can

influence in the independent cascade model. In particular, our bounds exploit

nonbacktracking walks, Fortuin-Kasteleyn-Ginibre (FKG) type inequalities, and

are computed by message passing implementation. Nonbacktracking walks have

recently allowed for headways in community detection, and this paper shows that

their use can also impact the influence computation. Further, we provide a knob

to control the trade-off between the efficiency and the accuracy of the bounds.

Finally, the tightness of the bounds is illustrated with simulations on various

network models.

Spatial Coding Techniques for Molecular MIMO

Martin Damrath , H. Birkan Yilmaz , Chan-Byoung Chae , Peter Adam Hoeher Subjects : Emerging Technologies (cs.ET) ; Information Theory (cs.IT)

This paper studies spatial diversity techniques applied to multiple-input

multiple-output (MIMO) diffusion-based molecular communications (DBMC). Two

types of spatial coding techniques, namely Alamouti-type coding and repetition

MIMO coding are suggested and analyzed. In addition, we consider receiver-side

equal-gain combining, which is equivalent to maximum-ratio combining in

symmetrical scenarios. For numerical analysis, the channel impulse responses of

a symmetrical (2 imes 2) MIMO-DBMC system are acquired by a trained

artificial neural network. It is demonstrated that spatial diversity has the

potential to improve the system performance and that repetition MIMO coding

outperforms Alamouti-type coding.

欢迎加入我爱机器学习QQ11群:191401275

arXiv Paper Daily: Mon, 19 Jun 2017

微信扫一扫,关注我爱机器学习公众号

微博:我爱机器学习


以上就是本文的全部内容,希望对大家的学习有所帮助,也希望大家多多支持 码农网

查看所有标签

猜你喜欢:

本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们

Python标准库

Python标准库

Doug Hellmann / 刘炽 / 机械工业出版社华章公司 / 2012-6-15 / 139.00元

本书由资深Python专家亲自执笔,Python语言的核心开发人员作序推荐,权威性毋庸置疑。 对于程序员而言,标准库与语言本身同样重要,它好比一个百宝箱,能为各种常见的任务提供完美的解决方案,所以本书是所有Python程序员都必备的工具书!本书以案例驱动的方式讲解了标准库中一百多个模块的使用方法(如何工作)和工作原理(为什么要这样工作),比标准库的官方文档更容易理解(一个简单的示例比一份手册......一起来看看 《Python标准库》 这本书的介绍吧!

RGB转16进制工具
RGB转16进制工具

RGB HEX 互转工具

MD5 加密
MD5 加密

MD5 加密工具

html转js在线工具
html转js在线工具

html转js在线工具