DeepPrivacy: A Generative Adversarial Network for Face Anonymization We propose a novel architecture which is able to automatically anonymize faces in images while retaining the original data distribution. We ensure total anonymization of all faces in an image by generating images exclusively on privacy-safe information. [arXiv:1909.04538]
Song Hit Prediction: Predicting Billboard Hits Using Spotify Data In this work, we attempt to solve the Hit Song Science problem, which aims to predict which songs will become chart-topping hits. We constructed a dataset with approximately 1.8 million hit and non-hit songs and extracted their audio features using the Spotify Web API. [arXiv:1908.08609]
On Extractive and Abstractive Neural Document Summarization with Transformer Language Models We present a method to produce abstractive summaries of long documents that exceed several thousand words via neural abstractive summarization. [arXiv:1909.03186]
Geometry-Aware Video Object Detection for Static Cameras In this paper we propose a geometry-aware model for video object detection. Specifically, we consider the setting that cameras can be well approximated as static, e.g. in video surveillance scenarios, and scene pseudo depth maps can therefore be inferred easily from the object scale on the image plane. [arXiv:1909.03140]
“Going on a vacation” takes longer than “Going for a walk”: A Study of Temporal Commonsense Understanding Understanding time is crucial for understanding events expressed in natural language. Because people rarely say the obvious, it is often necessary to have commonsense knowledge about various temporal aspects of events, such as duration, frequency, and temporal order. [arXiv:1909.03065]
Clickbait? Sensational Headline Generation with Auto-tuned Reinforcement Learning Sensational headlines are headlines that capture people’s attention and generate reader interest. Conventional abstractive headline generation methods, unlike human writers, do not optimize for maximal reader attention. In this paper, we propose a model that generates sensational headlines without labeled data. [arXiv:1909.03582v1]
An Acceleration Framework for High Resolution Image Synthesis Synthesis of high resolution images using Generative Adversarial Networks (GANs) is challenging: it usually requires many high-end graphics cards with large memory and long training times. In this paper, we propose a two-stage framework to accelerate the training process of synthesizing high resolution images. High resolution images are first transformed to small codes via the trained encoder and decoder networks. [arXiv:1909.03611v1]
From ‘F’ to ‘A’ on the N.Y. Regents Science Exams: An Overview of the Aristo Project This paper reports unprecedented success on the Grade 8 New York Regents Science Exam, where for the first time a system scores more than 90% on the exam’s non-diagram, multiple choice (NDMC) questions. In addition, our Aristo system, building upon the success of recent language models, exceeded 83% on the corresponding Grade 12 Science Exam NDMC questions. [arXiv:1909.01958]
Neural Style-Preserving Visual Dubbing Dubbing is a technique for translating video content from one language to another. However, state-of-the-art visual dubbing techniques directly copy facial expressions from source to target actors without considering identity-specific idiosyncrasies such as a unique type of smile. We present a style-preserving visual dubbing approach from single video inputs, which maintains the signature style of target actors when modifying facial expressions, including mouth motions, to match foreign languages. [arXiv:1909.02518]