
NIPS 2016

Conference At a Glance

MONDAY DECEMBER 5TH
8:30 - 10:30 am      Tutorial 1
10:30 - 11:00 am     Coffee break
11:00 am - 1:00 pm   Tutorial 2
1:00 - 2:30 pm       Lunch on your own
2:30 - 4:30 pm       Tutorial 3
4:30 - 5:00 pm       Coffee break
5:00 - 5:30 pm       Opening Remarks
5:30 - 6:20 pm       Invited talk: Yann LeCun (Facebook & NYU)
6:30 - 9:30 pm       Reception and Posters

TUESDAY DECEMBER 6TH
9:00 - 9:50 am       Drew Purves (DeepMind)
9:50 - 10:10 am      Award talk
10:10 - 10:40 am     Coffee break
10:40 am - 12:20 pm  Parallel tracks: Clustering / Graphical Models
12:20 - 3:00 pm      Lunch on your own and poster viewing
3:00 - 3:50 pm       Saket Navlakha (Salk Institute)
3:50 - 4:20 pm       Coffee break
4:20 - 6:00 pm       Parallel tracks: Deep Learning / Theory
6:00 - 9:30 pm       Poster session & Demonstrations

WEDNESDAY DECEMBER 7TH
9:00 - 9:50 am       Kyle Cranmer (NYU)
9:50 - 10:10 am      Award talk
10:10 - 10:40 am     Coffee break
10:40 am - 12:20 pm  Parallel tracks: Algorithms / Applications
12:20 - 3:00 pm      Lunch on your own and poster viewing
3:00 - 3:50 pm       Marc Raibert (Boston Dynamics)
3:50 - 4:20 pm       Coffee break
4:20 - 6:00 pm       Parallel tracks: Deep Learning / Optimization
6:00 - 9:30 pm       Poster session & Demonstrations

THURSDAY DECEMBER 8TH
9:00 - 9:50 am       Irina Rish (IBM)
9:50 - 10:40 am      Susan Holmes (Stanford)
10:40 - 11:10 am     Coffee break
11:10 am - 12:20 pm  Parallel tracks: Interpretable Models / Neuroscience & Cognitive
12:20 - 2:00 pm      Lunch on your own
2:00 - 4:00 pm       Symposia
4:00 - 4:30 pm       Coffee break
4:30 - 6:30 pm       Symposia
6:30 - 7:30 pm       Light dinner
7:30 - 9:30 pm       Symposia

FRIDAY & SATURDAY DECEMBER 9TH & 10TH
8:00 am - 6:30 pm    Workshop sessions (check workshop schedules for actual start times)
10:30 - 11:00 am     Coffee break
3:00 - 3:30 pm       Coffee break

Contents

Teams & Committees ................ 2
Sponsors .......................... 3
Exhibitors ........................ 7
Letter From The President ......... 8
Upcoming Conferences .............. 8
Area Map & General Info ........... 9
Conference Maps ................... 10
Sponsor Maps ...................... 11
Monday Tutorial Sessions .......... 12
Monday Poster Sessions ............ 16
Tuesday Sessions & Talks .......... 54
Tuesday Poster Sessions ........... 60
Tuesday Demonstrations ............ 101
Wednesday Sessions & Talks ........ 103
Wednesday Poster Sessions ......... 109
Wednesday Demonstrations .......... 151
Thursday Sessions & Talks ......... 153
Symposium ......................... 156
Workshops (Fri & Sat) ............. 157
Reviewers ......................... 159
Author Index ...................... 163

Organizing Committee

General Chairs: Daniel D Lee (U. of Pennsylvania), Masashi Sugiyama (U. of Tokyo)
Program Chairs: Ulrike von Luxburg (U. of Tübingen), Isabelle Guyon (Clopinet)
Tutorials Chairs: Joelle Pineau (McGill U.), Hanna Wallach (Microsoft)
Workshop Chairs: Ralf Herbrich (Amazon)
Demonstration Chair: Raia Hadsell (DeepMind)
Publications Chair and Electronic Proceedings Chair: Roman Garnett (Washington U. St. Louis)
Program Managers: Krikamol Muandet (Mahidol U. and MPI), Behzad Tabibian (MPI)

NIPS Foundation Officers & Board Members

President: Terrence Sejnowski, The Salk Institute
Treasurer: Marian Stewart Bartlett, Apple
Secretary: Michael Mozer, Univ. of Colorado, Boulder
Executive Director: Mary Ellen Perry, The Salk Institute
Legal Advisor: David Kirkpatrick
IT Director: Lee Campbell, The Salk Institute

Executive Board: Zoubin Ghahramani (Univ. of Cambridge), Corinna Cortes (Google Research), Léon Bottou (Microsoft Research), Chris J.C. Burges (Microsoft Research), Neil D. Lawrence (Univ. of Sheffield), Fernando Pereira (Google Research), Max Welling (Univ. of Amsterdam)

Advisory Board: Peter Bartlett (Queensland Univ. and UC Berkeley), Sue Becker (McMaster Univ., Ontario, Canada), Yoshua Bengio (Univ. of Montreal, Canada), Jack Cowan (Univ. of Chicago), Thomas G. Dietterich (Oregon State Univ.), Stephen Hanson (Rutgers Univ.), Michael I. Jordan (Univ. of California, Berkeley), Michael Kearns (Univ. of Pennsylvania), Scott Kirkpatrick (Hebrew Univ., Jerusalem), Daphne Koller (Stanford Univ.), John Lafferty (Univ. of Chicago), Todd K. Leen (Oregon Health & Science Univ.), Richard Lippmann (MIT), Bartlett Mel (Univ. of Southern California), John Moody (UC Berkeley & Portland), John C. Platt (Google), Gerald Tesauro (IBM Watson Labs), Sebastian Thrun (Stanford Univ.), Dave Touretzky (Carnegie Mellon Univ.), Lawrence Saul (Univ. of California, San Diego), Bernhard Schölkopf (MPI, Tübingen/Stuttgart), Dale Schuurmans (Univ. of Alberta, Canada), John Shawe-Taylor (Univ. College London), Sara A. Solla (Northwestern Univ. Med. School), Yair Weiss (Hebrew Univ. of Jerusalem), Chris Williams (Univ. of Edinburgh), Rich Zemel (Univ. of Toronto)

CORE LOGISTICS TEAM
The organization and management of NIPS would not be possible without the help of many volunteers, students, researchers and administrators who donate their valuable time and energy to assist the conference in various ways. However, there is a core team at the Salk Institute whose tireless efforts make the conference run smoothly and efficiently every year. This year, NIPS would particularly like to acknowledge the exceptional work of:
Lee Campbell - IT Director
Mary Ellen Perry - Executive Dir.
Terrance Gaines - Administrator
Susan Perry - Volunteer Mgr.
Jen Perry - Administrator
Ramona Marchand - Administrator

Program Committee
Emmanuel Abbe (Princeton Univ.), Alekh Agarwal (Microsoft), Anima Anandkumar (UC Irvine), Chloé-Agathe Azencott (MINES ParisTech), Shai Ben-David (Univ. Waterloo), Alina Beygelzimer (Yahoo Research), Jeff Bilmes (Univ. of Washington, Seattle), Gilles Blanchard (Univ. of Potsdam), Matthew Blaschko (KU Leuven), Tamara Broderick (MIT), Sebastien Bubeck (Princeton), Alexandra Carpentier (Univ. Potsdam), Miguel Carreira-Perpinan (UC Merced), Kamalika Chaudhuri (UC San Diego), Gal Chechik (Google and Bar-Ilan Univ.), Kyunghyun Cho (New York Univ.), Aaron Courville (Univ. of Montreal), Koby Crammer (Technion), Florence d'Alché-Buc (Telecom ParisTech), Arnak Dalalyan (ENSAE ParisTech), Marc Deisenroth (Imperial College London), Francesco Dinuzzo (Amazon), Finale Doshi-Velez (Harvard), Ran El-Yaniv (Technion), Hugo Jair Escalante (INAOE)


Sergio Escalera (Univ. of Barcelona), Maryam Fazel (Univ. of Washington), Aasa Feragen (Univ. of Copenhagen), Rob Fergus (New York Univ.), Xiaoli Fern (Oregon State Univ.), Francois Fleuret (Idiap Research Institute), Surya Ganguli (Stanford), Peter Gehler (Univ. of Tübingen), Claudio Gentile (DiSTA, Università dell'Insubria), Lise Getoor (UC Santa Cruz), Mark Girolami (Imperial College London), Amir Globerson (Tel Aviv Univ.), Yoav Goldberg (Bar Ilan Univ.), Manuel Gomez (Max Planck Institute), Yves Grandvalet (Univ. of Compiègne & CNRS), Moritz Grosse-Wentrup (MPI), Zaid Harchaoui (Univ. of Washington), Moritz Hardt (Google), Matthias Hein (Saarland Univ.), Philipp Hennig (MPI IS Tübingen), Frank Hutter (Univ. of Freiburg), Prateek Jain (Microsoft Research), Navdeep Jaitly (Google Brain), Stefanie Jegelka (MIT), Samuel Kaski (Aalto Univ.), Koray Kavukcuoglu (DeepMind)

Jens Kober (TU Delft), Samory Kpotufe (Princeton Univ.), Sanjiv Kumar (Google Research), James Kwok (Hong Kong Univ. of Science and Technology), Simon Lacoste-Julien (U. of Montreal), Christoph Lampert (IST Austria), Hugo Larochelle (Twitter), Francois Laviolette (Université Laval), Honglak Lee (Univ. of Michigan), Christoph Lippert (Human Longevity), Po-Ling Loh (UW-Madison), Phil Long (Sentient Technologies), Jakob Macke (Caesar Bonn), Julien Mairal (Inria), Shie Mannor (Technion), Marina Meila (Univ. of Washington), Claire Monteleoni (George Washington Univ.), Remi Munos (DeepMind), Guillaume Obozinski (Ecole des Ponts ParisTech), Cheng Soon Ong (Data61 and ANU), Francesco Orabona (Stony Brook U.), Fernando Perez-Cruz (Universidad Carlos III de Madrid and Bell Labs, Nokia), Jonathan Pillow (Princeton Univ.), Doina Precup (McGill Univ., Montreal), Alain Rakotomamonjy (Univ. of Rouen)

Manuel Rodriguez (Max Planck Inst.), Rómer Rosales (LinkedIn), Lorenzo Rosasco (U. of Genova and MIT), Sivan Sabato (Ben-Gurion Univ.), Mehreen Saeed (FAST National Univ. of Computer and Emerging Sciences), Ruslan Salakhutdinov (CMU), Purnamrita Sarkar (Univ. of Texas at Austin), Fei Sha (USC), Ohad Shamir (Weizmann Inst. of Science), Jonathon Shlens (Google Brain), David Sontag (New York Univ.), Suvrit Sra (MIT), Karthik Sridharan (Cornell Univ.), Bharath Sriperumbudur (Pennsylvania State Univ.), Erik Sudderth (Brown Univ.), Csaba Szepesvari (Univ. of Alberta), Graham Taylor (Univ. of Guelph), Ambuj Tewari (Univ. of Michigan), Ruth Urner (MPI Tübingen), Benjamin Van Roy (Stanford), Jean-Philippe Vert (MINES ParisTech), Bob Williamson (Data61 and ANU), Jennifer Wortman Vaughan (Microsoft Research), Lin Xiao (Microsoft Research), Kun Zhang (CMU)

NIPS gratefully acknowledges the generosity of the individuals and organizations who have provided financial support for the NIPS 2016 conference. Their support enables us to sponsor student travel and participation, to host the conference, and to support the volunteers who assist during NIPS.

Platinum Sponsors

American International Group, Inc. (AIG)'s vision is to become our clients' most valued insurer. For the past 100 years, we have been a leading international insurance organisation serving customers in more than 100 countries and jurisdictions. AIG serves commercial, institutional, and individual customers through one of the most extensive worldwide property-casualty networks of any insurer. At AIG, we believe that harnessing the power of machine learning and deep learning techniques is essential not only to generate new insights from data but also to systematically enhance individual human judgement in real business contexts. If you are also passionate about being a catalyst for evidence-based decision making across the world, let's connect!

Apple revolutionized personal technology with the introduction of the Macintosh in 1984. Today, Apple leads the world in innovation with iPhone, iPad, the Mac & Apple Watch. Apple's three software platforms (iOS, OS X & watchOS) provide seamless experiences across all Apple devices & empower people with breakthrough services including the App Store, Apple Music, Apple Pay & iCloud. Apple's 100,000 employees are dedicated to making the best products on earth, and to leaving the world better than we found it.

The Audi Group, with its brands Audi, Ducati and Lamborghini, is one of the most successful manufacturers of automobiles and motorcycles in the premium segment. It is present in more than 100 markets worldwide and produces at 16 locations in twelve countries. In the second half of 2016, production of the Audi Q5 will start in San José Chiapa (Mexico). 100-percent subsidiaries of AUDI AG include quattro GmbH (Neckarsulm), Automobili Lamborghini S.p.A. (Sant'Agata Bolognese, Italy) and Ducati Motor Holding S.p.A. (Bologna, Italy).

Citadel is a worldwide leader in finance that uses next-generation technology and alpha-driven strategies to transform the global economy. We tackle some of the toughest problems in the industry by pushing ourselves to be the best again and again. It's demanding work for the brightest minds, but we wouldn't have it any other way. Here, great ideas can come from anyone. Everyone. You.

DCVC is a venture capital fund that invests in entrepreneurs applying deep compute, big data and IT infrastructure technologies to transform giant industries. DCVC and its principals have backed brilliant people changing global-scale businesses for 20+ years, helping create billions of dollars of wealth for these entrepreneurs while also making the world a markedly better place.

At DeepMind, our mission is to solve intelligence and then use that to make the world a better place. Our motivation in all we do is to maximise the positive and transformative impact of AI. We believe that AI should ultimately belong to the world, in order to benefit the many and not the few, and we steadfastly research, publish and implement our work to that end.

Didi Chuxing is the world's largest comprehensive one-stop mobile transportation platform. The company offers a full range of mobile technology-based transportation options for close to 300 million users across over 400 Chinese cities, including taxi hailing, private car hailing, Hitch, Chauffeur, DiDi Bus, DiDi Test Drive, and DiDi Enterprise Solutions. In August 2016, Didi Chuxing acquired Uber China. Didi Chuxing is also growing in global markets; in particular, in the United States, Didi provides private car-hailing services in more than 200 cities via our strategic alliance with Lyft.

Facebook's mission is to give people the power to share and make the world more open and connected; this requires constant innovation. At Facebook, we believe the most interesting research questions are derived from real-world problems. Working on cutting-edge research with a practical focus, we push product boundaries while finding new ways to collaborate with the academic community.

Google's mission is to organize the world's information and make it universally accessible and useful. Perhaps as remarkable as two Stanford research students having the ambition to found a company with such a lofty objective is the progress the company has made to that end. Ten years ago, Larry Page and Sergey Brin applied their research to an interesting problem and invented the world's most popular search engine. The same spirit holds true at Google today. The mission of research at Google is to deliver cutting-edge innovation that improves Google products and enriches the lives of all who use them. We publish innovation through industry standards, and our researchers are often helping to define not just today's products but also tomorrow's.

Intel, the world leader in silicon innovation, develops technologies, products and initiatives to continually advance how people work and live. Intel's innovations in cloud computing, data center, IoT, & PC solutions are powering the smart and connected digital world. Learn more about Intel's vision for the future of artificial intelligence at www.intel.com/ai.

At KLA-Tencor, we research, develop, and manufacture the world's most advanced inspection and measurement equipment for the semiconductor industry. We enable the digital age by pushing the boundaries of optics, sensors, image processing, machine learning and computing technologies, creating systems capable of finding nano-scale defects at 50 GB/second data rates. If you are passionate about driving R&D in advanced deep learning, 3D sensor fusion, Bayesian & physics-based machine learning, or advanced neural & HPC architectures, then KLA-Tencor is the place for you.


Platinum Sponsors

At Microsoft, we aim to empower every person and every organization on the planet to achieve more. We care deeply about having a global perspective and making a difference in lives and organizations in all corners of the planet. This involves playing a small part in the most fundamental of human activities: creating tools that enable each of us along our journey to become something more. Our mission is grounded in both the world in which we live and the future we strive to create. Today, we live in a mobile-first, cloud-first world, and we aim to enable our customers to thrive in this world.

Tencent, Inc. is China's largest and most used Internet service portal. Tencent's mission is to enhance the quality of human life through Internet services. Presently, Tencent provides social platforms and digital content services under the "Connection" Strategy. Tencent's leading Internet platforms in China - QQ (QQ Instant Messenger), Weixin/WeChat, QQ.com, QQ Games, Qzone, and Tenpay - have brought together China's largest Internet community, to meet the various needs of Internet users including communication, information, entertainment, financial services and others.

Winton is a British-based global investment management and data technology company. We believe the best approach to investing is the application of the scientific method. Combining statistics and mathematical modelling with cutting-edge technology, we create and evolve intelligent systems to invest in global financial markets on behalf of our clients around the world. Winton was established in 1997 by David Harding, a pioneer in the development of investment systems who has founded two of the world's most successful investment management firms.

Gold Sponsors

Adobe is the global leader in digital marketing and digital media solutions. Our tools and services allow our customers to create groundbreaking digital content, deploy it across media and devices, measure and optimize it over time and achieve greater business success. We help our customers make, manage, measure and monetize their content across every channel and screen.

Alibaba Group's mission is to make it easy to do business anywhere. We operate leading online and mobile marketplaces in retail and wholesale trade, as well as cloud computing and other services. We provide technology and services to enable consumers, merchants, and other participants to conduct commerce in our ecosystem.

Amazon.com strives to be Earth's most customer-centric company where people can find and discover virtually anything they want to buy online. The world's brightest technology minds come to Amazon.com to research and develop technology that improves the lives of shoppers, sellers and developers.

AtlaSense is an A.I. platform for modern legal departments that want to take control of their data. Whether for eDiscovery, Contract Gathering or Records Management, AtlaSense virtually finds information anywhere and automatically classifies your files based on your preferences.

Baidu, Inc. is the leading Chinese-language Internet search provider. As a technology-based media company, Baidu provides the best and most equitable way for people to find what they're looking for. In addition to serving Internet search users, Baidu provides an effective platform for businesses to reach potential customers. Baidu's ADSs trade on the NASDAQ Global Select Market under the symbol "BIDU".

Criteo Research is pioneering innovations in computational advertising. As the center of scientific excellence in the company, Criteo Research delivers both fundamental and applied scientific leadership through published research, product innovations and new technologies powering the company's products.

IBM Research embraces Grand Challenges like Deep Blue and Watson, and is continually extending Watson's cognitive capabilities to enable real-world transformations throughout various businesses. It is home to 3000+ researchers including 5 Nobel Laureates, 9 US National Medals of Technology, 5 US National Medals of Science, 6 Turing Awards, and 13 Inductees in the National Inventors Hall of Fame.

NVIDIA awakened the world to computer graphics when it invented the GPU in 1999. Researchers utilize GPUs to advance the frontiers of science with high performance computing. Industry and academia leverage them for deep learning to make groundbreaking improvements across a variety of applications including image classification, video analytics and speech recognition. www.nvidia.co.uk

Founded in 2007 by leading machine learning scientists, The Voleon Group designs, develops, and implements advanced technology for investment management. We are committed to solving large-scale financial prediction problems with statistical machine learning.

Silver Sponsors

Qihoo 360 Technology Co. Ltd. is a leading Internet platform company in China, as measured by our active user base of 496 million Internet users. Recognizing security as a fundamental need of all Internet and mobile users, Qihoo 360 built a large user base by offering comprehensive, effective, cloud-based and user-friendly Internet and mobile security products. Qihoo 360 is one of the top three Internet companies as measured by user base.


Automotive Safety Technologies GmbH was founded in 2009 in Gaimersheim/Ingolstadt. Our areas of competency cover the full spectrum of development for integrated safety systems: from function and software development, to the required simulation and testing competency, through to tool and process development in the area of integrated safety, all from a single source.

Bloomberg technology helps drive the world's financial markets. We provide communications platforms, data, analytics, trading platforms, news and information for the world's leading financial market participants. We deliver through our unrivaled software, digital platforms, mobile applications and state-of-the-art hardware developed by Bloomberg technologists for Bloomberg customers. Our over 4,800 technologists work to define, architect, build and deploy complete systems and solutions that anticipate and fulfill our clients' needs and market demands.

Best known as the world's number one automotive supplier, Bosch also has a broad product portfolio in industrial technology and consumer goods. In Corporate Research, researchers work on technological breakthroughs such as in software development and autonomous systems. In this way, new ideas are constantly taking shape, improving existing products while opening up entirely new lines of business.

Cubist Systematic Strategies, our systematic investing business, deploys systematic, computer-driven trading strategies across multiple liquid asset classes, including equities, futures, and foreign exchange. The core of our effort is rigorous research into a wide range of market anomalies, fueled by our unparalleled access to a wide range of publicly available data sources. The organization is structured to strongly support investment professionals in their research efforts.

The D. E. Shaw group is a global investment and technology development firm with more than $38 billion in investment capital as of July 1, 2016, and offices in North America, Europe, and Asia. Since our founding in 1988, our firm has earned an international reputation for successful investing based on innovation, careful risk management, and the quality and depth of our staff.

eBay Inc. (NASDAQ: EBAY) is a global commerce leader including the Marketplace, StubHub and Classifieds platforms. We connect millions of buyers and sellers around the world, empowering people and creating opportunity through Connected Commerce. Founded in 1995 in San Jose, CA, eBay is one of the world's largest and most vibrant marketplaces for discovering great value and unique selection. In 2015, eBay enabled $82 billion of gross merchandise volume.

FeatureX is a well-funded machine learning startup next to MIT, and we're hiring. We analyze data to extract time-series features related to economic activity, and then use statistical machine learning to build real-time models at various scales, from global macroeconomic activity down to the performance of a single company. One key dataset is satellite imagery, so we're deep into computer vision. Sound interesting? Visit our booth at NIPS or apply at www.featurex.ai.

G-Research researches investment strategies to predict price returns in financial markets. We develop the research and execution platform to deploy these ideas in markets globally. Our Machine Learning team develops techniques and applies them to seek patterns in large, dirty and noisy data sets. Their codebase is at the forefront of machine learning, pushing its boundaries to new and exciting areas!

Hutchin Hill Capital manages approximately $3.4 billion with a global staff of 170. We are a multi-strategy manager focused on liquid investments in fundamental & quantitative strategies. We seek to generate attractive, risk-adjusted returns with zero beta and low correlation to traditional risk assets through investments in four distinct core strategies: equities, credit, macro and quantitative.

Jump Trading is a leading research-focused trading firm that combines sophisticated quantitative research, best-in-class technology, and an entrepreneurial culture across offices in Chicago, New York, London and Singapore. We foster intellectual curiosity and learning so employees can leverage petaflops of computing power and petabytes of data to identify trends in global markets across asset classes.

At AHL we mix mathematics, computer science, statistics and engineering with terabytes of data to understand and predict financial markets. Led by research and technology, we build models and write programs that invest billions of dollars every day. We are a small, flat-structured company that seeks the best.

Start Your AI Startup In Canada! Announcing NextAI, a program and innovation hub in Toronto, Canada for individuals and teams who will solve major problems using AI. We select the most promising entrepreneurs and innovators and provide them with access to leading AI tools and scientists, founder development, capital and exposure to the corporate network to turn their ideas into reality.

Silver Sponsors

Thirty years ago, Optiver started business as a single trader on the floor of Amsterdam's options exchange. Today we are at the forefront of trading and technology, employing over 950 Optiverians from over 40 nationalities. We stick to what we're good at: making markets in a wide range of financial products. At Optiver we solve puzzles together over breakfast and spend our time using state-of-the-art data science, quantitative models and technological systems to improve our trading. Innovation through quantification is what gets us going.

Qualcomm Technologies, Inc., a wholly-owned subsidiary of Qualcomm Incorporated, operates, along with its subsidiaries, substantially all of Qualcomm's engineering, research and development functions, and substantially all of its products and services businesses, including its semiconductor business, QCT.

Renaissance Technologies is a quantitative hedge fund management company founded by James Simons in 1982 and located in East Setauket, NY. Renaissance has 300 employees, 90 of whom have PhDs in mathematics, physics, statistics, or computer science. The firm's trading is based on models developed through the application of machine learning to massive quantities of financial data.

Rosetta Analytics is a newly formed investment management firm committed to using new data sources and new computational methods to uncover actionable investment signals. Through collaboration and co-investment with our clients, these signals are used to create customized but scalable investment strategies.

As market leader in enterprise application software, SAP (NYSE: SAP) helps companies of all sizes and industries run better. From back office to boardroom, warehouse to storefront, desktop to mobile device, SAP empowers people and organizations to work together more efficiently and use business insight more effectively to stay ahead of the competition.

Sentient has created the largest and most powerful distributed intelligent system in the world. We use this platform to create products which will transform large sectors of the economy by harnessing massive data sets to solve their most complex problems. Our artificial intelligence (AI) can identify and answer critical questions in new and groundbreaking ways, and act autonomously, while empowering people and businesses to make smarter decisions. Learn more at: sentient.ai.

We imagine breakthroughs in investment management, insurance and related fields by pushing the boundaries of what open source and proprietary technology can do. In the process, we work to help real people. Through our investors, we support the retirements of millions around the world and help fund breakthrough research, education and a wide range of charities and foundations.

As the innovation hub of United Technologies Corp., United Technologies Research Center and its Ireland subsidiary, United Technologies Research Centre Ireland, Ltd., work to develop game-changing technologies and capabilities across the company and collaborate with external research organizations, universities and government agencies globally to push the boundaries of science and technology.

Bronze Sponsors

BenevolentAI is a British technology company harnessing the power of AI to enhance and accelerate scientific discovery by turning the world's highly fragmented scientific research data into new insight and usable knowledge that benefits society. Simply put, the company is bringing people and technology together to revolutionise the process of scientific discovery.

北京大数据研究院

Beijing Institute of Big Data Research (BIBDR) is a new institution jointly sponsored by Peking University and the government of Beijing. It is the first big data institution in China that combines education, research, entrepreneurship and government service. Our mission is to develop educational programs for data science in China and to serve as a platform for nurturing new enterprises in big data.

Cheetah Mobile (NYSE: CMCM) is a leading mobile application developer, the second-largest internet and mobile security corporation in China and the #3 global non-gaming developer on Google Play, with over 2.3 billion downloads worldwide and 634 million mobile monthly active users.

Datatang is a global data provider. We are the trusted partner of many of the most influential corporations and institutions in the world. At Datatang, we believe we can connect the dots of data by providing ample opportunities for data exchange with diversified resources and the ability to work toward new solutions together with our clients.

Disney Research's objective is to drive value across The Walt Disney Company by injecting scientific & technological innovation. Our world-class research seeks to invent and transfer the most compelling technologies enabling the company to differentiate its content, services, and products. Disney Research combines the best of academia and industry, by doing both basic and application-driven research.

Invenia Labs develops machine learning techniques to solve some of the world's most complex forecasting and decision problems. Located in Cambridge (UK), Invenia's current mission is to improve the efficiency of electrical grids, helping to reduce pollution and fight the global climate crisis.


Maluuba’s research powers a new era of artificial intelligence. We are driven by the single purpose of building great experiences powered by natural language processing. Our Montreal lab is one of the world’s leading research centres, led by a team of scientists focused on natural language and deep learning. Maluuba’s technology is used in over 50 million devices and experiences around the world.

Bronze Sponsors

Nokia is a global leader in the technologies that connect people and things. Powered by the pioneering work of Nokia Bell Labs, our research and innovation division, and Nokia Technologies, we are at the forefront of creating and licensing the technologies that are increasingly at the heart of our connected lives.

Tackling today's biggest challenges: the mission of Oracle Labs is straightforward. Identify, explore, and transfer new technologies that have the potential to substantially improve Oracle's business. Oracle's commitment to R&D is a driving factor in the development of technologies that have kept Oracle at the forefront of the computer industry.

Palantir Technologies builds software platforms that help human experts perform powerful, collaborative analysis of data at scale. Palantir's software is deployed at public institutions, private enterprises, and in the non-profit sector to address the challenges of responsibly making sense of complex, diverse data.

Panasonic is focused on bringing new solutions to an ever-changing environment full of cutting-edge technologies. We apply deep learning as a tool to improve the real-life situations of today and the evolving situations of tomorrow. Deep learning is just one of the key technologies we employ to understand more about each other and how to navigate through our lives: safely, honestly and happily.

PDT Partners is a top quantitative hedge fund where world-class researchers analyze rich data to develop and deploy model-driven algorithmic trading strategies. We offer a strong track record of hiring, challenging and retaining scientists interested in conducting research where the lab is the financial markets.

QuantumBlack is an advanced analytics firm operating at the intersection of strategy, technology & design to improve performance outcomes for organisations. Starting life in the world of F1, looking at problems of race strategy & engineering productivity, QuantumBlack codified its analytical approach into a platform & process called NERVE, which has now been successfully deployed across a variety of industries.

RBC Research is the research arm of the Royal Bank of Canada. The team's mandate is to advance the state of the art in financial technologies by conducting research in machine learning and computer vision. RBC is Canada's largest financial institution, with over 80,000 employees across Banking, Insurance, Wealth Management, Investor & Treasury Services, as well as Capital Markets.

Recursion Pharma is 25 people generating biological data as fast as the biggest bio research groups in the world. We've won $2M in NIH grants, and this fall closed a $15M Series A led by Lux Capital. We're using cutting-edge microscopes to turn human cellular experiments into 100s of TBs of rich biological data, and ML to seek treatments for 100s of diseases as fast as possible. Join us!

Schibsted Media Group is a global media company driven by new technology, employing 6,900 people in 30 countries to support 200 million users. We are driving new technology across online marketplaces and newspapers with data-driven products that leverage machine learning, are pioneering digital media houses, and have a portfolio of flourishing digital companies.

SigOpt is the optimization platform that amplifies your research, providing an ensemble of the state of the art in Bayesian optimization research via a simple API and web interface. SigOpt takes any research pipeline and tunes it, right in place. Our cloud-based platform is used by globally recognized leaders in the insurance, credit card, algorithmic trading and consumer packaged goods industries.

Telefónica I+D, the research and development company of the Telefónica Group, was founded in 1988; its mission is to contribute to the Group's competitiveness through technological innovation. With this aim, the company applies new ideas, concepts and practices, in addition to developing advanced products and services.

Yandex is one of the largest internet companies in Europe, operating Russia's most popular search engine. We provide user-centric products and services based on the latest innovations in information retrieval, machine learning and machine intelligence to a worldwide customer audience on all digital platforms and devices. Headquartered in Moscow, we have development and sales offices in 17 locations across nine countries.

EXHIBITORS

NIPS would like to especially thank Microsoft Research for their donation of the Conference Management Toolkit (CMT) software and server space. NIPS also appreciates the taping of the tutorials, conference sessions, symposia & select workshops.

From The President

Bienvenido,

The 2016 NIPS Conference and Workshops in Barcelona will host a record number of participants, topping last year's record in Montréal by over 1800!

Year   Location     Registrations
2016   Barcelona    5680
2015   Montreal     3800
2014   Montreal     2478
2013   Lake Tahoe   1902
2012   Lake Tahoe   1610
2011   Granada      1394
2010   Vancouver    1301

We had to cap the registration for the first time because of limited space in the Convention Center. Since 2013 the number of registered participants has been growing exponentially and we have outgrown the venues that were planned several years ago. The likelihood of a cap was announced in September when registration opened, but many who in the past registered late were caught by surprise. We apologize to all who we had to turn away. Register early and register often.

The NIPS Workshops attract as many attendees as the NIPS Conference, and this year there will be 50 sessions covering a broad range of topics. Despite its rapid growth, NIPS has maintained the highest standards, and this year our acceptance rate was 24%. The technical program includes 7 invited talks and 569 accepted papers, selected from a total of 2403 submissions considered by the program committee. Papers presented at the conference will appear in "Advances in Neural Information Processing Systems 29," edited by Ulrike von Luxburg, Isabelle Guyon, Daniel D. Lee, Masashi Sugiyama and Roman Garnett. In 2017, NIPS will be in Long Beach, California: warmer weather! NIPS will return to Montréal, Canada, in 2018.

For the first time the main NIPS Conference will have two tracks. Researchers from many fields come to NIPS, and our goal is to provide a common meeting ground for all. Unlike most large conferences, which are multi-track, NIPS has maintained a single track to keep the meeting from fragmenting. Along with more participants, NIPS has had more submissions, and with two tracks we were able to expand the number of oral talks and still have time for breaks.

The Posner Lecture this year will be given by Yann LeCun and the Breiman Lecture by Susan Holmes. Ed Posner founded NIPS in 1987, and Leo Breiman bridged the statistics and machine learning communities. The Symposium track between the end of the conference and the beginning of the workshops was popular last year in Montreal, and this year three exciting symposia are on our program.

Terry Sejnowski

NIPS Foundation President

Upcoming Conferences

Long Beach, California: December 4 - 9, 2017
Montréal, Canada: December 3 - 8, 2018

AREA MAP

[Map of the area around the venue, showing the Convention Center, the Auditorium, the Hilton Diagonal Del Mar Hotel, the AC Barcelona Hotel, and nearby Metro and Tram stops.]

GENERAL INFORMATION

Registration Desk, Level P0
Sunday, December 4: 12:00 pm - 8:00 pm
Monday - Friday: 7:00 am - 6:00 pm
Saturday, December 10: 7:00 am - 12:00 pm

WIFI
SSID: NIPS
Password: conference

Opening Reception and Poster Session
Monday, December 5, starting at 6:30 pm. Coffee breaks and food service will be available in several locations: on P1 with the exhibitors, and on P2 in the Banquet Hall with more exhibitors. Please see the maps on the next page.

Closing Reception Saturday, December 10 at 7:00 pm

POSTER SESSIONS
Level P0, Areas 5 + 6 + 7 + 8
Monday, Dec. 5: 6:00 pm - 9:30 pm
Tuesday, Dec. 6: 6:00 pm - 9:30 pm
Wednesday, Dec. 7: 6:00 pm - 9:30 pm
• No pins or special tape will be provided
• Take down your poster at 9:30 pm

WORKSHOP LOCATIONS
• CCIB (P1, P2, M1)
• Hilton Diagonal Del Mar (Ballrooms)
• AC Barcelona Hotel

Mobile App
Step 1: Download and install the Whova app from the App Store (for iPhones) or Google Play (for Android phones).
Step 2: Sign up in the app using the email address you registered with, or sign up using your social media accounts. If you are asked to enter an invitation code to join the event, please use: nips
Step 3: You're all set. You will now be able to:
• View the event agenda and plan your schedule
• Send in-app messages and exchange contact information (once you set up your own profile)
• Receive update notifications from the organizers
• Access agenda, maps, and directions

Charging Tables
P1: Rooms 113 - 117
P2: Banquet Hall

Exhibitor Rooms
P1: Rooms 113 - 117
P2: Banquet Hall

CONFERENCE MAP

FLOOR P0: Plenary & parallel tracks (Area 1 + 2, Area 3); posters & workshops (Areas 5 - 8)
FLOOR P1: Workshops (Rooms 118 - 125 and Rooms 127 - 134); coffee breaks + exhibitor area (Rooms 113 - 117 and Rooms 111 + 112)
FLOOR P2: Coffee breaks + platinum sponsor area (Banquet Hall)
FLOOR M1: Tutorials & workshops (Rooms 211 + 212); VIP Room

SPONSOR MAP

FLOOR P2 - Banquet Hall
Booths: AIG, Alibaba, Amazon, Apple, Audi, Cambridge Press, Citadel, DCVC, DeepMind, DiDi, Facebook, Google, IBM, Intel, KLA-Tencor, Microsoft, MIT Press, Springer, Tencent, Voleon, Winton
Rooms 211 & 212: Tutorials Room #3
Charging stations are located in the hall; the foyer, walkway and outdoor terrace adjoin the Banquet Hall.

Coffee breaks, food and beverages will be served on these floors. Charging tables are also available.

FLOOR P1 - Rooms 113 - 117
Booths: AltaSense, Automotive Safety Technologies, Baidu, Beijing Big Data (BIBDR), BenevolentAI, Bloomberg, Bosch, Cheetah Mobile, Criteo, Cubist, Datatang, D. E. Shaw, Disney Research, eBay, FeatureX, G-Research, Hutchin Hill, Invenia Labs, Jump, Maluuba, Man AHL, NextAI, Now Publishers, Nvidia, Optiver, Oracle, Palantir, PDT Partners, Qihoo 360, Qualcomm, QuantumBlack, RBC, Recursion Pharma, Renaissance Tech, Rosetta, SAP, Schibsted, Sentient, SigOpt, Telefónica, Two Sigma, UTRC, X-Prize, Zalando

Monday Tutorials

8:30 am - 10:30 am - Tutorial Sessions
• Variational Inference: Foundations and Modern Methods - David Blei · Shakir Mohamed · Rajesh Ranganath (Area 1 + 2)
• Crowdsourcing: Beyond Label Generation - Jennifer Wortman Vaughan (Area 3)
• Deep Reinforcement Learning Through Policy Optimization - Pieter Abbeel · John Schulman (Rooms 211 + 212)

10:30 am - 11:00 am - Coffee break (Levels P1, P2)

11:00 am - 1:00 pm - Tutorial Sessions
• Nuts and Bolts of Building AI Systems Using Deep Learning - Andrew Y Ng (Area 1 + 2)
• Natural Language Processing for Computational Social Science - Cristian Danescu-Niculescu-Mizil · Lillian J Lee (Area 3)
• Theory and Algorithms for Forecasting Non-Stationary Time Series - Vitaly Kuznetsov · Mehryar Mohri (Rooms 211 + 212)

1:00 pm - 2:30 pm - Lunch Break (On Your Own)

2:30 pm - 4:30 pm - Tutorial Sessions
• Generative Adversarial Networks - Ian Goodfellow (Area 1 + 2)
• ML Foundations and Methods for Precision Medicine and Healthcare - Suchi Saria · Peter Schulam (Area 3)
• Large-Scale Optimization: Beyond Stochastic Gradient Descent and Convexity - Suvrit Sra · Francis Bach (Rooms 211 + 212)

4:30 pm - 5:00 pm - Coffee break (Levels P1, P2)

5:30 pm - 6:20 pm - Invited Talk: Posner Lecture - Predictive Learning - Yann LeCun (Area 1 + 2)

6:30 pm - 9:30 pm - Opening Reception & Posters (P1 & P2)

Monday Tutorial Session - 8:30 - 10:30 AM

Variational Inference: Foundations and Modern Methods
Area 1 + 2
David Blei (Columbia Univ.) · Shakir Mohamed (DeepMind) · Rajesh Ranganath (Princeton Univ.)

One of the core problems of modern statistics and machine learning is to approximate difficult-to-compute probability distributions. This problem is especially important in probabilistic modeling, which frames all inference about unknown quantities as a calculation about a conditional distribution. In this tutorial we review and discuss variational inference (VI), a method that approximates probability distributions through optimization. VI has been used in myriad applications in machine learning and tends to be faster than more traditional methods, such as Markov chain Monte Carlo sampling. Brought into machine learning in the 1990s, recent advances and easier implementation have renewed interest in, and application of, this class of methods. This tutorial aims to provide both an introduction to VI with a modern view of the field, and an overview of the role that probabilistic inference plays in many of the central areas of machine learning.

The tutorial has three parts. First, we provide a broad review of variational inference from several perspectives. This part serves as an introduction (or review) of its central concepts. Second, we develop and connect some of the pivotal tools for VI that have been developed in the last few years: Monte Carlo gradient estimation, black box variational inference, stochastic approximation, and variational auto-encoders. These methods have led to a resurgence of research and applications of VI. Finally, we discuss some of the unsolved problems in VI and point to promising research directions.

Learning objectives:
• Gain a well-grounded understanding of modern advances in variational inference
• Understand how to implement basic versions for a wide class of models
• Understand connections and different names used in other related research areas
• Understand important problems in variational inference research

Target audience:
• Machine learning researchers across all levels of experience, from first-year grad students to more experienced researchers
• Those who want to understand recent advances in variational inference
• A basic understanding of probability is sufficient
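For readers new to the area, the optimization problem the abstract refers to can be stated compactly. This note is an editorial addition, not part of the printed abstract: VI posits a family of distributions q(z; ν) and maximizes the evidence lower bound (ELBO),

\[
\log p(x) \;\ge\; \mathcal{L}(\nu) \;=\; \mathbb{E}_{q(z;\nu)}\big[\log p(x, z) - \log q(z; \nu)\big],
\]

which is equivalent to minimizing \(\mathrm{KL}\big(q(z;\nu)\,\|\,p(z \mid x)\big)\) over the family, so that inference becomes an optimization problem over \(\nu\).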

Crowdsourcing: Beyond Label Generation
Area 3
Jennifer Wortman Vaughan (Microsoft Research)

This tutorial will showcase some of the most innovative uses of crowdsourcing that have emerged in the past few years. While some have clear and immediate benefits to machine learning, we will also discuss examples in which crowdsourcing has allowed researchers to answer exciting questions in psychology, economics, and other fields. We will discuss best practices for crowdsourcing (such as how and why to maintain a positive relationship with crowdworkers) and available crowdsourcing tools. We will survey recent research examining the effect of incentives on crowdworker performance. Time permitting, we will also touch on recent ethnographic research studying the community of crowdworkers and/or delve into the ethical implications of crowdsourcing.

Despite the inclusion of best practices and tools, this tutorial should not be viewed as a prescriptive guide for applying existing techniques. The goals of the tutorial are to inspire you to find novel ways of using crowdsourcing in your own research and to provide you with the resources you need to avoid common pitfalls when you do.

Target audience: This tutorial is open to anyone who wants to learn more about cutting-edge research in crowdsourcing. No assumptions will be made about the audience's familiarity with either crowdsourcing or specific machine learning techniques. Anyone who is curious is welcome to attend! As the tutorial approaches, more information will be available on the tutorial website: http://www.jennwv.com/projects/crowdtutorial.html

Deep Reinforcement Learning Through Policy Optimization
Rooms 211 + 212
Pieter Abbeel (UC Berkeley) · John Schulman (OpenAI)

Deep Reinforcement Learning (Deep RL) has seen several breakthroughs in recent years. In this tutorial we will focus on recent advances in Deep RL through policy gradient methods and actor-critic methods. These methods have shown significant success in a wide range of domains, including continuous-action domains such as manipulation, locomotion, and flight. They have also achieved the state of the art in discrete action domains such as Atari. Fundamentally, there are two types of gradient calculations: likelihood ratio gradients (aka score function gradients) and path derivative gradients (aka perturbation analysis gradients). We will teach policy gradient methods of each type, connect with actor-critic methods (which learn both a value function and a policy), and cover a generalized view of the computation of gradients of expectations through stochastic computation graphs.

Learning objectives: The objective is to provide attendees with a good understanding of the foundations as well as recent advances in policy gradient methods and actor-critic methods. Approaches that will be taught: Likelihood Ratio Policy Gradient (REINFORCE), Natural Policy Gradient, Trust Region Policy Optimization, Generalized Advantage Estimation, Asynchronous Advantage Actor Critic, Path Derivative Policy Gradients, Deterministic Policy Gradient, Stochastic Value Gradients, and Guided Policy Search, as well as a generalized view of the computation of gradients of expectations through Stochastic Computation Graphs.

Target audience: Machine learning researchers. RL background not assumed, but some prior familiarity with the basic concepts could be helpful. Good resource: Sutton and Barto, Chapters 3 & 4 (http://webdocs.cs.ualberta.ca/~sutton/book/the-book.html).
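As a pointer for attendees (an editorial addition, not from the printed abstract): the likelihood ratio gradient mentioned above is, for trajectories \(\tau\) drawn from a policy \(\pi_\theta\),

\[
\nabla_\theta J(\theta) \;=\; \mathbb{E}_{\tau \sim \pi_\theta}\Big[\, R(\tau) \sum_{t} \nabla_\theta \log \pi_\theta(a_t \mid s_t) \Big],
\]

the identity underlying REINFORCE; actor-critic methods reduce the variance of this estimate by replacing \(R(\tau)\) with a learned value-based baseline or advantage.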

Monday Tutorial Session - 11 AM - 1 PM

Nuts and Bolts of Building AI Systems Using Deep Learning
Area 1 & 2
Andrew Y Ng (Stanford University)

How do you get deep learning to work in your business, product, or scientific study? The rise of highly scalable deep learning techniques is changing how you can best approach AI problems. This includes how you define your train/dev/test split, how you organize your data, how you should think through your search among promising model architectures, and even how you might develop new AI-enabled products. In this tutorial, you'll learn about the emerging best practices in this nascent area. You'll come away able to better organize your and your team's work when developing deep learning applications.

Learning objectives:
• Understand best practices for applying deep learning in your organization, whether improving existing applications or creating brand new ones
• Be able to organize and help prioritize your team's work using principles suited to the deep learning era; understand how these practices have changed relative to previous machine learning eras
• Be able to apply error analysis and other debugging techniques suited to deep learning systems
• Gain a systematic process for selecting among architectures and data for your machine learning tasks

Target audience: Attendees should have basic knowledge of machine learning (such as supervised learning). Prior knowledge in deep learning is helpful but not required.
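The train/dev/test split the abstract mentions can be made concrete with a small sketch. This is purely illustrative and not taken from the tutorial; the function name and parameters (train_dev_test_split, dev_frac, test_frac) are my own. The practice it encodes is that the dev and test sets should be drawn from the distribution you actually care about, while extra training data may come from related sources.

import random

def train_dev_test_split(examples, dev_frac=0.05, test_frac=0.05, seed=0):
    """Shuffle target-distribution examples and split them into train/dev/test."""
    rng = random.Random(seed)
    examples = list(examples)          # copy so the caller's data is untouched
    rng.shuffle(examples)
    n = len(examples)
    n_dev = int(n * dev_frac)
    n_test = int(n * test_frac)
    dev = examples[:n_dev]
    test = examples[n_dev:n_dev + n_test]
    train = examples[n_dev + n_test:]
    return train, dev, test

# Example: keep dev/test from the target distribution; any extra
# related-domain data would then be appended to `train` only.
train, dev, test = train_dev_test_split(range(10_000))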

Natural Language Processing for Computational Social Science
Area 3
Cristian Danescu-Niculescu-Mizil (Cornell University) · Lillian J Lee (Cornell University)

More and more of life is now manifested online, and many of the digital traces that are left by human activity are increasingly recorded in natural-language format. This tutorial will examine the opportunities for natural language processing (NLP) to contribute to computational social science, facilitating our understanding of how humans interact with others at both grand and intimate scales.

Learning objectives:
• Influence and persuasion: Can language choices affect whether a political ad is successful, a social-media post gets more re-shares, or a get-out-the-vote campaign will work?
• Language as a reflection of social processes: Can we detect status differences or, more broadly, the roles people take in online communities? How does language define collective identity, or signal imminent departure from a community?
• Group success: Can language cues help us predict whether a group will cohere or fracture? Or whether a betrayal is forthcoming? Or whether a team will succeed at its task?

Target audience: Unrestricted

Theory and Algorithms for Forecasting Non-Stationary Time Series
Rooms 211 + 212
Vitaly Kuznetsov (Google) · Mehryar Mohri (Courant Institute, Google)

Time series appear in a variety of key real-world applications such as signal processing, including audio and video processing; the analysis of natural phenomena such as local weather, global temperature, and earthquakes; the study of economic variables such as stock values, sales amounts, and energy demand; and many other areas. But, while time series forecasting is critical for many applications, it has received little attention in the ML community in recent years, probably due to a lack of familiarity with time series and the fact that standard i.i.d. learning concepts and tools are not readily applicable in that scenario. This tutorial addresses precisely these issues: it provides theoretical and algorithmic tools for research related to time series and for designing new solutions. We first present a concise introduction to time series, including basic concepts, common challenges and standard models. Next, we discuss important statistical learning tools and results developed in recent years and show how they are useful for deriving guarantees and designing algorithms both in stationary and non-stationary scenarios. Finally, we show how the online learning framework can be leveraged to derive algorithms that tackle important and notoriously difficult problems, including model selection and ensemble methods.

Learning objectives:
• Familiarization with basic time series concepts
• Introduction to statistical learning theory and algorithms for stationary and non-stationary time series
• Introduction to model selection and ensemble methods for time series via online learning

Target audience: This tutorial is targeted at a very general ML audience and should be accessible to most machine learning researchers and practitioners. We will introduce all the necessary tools from scratch and will, of course, make slides and other detailed tutorial documents available.
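As one concrete instance of the "standard models" mentioned above (an editorial illustration; the abstract does not single out any particular model): the autoregressive model AR(p) predicts each value from its own recent past,

\[
Y_t \;=\; \sum_{i=1}^{p} a_i\, Y_{t-i} \;+\; \epsilon_t,
\]

where the coefficients \(a_i\) are fixed and \(\epsilon_t\) is zero-mean noise; non-stationarity arises when such relationships drift over time, which is exactly the regime this tutorial targets.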

Monday Tutorial Session - 2:30 - 4:30 PM
Session Chair: Tamara Broderick

Generative Adversarial Networks
Area 1 & 2
Ian Goodfellow (OpenAI)

Generative adversarial networks (GANs) are a recently introduced class of generative models, designed to produce realistic samples. This tutorial is intended to be accessible to an audience who has no experience with GANs, and should prepare the audience to make original research contributions applying GANs or improving the core GAN algorithms. GANs are universal approximators of probability distributions. Such models generally have an intractable log-likelihood gradient, and require approximations such as Markov chain Monte Carlo or variational lower bounds to make learning feasible. GANs avoid using either of these classes of approximations. The learning process consists of a game between two adversaries: a generator network that attempts to produce realistic samples, and a discriminator network that attempts to identify whether samples originated from the training data or from the generative model. At the Nash equilibrium of this game, the generator network reproduces the data distribution exactly, and the discriminator network cannot distinguish samples from the model from training data. Both networks can be trained using stochastic gradient descent, with exact gradients computed by backpropagation.

Topics include:
- An introduction to the basics of GANs
- A review of work applying GANs to large image generation
- Extending the GAN framework to approximate maximum likelihood, rather than minimizing the Jensen-Shannon divergence
- Improved model architectures that yield better learning in GANs
- Semi-supervised learning with GANs
- Research frontiers, including guaranteeing convergence of the GAN game
- Other applications of adversarial learning, such as domain adaptation and privacy

Learning objectives:
• To explain the fundamentals of how GANs work to someone who has not heard of them previously
• To bring the audience up to date on image generation applications of GANs
• To prepare the audience to make original contributions to generative modeling research

Target audience: People who are interested in generative modeling. Both people who do not have prior knowledge of GANs and people who do should find something worthwhile, though the first part of the tutorial will be less interesting to those who already know GANs.
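To fix notation for readers skimming the program (an editorial addition, not part of the printed abstract): the game described above is usually written as the minimax problem

\[
\min_G \max_D \; V(D, G) \;=\; \mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x)\big] \;+\; \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big],
\]

where the generator G maps noise z to samples and the discriminator D outputs the probability that its input came from the training data.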

ML Foundations and Methods for Precision Medicine and Healthcare
Area 3
Suchi Saria (Johns Hopkins University) · Peter Schulam (Johns Hopkins University)

Electronic health records and high-throughput measurement technologies are changing the practice of healthcare to become more algorithmic and data-driven. This offers an exciting opportunity for machine learning to impact healthcare. The aim of this tutorial is to introduce you to the most important challenges and techniques for developing "personalized decision-making" tools in medicine. We will also cover example data sources and describe ongoing national initiatives that provide a way for you to get involved.

Learning objectives:
• Become familiar with important (computational) problems in precision medicine and individualized health care
• Get introduced to state-of-the-art approaches
• Hear about relevant datasets (and potential funding sources)

Target audience: The majority of this tutorial will be targeted at an audience with basic machine learning knowledge. No background in medicine or health care is needed. We will make our slides and any relevant documents accessible after the talk.

Large-Scale Optimization: Beyond Stochastic Gradient Descent and Convexity Rooms 211 + 212

Suvrit Sra (MIT) Francis Bach (INRIA) Stochastic optimization lies at the heart of machine learning, and its cornerstone is stochastic gradient descent (SGD), a staple introduced over 60 years ago! Recent years have, however, brought an exciting new development: variance reduction (VR) for stochastic methods. These VR methods excel in settings where more than one pass through the training data is allowed, achieving convergence faster than SGD, in theory as well as practice. These speedups underline the huge surge of interest in VR methods; by now a large body of work has emerged, while new results appear regularly! This tutorial brings to the wider machine learning audience the key principles behind VR methods, by positioning them vis-à-vis SGD. Moreover, the tutorial takes a step beyond convexity and covers research-edge

Target audience: The majority of this tutorial will be targeted at an audience with basic machine learning knowledge. No background in medicine or health care is needed. We will make our slides and any relevant documents accessible after the talk.

results for non-convex problems too, while outlining key points and as yet open challenges. Learning objectives: Introduce fast stochastic methods to the wider ML audience to go beyond a 60-year-old algorithm (SGD) – Provide a guiding light through this fast moving area, to unify, and simplify its presentation, outline common pitfalls, and to demystify its capabilities – Raise awareness about open challenges in the area, and thereby spur future research. Target audience: • Graduate students (masters as well as PhD stream) • ML researchers in academia and industry who are not experts in stochastic optimization • Practitioners who want to widen their repertoire of tools
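As a concrete illustration of the variance-reduction principle the tutorial covers, the sketch below implements SVRG, one representative VR method, on a synthetic least-squares problem. This is not the presenters' code; the problem, step size, and epoch count are arbitrary assumptions.

```python
# SVRG sketch for min_w (1/2n) * ||Aw - b||^2, written as an average of n losses.
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 10
A = rng.normal(size=(n, d))
b = A @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

def grad_i(w, i):
    # Gradient of the i-th component loss (a_i^T w - b_i)^2 / 2.
    return (A[i] @ w - b[i]) * A[i]

def full_grad(w):
    return A.T @ (A @ w - b) / n

w = np.zeros(d)
eta = 0.01
for epoch in range(30):
    w_snap = w.copy()
    mu = full_grad(w_snap)          # one full gradient at the snapshot point
    for _ in range(n):              # inner loop of variance-reduced steps
        i = rng.integers(n)
        # Control variate: unbiased for the full gradient, with variance
        # shrinking as w approaches the snapshot.
        g = grad_i(w, i) - grad_i(w_snap, i) + mu
        w -= eta * g
print("residual norm:", np.linalg.norm(A @ w - b))
```

The correction term grad_i(w, i) - grad_i(w_snap, i) + mu keeps each step unbiased while driving its variance to zero near the snapshot, which is what allows a constant step size and the faster convergence that plain SGD cannot match.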


Posner Lecture - Monday @ 5:30pm
Predictive Learning
Area 1 & 2

Yann LeCun (Facebook, New York University) Deep learning has been at the root of significant progress in many application areas, such as computer perception and natural language processing. But almost all of these systems currently use supervised learning with human-curated labels. The challenge of the next several years is to let machines learn from raw, unlabeled data, such as images, videos and text. Intelligent systems today do not possess “common sense”, which humans and animals acquire by observing the world, acting in it, and understanding its physical constraints. I will argue that allowing machines to learn predictive models of the world is key to significant progress in artificial intelligence, and a necessary component of model-based planning and reinforcement learning. The main technical difficulty is that the world is only partially predictable. A general formulation of unsupervised learning that deals with partial predictability will be presented. The formulation connects many well-known approaches to unsupervised learning, as well as new and exciting ones such as adversarial training.

Yann LeCun is Director of AI Research at Facebook, and Silver Professor of Data Science, Computer Science, Neural Science, and Electrical Engineering at New York University. He received the Electrical Engineer Diploma from ESIEE, Paris in 1983, and a PhD in Computer Science from Université Pierre et Marie Curie (Paris) in 1987. After a postdoc at the University of Toronto, he joined AT&T Bell Laboratories in Holmdel, NJ in 1988. He became head of the Image Processing Research Department at AT&T Labs-Research in 1996, and joined NYU as a professor in 2003, after a brief period as a Fellow of the NEC Research Institute in Princeton. From 2012 to 2014 he directed NYU’s initiative in data science and became the founding director of the NYU Center for Data Science. He was named Director of AI Research at Facebook in late 2013 and retains a part-time position on the NYU faculty. His current interests include AI, machine learning, computer perception, mobile robotics, and computational neuroscience. He has published over 180 technical papers and book chapters on these topics as well as on neural networks, handwriting recognition, image processing and compression, and on dedicated circuits for computer perception.

Monday Poster Session

#1 Improved Dropout for Shallow and Deep Learning - Zhe Li, Boqing Gong, Tianbao Yang
#2 Communication-Optimal Distributed Clustering - Jiecao Chen, He Sun, David Woodruff, Qin Zhang
#3 On Robustness of Kernel Clustering - Bowei Yan, Purnamrita Sarkar
#4 Combinatorial semi-bandit with known covariance - Rémy Degenne, Vianney Perchet
#5 A posteriori error bounds for joint matrix decomposition problems - Nicolo Colombo, Nikos Vlassis
#6 Object based Scene Representations using Fisher Scores of Local Subspace Projections - Mandar D Dixit, Nuno Vasconcelos
#7 MoCap-guided Data Augmentation for 3D Pose Estimation in the Wild - Gregory Rogez, Cordelia Schmid
#8 Regret of Queueing Bandits - Subhashini Krishnasamy, Rajat Sen, Ramesh Johari, Sanjay Shakkottai
#9 Efficient Nonparametric Smoothness Estimation - Shashank Singh, Simon S Du, Barnabas Poczos
#10 Completely random measures for modelling block-structured sparse networks - Tue Herlau, Mikkel N Schmidt, Morten Mørup
#11 DISCO Nets: DISsimilarity COefficients Networks - Diane Bouchacourt, Pawan K Mudigonda, Sebastian Nowozin
#12 An Architecture for Deep, Hierarchical Generative Models - Philip Bachman
#13 A Multi-Batch L-BFGS Method for Machine Learning - Albert S Berahas, Jorge Nocedal, Martin Takac

#14 Higher-Order Factorization Machines - Mathieu Blondel, Akinori Fujino, Naonori Ueda, Masakazu Ishihata
#15 A Bio-inspired Redundant Sensing Architecture - Anh Tuan Nguyen, Jian Xu, Zhi Yang
#16 Learning Supervised PageRank with Gradient-Based and Gradient-Free Optimization Methods - Lev Bogolubsky, Pavel Dvurechenskii, Alexander Gasnikov, Gleb Gusev, Yurii Nesterov, Andrei M Raigorodskii, Aleksey Tikhonov, Maksim Zhukovskii
#17 Linear Relaxations for Finding Diverse Elements in Metric Spaces - Aditya Bhaskara, Mehrdad Ghadiri, Vahab Mirrokni, Ola Svensson
#18 Stochastic Optimization for Large-scale Optimal Transport - Aude Genevay, Marco Cuturi, Gabriel Peyré, Francis Bach
#19 Threshold Bandits, With and Without Censored Feedback - Jacob D Abernethy, Kareem Amin, Ruihao Zhu
#20 Mistake Bounds for Binary Matrix Completion - Mark Herbster, Stephen Pasteris, Massimiliano Pontil
#21 Learning Sound Representations from Unlabeled Video - Yusuf Aytar, Carl Vondrick, Antonio Torralba
#22 Doubly Convolutional Neural Networks - Shuangfei Zhai, Yu Cheng, Zhongfei (Mark) Zhang
#23 Maximizing Influence in an Ising Network: A Mean-Field Optimal Solution - Christopher Lynn, Daniel D Lee
#24 Learning from Rational Behavior: Predicting Solutions to Unknown Linear Programs - Shahin Jabbari, Ryan M Rogers, Aaron Roth, Zhiwei Steven Wu
#25 Fairness in Learning: Classic and Contextual Bandits - Matthew Joseph, Michael Kearns, Jamie H Morgenstern, Aaron Roth

#26 A Powerful Generative Model Using Random Weights for the Deep Image Representation - Kun He, Yan Wang, John Hopcroft
#27 Improved Error Bounds for Tree Representations of Metric Spaces - Samir Chowdhury, Facundo Mémoli, Zane T Smith
#28 Adaptive optimal training of animal behavior - Ji Hyun Bak, Jung Choi, Ilana Witten, Jonathan W Pillow
#29 PAC-Bayesian Theory Meets Bayesian Inference - Pascal Germain, Francis Bach, Alexandre Lacoste, Simon Lacoste-Julien
#30 Nearly Isometric Embedding by Relaxation - James McQueen, Marina Meila, Dominique Joncas
#31 Graph Clustering: Block-models and model free results - Yali Wan, Marina Meila
#32 Learning Transferrable Representations for Unsupervised Domain Adaptation - Ozan Sener, Hyun Oh Song, Ashutosh Saxena, Silvio Savarese
#33 Measuring Neural Net Robustness with Constraints - Osbert Bastani, Yani Ioannou, Leonidas Lampropoulos, Dimitrios Vytiniotis, Aditya Nori, Antonio Criminisi
#34 Forward models at Purkinje synapses facilitate cerebellar anticipatory control - Ivan Herreros, Xerxes Arsiwalla, Paul Verschure
#35 Estimating Nonlinear Neural Response Functions using GP Priors and Kronecker Methods - Cristina Savin, Gasper Tkacik
#36 A Bayesian method for reducing bias in neural representational similarity analysis - Mingbo Cai, Nicolas W Schuck, Jonathan W Pillow, Yael Niv
#37 Learning to Communicate with Deep Multi-Agent Reinforcement Learning - Jakob Foerster, Yannis M. Assael, Nando de Freitas, Shimon Whiteson
#38 Total Variation Classes Beyond 1d: Minimax Rates, and the Limitations of Linear Smoothers - Veeru Sadhanala, Yu-Xiang Wang, Ryan J Tibshirani
#39 Exponential Family Embeddings - Maja Rudolph, Francisco J. R. Ruiz, Stephan Mandt, David Blei
#40 k^*-Nearest Neighbors: From Global to Local - Oren Anava, Kfir Levy
#41 Reward Augmented Maximum Likelihood for Neural Structured Prediction - Mohammad Norouzi, Samy Bengio, Zhifeng Chen, Navdeep Jaitly, Mike Schuster, Yonghui Wu, Dale Schuurmans
#42 A Probabilistic Model of Social Decision Making based on Reward Maximization - Koosha Khalvati, Seongmin A. Park, Jean-Claude Dreher, Rajesh P Rao
#43 Active Learning with Oracle Epiphany - Tzu-Kuo Huang, Lihong Li, Ara Vartanian, Saleema Amershi, Jerry Zhu
#44 On Regularizing Rademacher Observation Losses - Richard Nock
#45 A Non-generative Framework and Convex Relaxations for Unsupervised Learning - Elad Hazan, Tengyu Ma
#46 Learning Tree Structured Potential Games - Vikas Garg, Tommi Jaakkola
#47 Equality of Opportunity in Supervised Learning - Moritz Hardt, Eric Price, Nati Srebro
#48 Interaction Networks for Learning about Objects, Relations and Physics - Peter Battaglia, Razvan Pascanu, Matthew Lai, Danilo Jimenez Rezende, Koray Kavukcuoglu
#49 beta-risk: a New Surrogate Risk for Learning from Weakly Labeled Data - Valentina Zantedeschi, Rémi Emonet, Marc Sebban
#50 Binarized Neural Networks - Itay Hubara, Matthieu Courbariaux, Daniel Soudry, Ran El-Yaniv, Yoshua Bengio
#51 Regularization With Stochastic Transformations and Perturbations for Deep Semi-Supervised Learning - Mehdi Sajjadi, Mehran Javanmardi, Tolga Tasdizen
#52 Generating Images with Perceptual Similarity Metrics based on Deep Networks - Alexey Dosovitskiy, Thomas Brox
#53 Exploiting Tradeoffs for Exact Recovery in Heterogeneous Stochastic Block Models - Amin Jalali, Qiyang Han, Ioana Dumitriu, Maryam Fazel
#54 Tensor Switching Networks - Kenyon Tsai, Andrew M Saxe, David Cox
#55 Finite-Dimensional BFRY Priors and Variational Bayesian Inference for Power Law Models - Juho Lee, Lancelot F James, Seungjin Choi
#56 Temporal Regularized Matrix Factorization for High-dimensional Time Series Prediction - Hsiang-Fu Yu, Nikhil Rao, Inderjit S Dhillon
#57 Composing graphical models with neural networks for structured representations and fast inference - Matthew Johnson, David Duvenaud, Alex Wiltschko, Ryan P Adams, Sandeep R Datta
#58 Contextual-MDPs for PAC Reinforcement Learning with Rich Observations - Akshay Krishnamurthy, Alekh Agarwal, John Langford
#59 Algorithms and matching lower bounds for approximately convex optimization - Andrej Risteski, Yuanzhi Li
#60 Fast Stochastic Methods for Nonsmooth Nonconvex Optimization - Sashank J. Reddi, Suvrit Sra, Barnabas Poczos, Alex J Smola
#61 A Simple Practical Accelerated Method for Finite Sums - Aaron Defazio
#62 Unsupervised Learning for Physical Interaction through Video Prediction - Chelsea Finn, Ian Goodfellow, Sergey Levine
#63 Threshold Learning for Optimal Decision Making - Nathan F Lepora


#64 Collaborative Recurrent Autoencoder: Recommend while Learning to Fill in the Blanks - Hao Wang, Xingjian Shi, Dit-Yan Yeung
#65 Finding significant combinations of features in the presence of categorical covariates - Laetitia Papaxanthos, Felipe Llinares-Lopez, Dean Bodenham, Karsten Borgwardt
#66 Synthesizing the preferred inputs for neurons in neural networks via deep generator networks - Anh Nguyen, Alexey Dosovitskiy, Jason Yosinski, Thomas Brox, Jeff Clune
#67 Learning Infinite RBMs with Frank-Wolfe - Wei Ping, Qiang Liu, Alexander Ihler
#68 Sorting out typicality with the inverse moment matrix SOS polynomial - Edouard Pauwels, Jean B Lasserre
#69 Improving PAC Exploration Using the Median of Means - Jason Pazis, Ron E Parr, Jonathan P How
#70 Reconstructing Parameters of Spreading Models from Partial Observations - Andrey Lokhov
#71 Dynamic Filter Networks - Xu Jia, Bert De Brabandere, Tinne Tuytelaars, Luc V Gool
#72 Long-Term Trajectory Planning Using Hierarchical Memory Networks - Stephan Zheng, Yisong Yue, Patrick Lucey
#73 Cooperative Inverse Reinforcement Learning - Dylan Hadfield-Menell, Stuart J Russell, Pieter Abbeel, Anca Dragan
#74 Encode, Review, and Decode: Reviewer Module for Caption Generation - Zhilin Yang, Ye Yuan, Yuexin Wu, William W Cohen, Russ Salakhutdinov
#75 Gradient-based Sampling: An Adaptive Importance Sampling for Least-squares - Rong Zhu
#76 Robust k-means: a Theoretical Revisit - Alex Georgogiannis
#77 Boosting with Abstention - Corinna Cortes, Giulia DeSalvo, Mehryar Mohri
#78 Estimating the class prior and posterior from noisy positives and unlabeled data - Shantanu J Jain, Martha White, Pedja Radivojac
#79 Bootstrap Model Aggregation for Distributed Statistical Learning - Jun Han, Qiang Liu
#80 Noise-Tolerant Life-Long Matrix Completion via Adaptive Sampling - Maria-Florina Balcan, Hongyang Zhang
#81 FPNN: Field Probing Neural Networks for 3D Data - Yangyan Li, Sören Pirk, Hao Su, Charles R Qi, Leonidas J Guibas
#82 Causal meets Submodular: Subset Selection with Directed Information - Yuxun Zhou, Costas Spanos

#83 Improving Variational Autoencoders with Inverse Autoregressive Flow - Diederik P Kingma, Tim Salimans, Rafal Jozefowicz, Xi Chen, Ilya Sutskever, Max Welling
#84 Adaptive Smoothed Online Multi-Task Learning - Keerthiram Murugesan, Hanxiao Liu, Jaime Carbonell, Yiming Yang
#85 The Limits of Learning with Missing Data - Brian Bullins, Elad Hazan, Tomer Koren
#86 Safe Exploration in Finite Markov Decision Processes with Gaussian Processes - Matteo Turchetta, Felix Berkenkamp, Andreas Krause
#87 Sparse Support Recovery with Non-smooth Loss Functions - Kévin Degraux, Gabriel Peyré, Jalal Fadili, Laurent Jacques
#88 Crowdsourced Clustering: Querying Edges vs Triangles - Ramya Korlakai Vinayak, Babak Hassibi
#89 Dual Decomposed Learning with Factorwise Oracle for Structural SVM of Large Output Domain - Ian Yen, Xiangru Huang, Kai Zhong, Ruohan Zhang, Pradeep K Ravikumar, Inderjit S Dhillon
#90 Sampling for Bayesian Program Learning - Kevin Ellis, Armando Solar-Lezama, Josh Tenenbaum
#91 Multiple-Play Bandits in the Position-Based Model - Paul Lagrée, Claire Vernade, Olivier Cappe
#92 Image Restoration Using Very Deep Convolutional Encoder-Decoder Networks with Symmetric Skip Connections - Xiaojiao Mao, Chunhua Shen, Yu-Bin Yang
#93 Optimistic Bandit Convex Optimization - Scott Yang, Mehryar Mohri
#94 Computing and maximizing influence in linear threshold and triggering models - Justin T Khim, Varun Jog, Po-Ling Loh
#95 Clustering with Bregman Divergences: an Asymptotic Analysis - Chaoyue Liu, Mikhail Belkin
#96 Community Detection on Evolving Graphs - Stefano Leonardi, Aris Anagnostopoulos, Jakub Łącki, Silvio Lattanzi, Mohammad Mahdian
#97 Dueling Bandits: Beyond Condorcet Winners to General Tournament Solutions - Siddartha Y. Ramamohan, Arun Rajkumar, Shivani Agarwal
#98 Learning a Metric Embedding for Face Recognition using the Multibatch Method - Oren Tadmor, Tal Rosenwein, Shai Shalev-Shwartz, Yonatan Wexler, Amnon Shashua
#99 Convergence guarantees for kernel-based quadrature rules in misspecified settings - Motonobu Kanagawa, Bharath K. Sriperumbudur, Kenji Fukumizu
#100 Stochastic Variational Deep Kernel Learning - Andrew G Wilson, Zhiting Hu, Russ Salakhutdinov, Eric P Xing
#101 Deep Submodular Functions - Brian W Dolhansky, Jeff A Bilmes

#102 Scaled Least Squares Estimator for GLMs in Large-Scale Problems - Murat A Erdogdu, Lee H Dicker, Mohsen Bayati
#103 Matrix Completion and Clustering in Self-Expressive Models - Ehsan Elhamifar
#104 Stochastic Three-Composite Convex Minimization - Alp Yurtsever, Bang Cong Vu, Volkan Cevher
#105 Tree-Structured Reinforcement Learning for Sequential Object Localization - Zequn Jie, Xiaodan Liang, Jiashi Feng, Xiaojie Jin, Wen Lu, Shuicheng Yan
#106 The non-convex Burer-Monteiro approach works on smooth semidefinite programs - Nicolas Boumal, Vlad Voroninski, Afonso Bandeira
#107 Neurons Equipped with Intrinsic Plasticity Learn Stimulus Intensity Statistics - Travis Monk, Cristina Savin, Jörg Lücke
#108 Greedy Feature Construction - Dino Oglic, Thomas Gärtner
#109 Dynamic Mode Decomposition with Reproducing Kernels for Koopman Spectral Analysis - Yoshinobu Kawahara
#110 Learning the Number of Neurons in Deep Networks - Jose M Alvarez, Mathieu Salzmann
#111 Strategic Attentive Writer for Learning Macro-Actions - Alexander Vezhnevets, Volodymyr Mnih, Simon Osindero, Alex Graves, Oriol Vinyals, John Agapiou, Koray Kavukcuoglu
#112 Active Learning from Imperfect Labelers - Songbai Yan, Kamalika Chaudhuri, Tara Javidi
#113 Probabilistic Linear Multistep Methods - Onur Teymur, Kostas Zygalakis, Ben Calderhead
#114 More Supervision, Less Computation: Statistical-Computational Tradeoffs in Weakly Supervised Learning - Xinyang Yi, Zhaoran Wang, Zhuoran Yang, Constantine Caramanis, Han Liu
#115 Mutual information for symmetric rank-one matrix estimation: A proof of the replica formula - Jean Barbier, Mohamad Dia, Nicolas Macris, Florent Krzakala, Thibault Lesieur, Lenka Zdeborová
#116 Coin Betting and Parameter-Free Online Learning - Francesco Orabona, David Pal
#117 Normalized Spectral Map Synchronization - Yanyao Shen, Qixing Huang, Nati Srebro, Sujay Sanghavi
#118 On Explore-Then-Commit strategies - Aurelien Garivier, Tor Lattimore, Emilie Kaufmann
#119 Learning Kernels with Random Features - Aman Sinha, John C Duchi
#120 Robustness of classifiers: from adversarial to random noise - Alhussein Fawzi, Seyed-Mohsen Moosavi-Dezfooli, Pascal Frossard
#121 Adaptive Skills Adaptive Partitions (ASAP) - Daniel J Mankowitz, Timothy A Mann, Shie Mannor
#122 Gaussian Process Bandit Optimisation with Multi-fidelity Evaluations - Kirthevasan Kandasamy, Gautam Dasarathy, Junier B Oliva, Jeff Schneider, Barnabas Poczos
#123 Flexible Models for Microclustering with Applications to Entity Resolution - Brenda Betancourt, Giacomo Zanella, Jeff Miller, Hanna Wallach, Abbas Zaidi, Rebecca Steorts
#124 Stochastic Gradient Richardson-Romberg Markov Chain Monte Carlo - Alain Durmus, Umut Simsekli, Eric Moulines, Roland Badeau, Gaël Richard
#125 Online and Differentially-Private Tensor Decomposition - Yining Wang, Anima Anandkumar
#126 Maximal Sparsity with Deep Networks? - Bo Xin, Yizhou Wang, Wen Gao, David Wipf
#127 Efficient High-Order Interaction-Aware Feature Selection Based on Conditional Mutual Information - Alexander Shishkin, Anastasia Bezzubtseva, Alexey Drutsa, Ilia Shishkov, Kirill Gladkikh, Gleb Gusev, Pavel Serdyukov
#128 Geometric Dirichlet Means Algorithm for Topic Inference - Mikhail Yurochkin, Long Nguyen
#129 Interaction Screening: Efficient and Sample-Optimal Learning of Ising Models - Marc Vuffray, Sidhant Misra, Andrey Lokhov, Michael Chertkov
#130 Multi-armed Bandits: Competing with Optimal Sequences - Zohar Karnin, Oren Anava
#131 Catching heuristics are optimal control policies - Boris Belousov, Gerhard Neumann, Constantin A Rothkopf, Jan R Peters
#132 Fast stochastic optimization on Riemannian manifolds - Hongyi Zhang, Sashank J. Reddi, Suvrit Sra
#133 A Comprehensive Linear Speedup Analysis for Asynchronous Stochastic Parallel Optimization from Zeroth-Order to First-Order - Xiangru Lian, Huan Zhang, Cho-Jui Hsieh, Yijun Huang, Ji Liu
#134 Stochastic Gradient MCMC with Stale Gradients - Changyou Chen, Nan Ding, Chunyuan Li, Yizhe Zhang, Lawrence Carin
#135 Disentangling factors of variation in deep representation using adversarial training - Michael F Mathieu, Zhizhen Zhao, Aditya Ramesh, Pablo Sprechmann, Yann LeCun
#136 Consistent Kernel Mean Estimation for Functions of Random Variables - Adam Scibior, Carl-Johann Simon-Gabriel, Ilya Tolstikhin, Bernhard Schölkopf
#137 DECOrrelated feature space partitioning for distributed sparse regression - Xiangyu Wang, David B Dunson, Chenlei Leng
#138 Coupled Generative Adversarial Networks - Ming-Yu Liu, Oncel Tuzel

#139 Matching Networks for One Shot Learning - Oriol Vinyals, Charles Blundell, Timothy Lillicrap, Koray Kavukcuoglu, Daan Wierstra
#140 Distributed Flexible Nonlinear Tensor Factorization - Shandian Zhe, Kai Zhang, Pengyuan Wang, Kuang-chih Lee, Zenglin Xu, Alan Qi, Zoubin Ghahramani
#141 Tracking the Best Expert in Non-stationary Stochastic Environments - Chen-Yu Wei, Yi-Te Hong, Chi-Jen Lu
#142 Deep Alternative Neural Networks: Exploring Contexts as Early as Possible for Action Recognition - Jinzhuo Wang, Wenmin Wang, Xiongtao Chen, Ronggang Wang, Wen Gao
#143 Learning Parametric Sparse Models for Image Super-Resolution - Yongbo Li, Weisheng Dong, Xuemei Xie, Guangming Shi, Xin Li, Donglai Xu
#144 Kernel Observers: Systems-Theoretic Modeling and Inference of Spatiotemporally Evolving Processes - Hassan A Kingravi, Harshal R Maske, Girish Chowdhary
#145 Learning brain regions via large-scale online structured sparse dictionary learning - Elvis Dohmatob, Arthur Mensch, Gael Varoquaux, Bertrand Thirion
#146 Scaling Factorial Hidden Markov Models: Stochastic Variational Inference without Messages - Yin Cheng Ng, Pawel M Chilinski, Ricardo Silva
#147 A Bandit Framework for Strategic Regression - Yang Liu, Yiling Chen
#148 Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering - Michaël Defferrard, Xavier Bresson, Pierre Vandergheynst
#149 Variational Gradient Descent: A General Purpose Bayesian Inference Algorithm - Qiang Liu, Dilin Wang
#150 Deep Learning Models of the Retinal Response to Natural Scenes - Lane McIntosh, Niru Maheswaranathan, Aran Nayebi, Surya Ganguli, Stephen Baccus
#151 Safe and Efficient Off-Policy Reinforcement Learning - Remi Munos, Tom Stepleton, Anna Harutyunyan, Marc Bellemare
#152 Yggdrasil: An Optimized System for Training Deep Decision Trees at Scale - Firas Abuzaid, Joseph K Bradley, Feynman T Liang, Andrew Feng, Lee Yang, Matei Zaharia, Ameet S Talwalkar
#153 Sample Complexity of Automated Mechanism Design - Maria-Florina Balcan, Tuomas Sandholm, Ellen Vitercik
#154 Deep Exploration via Bootstrapped DQN - Ian Osband, Charles Blundell, Alexander Pritzel, Benjamin Van Roy
#155 Search Improves Label for Active Learning - Alina Beygelzimer, Daniel Hsu, John Langford, Chicheng Zhang
#156 Efficient and Robust Spiking Neural Circuit for Navigation Inspired by Echolocating Bats - Bipin Rajendran, Pulkit Tandon, Yash H Malviya
#157 Theoretical Comparisons of Positive-Unlabeled Learning against Positive-Negative Learning - Gang Niu, Marthinus Christoffel du Plessis, Tomoya Sakai, Yao Ma, Masashi Sugiyama
#158 Quantized Random Projections and Non-Linear Estimation of Cosine Similarity - Ping Li, Michael Mitzenmacher, Martin Slawski
#159 CNNpack: Packing Convolutional Neural Networks in the Frequency Domain - Yunhe Wang, Chang Xu, Shan You, Dacheng Tao, Chao Xu
#160 Verification Based Solution for Structured MAB Problems - Zohar Karnin
#161 Neurally-Guided Procedural Models: Amortized Inference for Procedural Graphics Programs using Neural Networks - Daniel Ritchie, Anna Thomas, Pat Hanrahan, Noah Goodman
#162 Edge-Exchangeable Graphs and Sparsity - Diana Cai, Trevor Campbell, Tamara Broderick
#163 Learning and Forecasting Opinion Dynamics in Social Networks - Abir De, Isabel Valera, Niloy Ganguly, Sourangshu Bhattacharya, Manuel Gomez Rodriguez
#164 Probing the Compositionality of Intuitive Functions - Eric Schulz, Josh Tenenbaum, David Duvenaud, Maarten Speekenbrink, Samuel J Gershman
#165 Learning shape correspondence with anisotropic convolutional neural networks - Davide Boscaini, Jonathan Masci, Emanuele Rodolà, Michael Bronstein
#166 Improved Techniques for Training GANs - Tim Salimans, Ian Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, Xi Chen
#167 Automated scalable segmentation of neurons from multispectral images - Uygar Sümbül, Douglas Roossien, Dawen Cai, John Cunningham, Liam Paninski
#168 Optimal Cluster Recovery in the Labeled Stochastic Block Model - Se-Young Yun, Alexandre Proutiere
#169 Phased Exploration with Greedy Exploitation in Stochastic Combinatorial Partial Monitoring Games - Sougata Chaudhuri, Ambuj Tewari
#170 Dual Space Gradient Descent for Online Learning - Trung Le, Tu Nguyen, Vu Nguyen, Dinh Phung
#171 Data Programming: Creating Large Training Sets, Quickly - Alexander J Ratner, Christopher M De Sa, Sen Wu, Daniel Selsam, Christopher Ré
#172 Near-Optimal Smoothing of Structured Conditional Probability Matrices - Moein Falahatgar, Mesrob I Ohannessian, Alon Orlitsky
#173 An urn model for majority voting in classification ensembles - Victor Soto, Alberto Suárez, Gonzalo Martinez-Muñoz

Monday Poster Session

#1



Improved Dropout for Shallow and Deep Learning
Zhe Li (Univ. of Iowa), Boqing Gong (Univ. of Central Florida), Tianbao Yang (Univ. of Iowa)

Dropout has achieved great success in training deep neural networks by independently zeroing out the outputs of neurons at random. It has also received a surge of interest in shallow learning, e.g., logistic regression. However, independent sampling for dropout can be suboptimal for convergence. In this paper, we propose multinomial sampling for dropout, i.e., sampling features or neurons according to a multinomial distribution with different probabilities for different features/neurons. To derive the optimal dropout probabilities, we analyze shallow learning with multinomial dropout and establish a risk bound for stochastic optimization. By minimizing a sampling-dependent factor in the risk bound, we obtain a distribution-dependent dropout whose sampling probabilities depend on the second-order statistics of the data distribution. To tackle the evolving distribution of neurons in deep learning, we propose an efficient adaptive dropout (named evolutional dropout) that computes the sampling probabilities on the fly from a mini-batch of examples. Empirical studies on several benchmark datasets demonstrate that the proposed dropouts achieve not only much faster convergence but also a smaller testing error than standard dropout. For example, on the CIFAR-100 data, evolutional dropout achieves relative improvements of over 10% in prediction performance and over 50% in convergence speed compared to standard dropout.
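The sketch below is a loose numpy paraphrase of the data-dependent idea in this abstract, not the authors' code: keep-probabilities are derived from per-unit second-order statistics of the current mini-batch, and the mask is rescaled to keep the layer output (approximately) unbiased. The function name, the square-root scaling, and the clipping are our own illustrative choices.

```python
# Hypothetical sketch of "evolutional dropout": mask probabilities computed
# on the fly from the mini-batch's second-order statistics.
import numpy as np

def evolutional_dropout(X, rng):
    """X: mini-batch of layer outputs, shape (batch, units)."""
    second_moment = (X ** 2).mean(axis=0)     # E[x_i^2] per unit
    p = np.sqrt(second_moment)
    p = p / p.sum()                           # multinomial sampling probabilities
    k = X.shape[1]
    keep = np.minimum(1.0, k * p)             # per-unit keep probability (clipped)
    # Bernoulli mask rescaled so E[mask_i] ~= 1 for unclipped units.
    mask = (rng.random(X.shape) < keep) / np.maximum(keep, 1e-12)
    return X * mask

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 100))
out = evolutional_dropout(X, rng)   # same shape as X, units dropped unevenly
```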



#2

Communication-Optimal Distributed Clustering
Jiecao Chen (Indiana Univ. Bloomington), He Sun (Univ. of Bristol), David Woodruff, Qin Zhang

Clustering large datasets is a fundamental problem with a number of applications in machine learning. Data is often collected on different sites, and clustering needs to be performed in a distributed manner with low communication. We would like the quality of the clustering in the distributed setting to match that in the centralized setting, for which all the data resides on a single server. In this work, we study both graph and geometric clustering problems in two distributed models: (1) a point-to-point model, and (2) a model with a broadcast channel. We give protocols in both models which we show are nearly optimal by proving almost matching communication lower bounds. Our work highlights the surprising power of a broadcast channel for clustering problems; roughly speaking, to cluster n points or n vertices in a graph distributed across s servers, for a worst-case partitioning the communication complexity in a point-to-point model is ns, while in the broadcast model it is n + s. We implement our algorithms and demonstrate this phenomenon on real-life datasets, showing that our algorithms are also very efficient in practice.
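A back-of-the-envelope illustration of the gap the abstract describes; the example sizes below are ours, not the paper's.

```python
# Worst-case communication (in words sent) for clustering n points across
# s servers, per the abstract's bounds: ~n*s point-to-point vs. ~n+s broadcast.
n, s = 1_000_000, 100
print("point-to-point, ~n*s:", n * s)   # 100,000,000
print("broadcast,      ~n+s:", n + s)   # 1,000,100
```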

#3

On Robustness of Kernel Clustering
Bowei Yan, Purnamrita Sarkar


