AutoML Workshop at PRICAI 2018

Overview

Machine learning has achieved great successes in online advertising, recommender systems, financial market analysis, computer vision, computational linguistics, bioinformatics and many other fields. However, its success crucially relies on human machine learning experts, who are involved to some extent in all stages of system design (e.g., selecting appropriate ML architectures and their hyperparameters). As the complexity of these tasks is often beyond non-experts, the rapid growth of machine learning applications has created a demand for off-the-shelf machine learning methods that can be used easily and without expert knowledge. We call the resulting research area, which targets the progressive automation of machine learning, AutoML.

We are organizing a workshop on AutoML, co-located with the 15th Pacific Rim International Conference on Artificial Intelligence (PRICAI 2018), to be held on August 28, 2018, in Nanjing, China. The workshop will feature several keynote talks by leading experts on AutoML and related fields.

Information about the conference venue can be found here: http://cse.seu.edu.cn/pricai18/attending.html.

Invited Speakers

(in Alphabetical Order)

Hugo Jair Escalante

Hugo Jair Escalante is a research scientist at Instituto Nacional de Astrofisica, Optica y Electronica (INAOE), Mexico. He holds a PhD in Computer Science, for which he received the 2010 best PhD thesis in Artificial Intelligence award from the Mexican Society for Artificial Intelligence. In 2017 he received the UANL research award in exact sciences. Since 2011 he has been secretary and a member of the board of directors of ChaLearn, a non-profit organization dedicated to organizing challenges. He is information officer of the IAPR Technical Committee 12. Since 2017 he has been an editor of the Springer Series on Challenges in Machine Learning. He has been involved in the organization of several challenges in machine learning and computer vision, collocated with top venues; see http://chalearnlap.cvc.uab.es. He has served as co-editor of special issues in IJCV, IEEE TPAMI, and IEEE Transactions on Affective Computing. He has served as area chair for NIPS 2016 – 2018, is data competition chair of PAKDD 2018, and has been a member of the program committee of venues such as CVPR, ICPR, ICCV, ECCV, ICML, NIPS and IJCNN. His research interests are in machine learning, challenge organization, and their applications to language and vision.

Joaquin Vanschoren

Joaquin Vanschoren is an assistant professor of machine learning at the Eindhoven University of Technology (TU/e). His research focuses on the progressive automation of machine learning. He founded and leads OpenML.org, an open science platform for machine learning research used all over the world. He has received several demonstration and application awards and the Dutch Data Prize, and has been an invited speaker at ECDA, StatComp, AutoML@ICML, CiML@NIPS, Reproducibility@ICML, and many other conferences. He has also co-organized machine learning conferences (e.g. ECMLPKDD 2013, LION 2016, Discovery Science 2018) and many workshops, including the AutoML Workshop series at ICML.

Lars Kotthoff

Lars Kotthoff is Assistant Professor of Computer Science at the University of Wyoming, USA. Previously, he held appointments at the University of British Columbia, Canada, and University College Cork, Ireland. He obtained his PhD at the University of St Andrews, Scotland. His research focuses on applying machine learning to improve the performance and ease of use of approaches to solve hard combinatorial problems and machine learning itself. Dr Kotthoff’s more than 60 publications have garnered ~800 citations and his research has been supported by funding agencies and industry in various countries.

Pavel Brazdil

Pavel B. Brazdil is a senior researcher at LIAAD, INESC TEC, Porto, and a full professor at FEP, University of Porto, Portugal. He obtained his PhD in the area of machine learning in 1981 at the University of Edinburgh, when this area was still in its infancy. Since the 1990s he has been pioneering the area of metalearning and has supervised 3 PhD students in this area (besides 11 others in other areas). He is a co-author of a book on metalearning, which now has about 400 citations. His interests lie in related areas such as data mining, algorithm selection, automatic machine learning and text mining, among others. He has edited 6 books and has more than 110 papers referenced on Google Scholar, of which about 80 are also indexed in ISI / DBLP / Scopus. His articles have obtained more than 3700 citations on Google Scholar and his h-index there is 30. He is a member of the editorial board of the Machine Learning Journal and is a Fellow of ECCAI.

Yang Yu

Yang Yu is an associate professor in the Department of Computer Science, Nanjing University, China. He holds a PhD in Computer Science, for which he received the 2013 National Outstanding Doctoral Dissertation Award of China. His research interest is in machine learning, mainly reinforcement learning and derivative-free optimization for learning. His work has been published in Artificial Intelligence, IJCAI, AAAI, NIPS, KDD, etc. He was selected as one of the “AI’s 10 to Watch” by IEEE Intelligent Systems in 2018, and received the PAKDD Early Career Award in 2018. He has received several conference best paper awards, including IDEAL’16, GECCO’11 (theory track), and PAKDD’08. He has served as an Area Chair of IJCAI’18, a Senior PC member of IJCAI’15/17, a Publicity Co-chair of IJCAI’16/17 and IEEE ICDM’16, and a Workshop Co-chair of ACML’16 and PRICAI’18.

Yu-Feng Li

Yu-Feng Li is currently an associate professor at the National Key Laboratory for Novel Software Technology, Nanjing University. His research focuses on machine learning. In particular, he is interested in semi-supervised learning, multi-label learning, statistical learning and optimization. He has published over 30 papers in top-tier journals and conferences such as JMLR, TPAMI, AIJ, ICML, NIPS, AAAI, etc. He has served as a senior program committee member of top-tier AI conferences such as IJCAI’15, IJCAI’17 and AAAI’19, and as an editorial board member of machine learning journal special issues. He has received the outstanding doctoral dissertation award from the China Computer Federation (CCF), the outstanding doctoral dissertation award from Jiangsu Province, and the Microsoft Fellowship Award.

Schedule

August 28, 2018

10:30 – 10:40
Welcome
Wei-Wei Tu
10:40 – 11:30
Invited Talk: A Hands-On Introduction to Automatic Machine Learning
Speaker: Lars Kotthoff
Abstract: Achieving state-of-the-art performance in machine learning is in most cases more of a dark art than a science, with a machine learning black box tweaked repeatedly until the desired results are achieved. Recent advances in hyperparameter optimization make powerful techniques to automatically achieve good performance available to everybody. However, instead of a machine learning black box, we now have a meta machine learning black box to deal with. In this talk, I will give a brief overview of what automated machine learning systems do and highlight some problems and potential solutions. I will then give a hands-on introduction to building your own simple hyperparameter optimization system (a minimal illustrative sketch appears after the schedule below).
11:30 – 12:10
Invited Talk: AutoML challenges
Speaker: Hugo Jair Escalante
Abstract: Machine learning progress has led to models that have reported outstanding performance in a number of domains and applications. Despite this progress, current machine learning solutions rely heavily on humans, who have to take decisions on formatting and preprocessing data and make choices on the type of model to be used (e.g., for feature selection or classification), as well as on the optimization of hyperparameters. AutoML (Autonomous Machine Learning) is the subfield of machine learning that aims at removing the user from the design and development of machine learning systems. In this talk the AutoML problem will be defined, concepts and associated challenges will be discussed, and a brief review of related work will be presented. The ChaLearn AutoML challenges organized in the period 2015-2018 will then be presented, covering the design of the challenges, analyses of results, and conclusions.
12:10 – 14:00
Lunch Break
14:00 – 14:50
Invited Talk: Democratizing and Automating Machine Learning
Speaker: Joaquin Vanschoren
Abstract: Building machine learning systems remains something of a (black) art, requiring a lot of prior experience to compose appropriate ML workflows and tune their hyperparameters. To democratize machine learning, and make it easily accessible to those who need it, we need a more principled approach to experimentation to understand how to build machine learning systems and progressively automate this process as much as possible. First, we created OpenML, an open science platform allowing scientists to share datasets and train many machine learning models from many software tools in a frictionless yet principled way (a minimal usage sketch appears after the schedule below). It also organizes all results online, providing detailed insight into the performance of machine learning techniques, and allowing a more scientific, data-driven approach to building new machine learning systems. Second, we use this knowledge to create automatic machine learning (AutoML) techniques that learn from these experiments to help people build better models, faster, or automate the process entirely.
14:50 – 15:40
Invited Talk: From safe semi-supervised learning to automation
Speaker: Yu-Feng Li
Abstract: When the amount of labelled data is limited, it is usually expected that semi-supervised learning (SSL) methods exploiting additional unlabelled data will help improve learning performance. In many situations, however, it has been reported that SSL methods using unlabelled data may even decrease learning performance. It is thus desirable to develop safe SSL methods that often improve performance and, in the worst case, do not decrease it. Moreover, automatically developing safe SSL methods for a given data set is also important because of the high human cost. In this talk, I will introduce our recent progress on safe SSL and share a very preliminary attempt at automated SSL.
15:40 – 16:00
Coffee Break
16:00 – 16:50 (Video)
Invited Talk: Metalearning for Algorithm Selection and AutoML
Speaker: Pavel Brazdil
Abstract: In this talk, we discuss the ML/DM algorithm selection problem and explain how metalearning methods can help. We will start by explaining a relatively simple method based on average ranking (AR) and show how it can be upgraded to take into account both accuracy and time (a minimal average-ranking sketch appears after the schedule below). We then address the role of dataset characteristics, which make it possible to identify datasets similar to the target dataset and thus conduct a more focused search. Next, we turn our attention to more advanced methods, such as Active Testing and SMAC, which enable a more intelligent search for the potentially best algorithm. In recent years, the attention of researchers has turned to the issue of workflows, i.e. sequences of different algorithms executed in order. These can cover, for instance, the choice of pre-processing methods and, say, classification algorithms. We will explain how the methods discussed earlier can be upgraded to this setting and what kinds of problems lie ahead.
16:50 – 17:40
Invited Talk: Some Progress from Derivative-free Optimization to Experienced Derivative-free Optimization
Speaker: Yi-Qi Hu
Abstract: The performance of machine learning depends on algorithm selection and hyper-parameter optimization, but both require strong expert knowledge. With the wide application of machine learning technology, a technology that can choose a suitable algorithm and its hyper-parameters automatically is necessary; this is the goal of automatic machine learning (AutoML). Previous work has formulated AutoML as a black-box optimization problem, which can be solved by derivative-free optimization methods (a minimal sketch follows below). Because of the high evaluation cost of AutoML problems, the efficiency of derivative-free optimization urgently needs to be improved. In this talk, I will share some ideas about designing highly efficient derivative-free optimization methods. Then, I will discuss how to make derivative-free optimization more efficient by changing the optimization structure. Furthermore, I will talk about how to exploit useful properties of AutoML problems to make derivative-free optimization faster.
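To complement the hands-on portion of Lars Kotthoff's talk above, here is a minimal sketch of a simple hyperparameter optimization system based on random search with scikit-learn; the dataset, search space, and evaluation budget are illustrative assumptions, not material from the talk.

    # Minimal random-search hyperparameter optimizer (illustrative sketch only).
    import random
    from sklearn.datasets import load_iris
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    X, y = load_iris(return_X_y=True)

    best_score, best_params = -1.0, None
    for _ in range(20):  # evaluation budget (assumed)
        params = {
            "C": 10 ** random.uniform(-3, 3),      # log-uniform samples
            "gamma": 10 ** random.uniform(-4, 1),
        }
        score = cross_val_score(SVC(**params), X, y, cv=5).mean()
        if score > best_score:
            best_score, best_params = score, params

    print("best:", best_params, best_score)

Replacing the random sampling with a model of past evaluations, as in Bayesian optimization, is the standard way to make such a system more sample-efficient.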
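For readers unfamiliar with OpenML (discussed in Joaquin Vanschoren's talk above), here is a minimal, hedged sketch of fetching a dataset with the openml-python package; the dataset id and the four-value return of get_data reflect recent releases of that package and should be checked against the installed version.

    # Minimal OpenML usage sketch (assumes a recent openml-python release).
    import openml
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    dataset = openml.datasets.get_dataset(61)  # 61 is the "iris" dataset on OpenML
    X, y, _, _ = dataset.get_data(target=dataset.default_target_attribute)

    print(cross_val_score(RandomForestClassifier(), X, y, cv=5).mean())

The same package also exposes tasks and runs, which is what allows experiments to be shared and compared in a reproducible way.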
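To make the average-ranking (AR) method from Pavel Brazdil's talk above concrete, here is a minimal sketch with made-up accuracy numbers: each algorithm is ranked on every dataset and the ranks are averaged to obtain a default recommendation order.

    # Average-ranking (AR) sketch with made-up accuracies; not the speaker's data or code.
    import numpy as np

    algorithms = ["svm", "random_forest", "knn"]
    # rows = datasets, columns = algorithms (illustrative accuracies)
    accuracy = np.array([
        [0.81, 0.86, 0.79],
        [0.92, 0.90, 0.88],
        [0.70, 0.75, 0.74],
    ])

    # Rank algorithms on each dataset (rank 1 = best), then average over datasets.
    ranks = accuracy.shape[1] - accuracy.argsort(axis=1).argsort(axis=1)
    avg_rank = ranks.mean(axis=0)

    for name, r in sorted(zip(algorithms, avg_rank), key=lambda t: t[1]):
        print(f"{name}: average rank {r:.2f}")

In practice the accuracy matrix would come from a repository of past experiments rather than hand-typed values.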
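Finally, to illustrate the black-box view of AutoML in Yi-Qi Hu's talk above, here is a minimal derivative-free, (1+1)-style local search over two SVM hyperparameters; the objective, search space, and perturbation scheme are illustrative assumptions rather than the methods presented in the talk.

    # Minimal derivative-free (1+1)-style search over two SVM hyperparameters.
    # Illustrative sketch only; not the methods presented in the talk.
    import random
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    X, y = load_breast_cancer(return_X_y=True)

    def objective(log_c, log_gamma):
        """Black-box objective: cross-validated accuracy of an SVM."""
        model = SVC(C=10 ** log_c, gamma=10 ** log_gamma)
        return cross_val_score(model, X, y, cv=3).mean()

    # Start from a random point in log-space and keep perturbations that do not hurt.
    current = [random.uniform(-3, 3), random.uniform(-4, 1)]
    best = objective(*current)
    for _ in range(30):  # evaluation budget (assumed)
        candidate = [v + random.gauss(0, 0.5) for v in current]
        score = objective(*candidate)
        if score >= best:
            current, best = candidate, score

    print("best cv accuracy:", best, "at", current)

Each evaluation of the objective is expensive, which is exactly why the talk focuses on making such derivative-free searches more sample-efficient.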

Organization

Advisor:

Isabelle Guyon, UPSud/INRIA Univ. Paris-Saclay, France & ChaLearn, USA, guyon@clopinet.com

Co-Chairs:

Wei-Wei Tu, 4Paradigm Inc., Beijing, China, tuww.cn@gmail.com

Hugo Jair Escalante, INAOE (Mexico), ChaLearn (USA), hugo.jair@gmail.com

Zhanxing Zhu, Peking University, China, zhanxing.zhu@pku.edu.cn

Yang Yu, LAMDA Group, Nanjing University, China, yuy@nju.edu.cn

About

About AutoML

Previous AutoML Events

Previous AutoML Challenges: The First AutoML Challenge and The Second AutoML Challenge.

The Third AutoML Challenge (part of the NIPS 2018 Competition Track)

AutoML workshops can be found here.

Microsoft Research blog post on the AutoML Challenge can be found here.

KDnuggets post on the AutoML Challenge can be found here.

I. Guyon et al. A Brief Review of the ChaLearn AutoML Challenge: Any-time Any-dataset Learning Without Human Intervention. ICML Workshop on AutoML, 2016. link

I. Guyon et al. Design of the 2015 ChaLearn AutoML challenge. IJCNN 2015. link

Springer Series on Challenges in Machine Learning. link

About LAMDA

LAMDA is affiliated with the National Key Laboratory for Novel Software Technology and the Department of Computer Science & Technology, Nanjing University, China. It is located in the Computer Science and Technology Building on the Xianlin campus of Nanjing University, mainly in Room 910. The Founding Director of LAMDA is Prof. Zhi-Hua Zhou. “LAMDA” means “Learning And Mining from DatA”. The main research interests of LAMDA include machine learning, data mining, pattern recognition, information retrieval, evolutionary computation, neural computation, and some other related areas. Currently our research mainly involves: ensemble learning, semi-supervised and active learning, multi-instance and multi-label learning, cost-sensitive and class-imbalance learning, metric learning, dimensionality reduction and feature selection, structure learning and clustering, theoretical foundations of evolutionary computation, improving comprehensibility, content-based image retrieval, web search and mining, face recognition, computer-aided medical diagnosis, bioinformatics, etc.

About 4Paradigm Inc.

Founded in early 2015, 4Paradigm (https://www.4paradigm.com/) is one of the world’s leading AI technology and service providers for industrial applications. 4Paradigm’s flagship product, the AI Prophet, is an AI development platform that enables enterprises to effortlessly build their own AI applications and thereby significantly increase their operational efficiency. Using the AI Prophet, a company can develop a data-driven “AI Core System”, which can largely be regarded as a second core system next to the traditional transaction-oriented Core Banking System (IBM Mainframe) often found in banks. Beyond this, 4Paradigm has also successfully developed more than 100 AI solutions for use in settings such as finance, telecommunication and Internet applications. These solutions include, but are not limited to, smart pricing, real-time anti-fraud systems, precision marketing, personalized recommendation and more. And while it is clear that 4Paradigm can completely change the way an organization uses its data, its scope of services does not stop there. 4Paradigm uses state-of-the-art machine learning technologies and practical experience to bring together a team of experts ranging from scientists to architects. This team has successfully built China’s largest machine learning system and the world’s first commercial deep learning system. However, 4Paradigm’s success does not stop there. With its core team pioneering the research of “Transfer Learning,” 4Paradigm takes the lead in this area and, as a result, has drawn great attention from worldwide tech giants.

About ChaLearn & CodaLab (Platform Provider, Coordinator)

ChaLearn (http://chalearn.org) is a non-profit organization with vast experience in the organization of academic challenges. ChaLearn is interested in all aspects of challenge organization, including data gathering procedures, evaluation protocols, novel challenge scenarios (e.g., coopetitions), training for challenge organizers, challenge analytics, results dissemination and, ultimately, advancing the state of the art through challenges. ChaLearn is collaborating in the organization of the NIPS 2018 data competition (AutoML Challenge 2018).

Organization Institutes