Models with lengthy training cycles, typical in deep learning, can be extremely expensive to tune. In certain instances, this high cost may even render tuning infeasible for a particular model. Popular methods for tuning these types of models, such as evolutionary algorithms, typically require orders of magnitude more time and compute than alternatives. And techniques like parallelism often trade away some optimization performance, resulting in the use of many more expensive computational resources. This leaves most teams with few good options for tuning particularly expensive deep learning models.
But new methods that sample tasks during the tuning process give teams the chance to dramatically lower the cost of tuning these models. This approach, referred to as multitask optimization, combines the “strong anytime performance” of bandit-based methods with the “strong eventual performance” of Bayesian optimization. As a result, it can unlock tuning for deep learning models with particularly lengthy training and tuning cycles.
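To make the idea concrete, below is a minimal sketch of multitask (multi-fidelity) Bayesian optimization, not SigOpt's implementation: the synthetic train_and_score objective, the fidelity schedule, and the scikit-learn Gaussian process surrogate are all illustrative assumptions. Cheap, low-fidelity tasks (e.g., training on a subset of the data) dominate early iterations, while the surrogate's predictions at the full task guide the few expensive full training runs.

```python
# Illustrative sketch of multitask (multi-fidelity) Bayesian optimization.
# Assumptions, not from the talk: the synthetic objective, the fixed fidelity
# schedule, and scikit-learn's GaussianProcessRegressor as the surrogate.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(0)

def train_and_score(learning_rate, fidelity):
    """Stand-in for training a model: `fidelity` in (0, 1] plays the role of
    a cheaper task (subset of the data, fewer epochs). Lower fidelity is
    cheaper to evaluate but noisier."""
    true_score = -(np.log10(learning_rate) + 2.0) ** 2  # optimum near 1e-2
    noise = rng.normal(scale=0.3 * (1.0 - fidelity))
    return true_score + noise

# One surrogate is fit jointly over (hyperparameter, fidelity), so cheap
# low-fidelity observations inform predictions about the expensive full task.
X, y = [], []
fidelity_schedule = [0.1] * 10 + [0.3] * 6 + [1.0] * 4  # cheap early, full late

for step, fidelity in enumerate(fidelity_schedule):
    if step < 3:  # a few random points to seed the surrogate
        lr = 10 ** rng.uniform(-5, 0)
    else:
        gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
        gp.fit(np.array(X), np.array(y))
        # Upper-confidence-bound acquisition evaluated at full fidelity (1.0),
        # which is what targets strong eventual performance on the real task.
        candidates = 10 ** rng.uniform(-5, 0, size=256)
        feats = np.column_stack([np.log10(candidates), np.ones(256)])
        mean, std = gp.predict(feats, return_std=True)
        lr = candidates[np.argmax(mean + 1.0 * std)]
    score = train_and_score(lr, fidelity)
    X.append([np.log10(lr), fidelity])
    y.append(score)

best_score, best_x = max((s, x) for s, x in zip(y, X))
print(f"best observed score {best_score:.3f} at learning_rate={10 ** best_x[0]:.2e}")
```

The key design choice is fitting a single surrogate across tasks: many cheap low-fidelity runs shrink uncertainty about the expensive full task before any full training budget is spent, which is how the bandit-style "anytime" behavior and the Bayesian "eventual" behavior coexist.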
During this talk, Scott Clark, CEO of SigOpt, walks through a variety of methods for tuning models with lengthy training cycles before diving deep into this multitask optimization functionality. The rest of the talk focuses on how this type of method works and explains the ways in which deep learning experts are deploying it today. Finally, it covers the implications of early findings in this area of research and next steps for exploring this functionality further. This talk is particularly valuable for anyone working with large data sets or complex deep learning models.