This was a phenomenal keynote talk by David Silver at the 16th European Workshop on Reinforcement Learning (EWRL), held at the Vrije Universiteit Brussel from Sept. 14 to 16. Silver leads the reinforcement learning team at DeepMind. He led the team behind many of the spectacular achievements of AI in the past few years, first with Go and then with chess. When he was introduced as the keynote speaker, AlphaGo was described as the “most important advance in AI in the past 50 years”.
Silver began by revisiting the iconic moment when AlphaGo defeated Lee Sedol in their 2016 match in Korea. He focused in particular on the famous “move 37” made by AlphaGo, which stunned the Go world and left the great champion Lee Sedol scratching his head. This was an example of a machine discovering new knowledge that had eluded humans in the centuries they have played the game. The first version of AlphaGo built on human knowledge in the form of recorded games. The next version started from scratch as a blank slate and, given just the rules, learned by playing games against itself! Through self-play it rediscovered centuries of human knowledge, for example the standard corner sequences known as joseki. It also discovered new joseki that had eluded humans for all those centuries, a spectacular example of a machine discovering knowledge that had lain beyond the ken of humans!
The next example of a machine discovering knowledge was in the problem of matrix multiplication. The schoolbook method of multiplying two n × n matrices needs n³ multiplications, but it’s possible to do better by using divide-and-conquer, with 7 multiplications (instead of 8) to multiply 2 × 2 matrices in the base case. This is the famous Strassen algorithm. DeepMind’s AlphaTensor discovered new ways to multiply small matrices, thus leading to new ways of beating n³ multiplications. Fortunately, the record for the asymptotically fastest matrix multiplication is still with humans – for now!
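To make the "7 instead of 8" idea concrete, here is a minimal sketch of Strassen's classic divide-and-conquer scheme in plain Python. It is only an illustration of the human-discovered algorithm, not of anything AlphaTensor produced. The function names and the `counter` argument are made up for this example, and the code assumes square matrices whose size is a power of two.

```python
def naive_matmul(A, B):
    """Schoolbook product of two n x n matrices (n**3 scalar multiplications)."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def _add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def _sub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def _split(M):
    """Split a matrix into its four quadrants."""
    h = len(M) // 2
    return ([row[:h] for row in M[:h]], [row[h:] for row in M[:h]],
            [row[:h] for row in M[h:]], [row[h:] for row in M[h:]])


def _join(c11, c12, c21, c22):
    """Reassemble four quadrants into one matrix."""
    top = [r1 + r2 for r1, r2 in zip(c11, c12)]
    bottom = [r1 + r2 for r1, r2 in zip(c21, c22)]
    return top + bottom


def strassen(A, B, counter=None):
    """Strassen product; `counter` (a one-element list) tallies scalar multiplications."""
    n = len(A)
    if n == 1:
        if counter is not None:
            counter[0] += 1
        return [[A[0][0] * B[0][0]]]

    a11, a12, a21, a22 = _split(A)
    b11, b12, b21, b22 = _split(B)

    # Seven recursive block products instead of the eight a naive split would need.
    m1 = strassen(_add(a11, a22), _add(b11, b22), counter)
    m2 = strassen(_add(a21, a22), b11, counter)
    m3 = strassen(a11, _sub(b12, b22), counter)
    m4 = strassen(a22, _sub(b21, b11), counter)
    m5 = strassen(_add(a11, a12), b22, counter)
    m6 = strassen(_sub(a21, a11), _add(b11, b12), counter)
    m7 = strassen(_sub(a12, a22), _add(b21, b22), counter)

    # Combine the seven products using only additions and subtractions.
    c11 = _add(_sub(_add(m1, m4), m5), m7)
    c12 = _add(m3, m5)
    c21 = _add(m2, m4)
    c22 = _add(_add(_sub(m1, m2), m3), m6)
    return _join(c11, c12, c21, c22)


if __name__ == "__main__":
    count = [0]
    print(strassen([[1, 2], [3, 4]], [[5, 6], [7, 8]], count), count[0])
```

Applied recursively, the saving compounds: a 4 × 4 product takes 7 × 7 = 49 scalar multiplications with this sketch, against 4³ = 64 for the schoolbook method. In general the count grows as n^(log₂ 7) ≈ n^2.81 rather than n³.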
Silver then upped the tempo – why can’t we let machines learn for themselves how to learn?! He showed how an architecture could learn a loss function appropriate for a task. First we went from hand-crafting features to letting the computer learn representations via end-to-end Deep Learning. Now, we can let it learn the loss function and the learning algorithm as well. This echoes what Richard Sutton called the bitter lesson from 70 years of AI research: general methods that leverage computation are ultimately the most effective, and by a large margin.
Embedl embraces this philosophy by deploying neural architecture search (NAS) methods that automatically find the optimal architecture and parameters for a task, based on the task specifications and on hardware and energy constraints.