Artificial intelligence (AI) has been gaining attention over the past few years. Whether it is a smartphone, a smart car or any other smart device, an AI "driver" sits at the core of the device. However, for an AI driver to work effectively, it needs slow, steady, consistent and appropriate training. A good way to understand how an AI driver is trained is to look at how a cat is tamed.
When the cat acts unruly, you show your anger and may even pretend to punish it. When the cat behaves well, you reward it with a cat treat strip or anything else it loves. After a couple of months of repeating this process, your cat will only long for the positive rewards and will hardly remember how to act unruly. What happens in this process is “reinforcement learning”.
While interacting with the environment (your home, with you in it), an agent (your cat) is driven by a mechanism of reward (the cat strip) and punishment (your show of anger), and gradually learns the behaviour patterns (staying quiet, lying flat) that maximize its own reward. So in a sense, raising a cat is much like working on artificial intelligence.
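The cat example can be written down as a tiny program. What follows is only a minimal sketch under stated assumptions, not how any real system is trained: the two action names, the reward values of +1 and -1, and the helper train_cat are all made up for illustration. The "cat" mostly repeats whichever behaviour has earned it the most so far, and occasionally tries something else.

```python
import random

# Hypothetical behaviours and rewards, chosen only for illustration.
ACTIONS = ["act_unruly", "lie_quietly"]
REWARDS = {"act_unruly": -1.0, "lie_quietly": 1.0}  # scolding vs. treat


def train_cat(episodes=500, epsilon=0.1, step_size=0.1, seed=0):
    """Learn an estimated value for each behaviour by trial and error."""
    rng = random.Random(seed)
    value = {action: 0.0 for action in ACTIONS}
    for _ in range(episodes):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)          # explore: try anything
        else:
            action = max(ACTIONS, key=value.get)  # exploit: best so far
        reward = REWARDS[action]
        # Nudge the estimate towards the reward that was just received.
        value[action] += step_size * (reward - value[action])
    return value


values = train_cat()
print(values)
print("preferred behaviour:", max(values, key=values.get))
```

After enough repetitions the estimate for lying quietly ends up above the one for acting unruly, which is the "only longs for the positive rewards" effect described above.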
AlphaGo
The most famous representative of reinforcement learning is of course AlphaGo: it played tens of thousands of games of Go against itself and finally became the unparalleled god of Go without a teacher. If AlphaGo is regarded as the cat in the above example, then in training a win earns it something good to eat and a loss earns it a beating. In addition, DeepMind has developed an agent that can surpass human players in 57 Atari games, which also relies on reinforcement learning algorithms. There, however, the reward and punishment mechanism is designed specifically for each game. For example, in a game as simple as Pac-Man, you get a reward every time you eat a bean and a punishment when you hit a ghost.
Beyond the vast field of games, reinforcement learning can also be used for autonomous driving, that is, for training an AI driver.
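To make "a reward designed for each game" concrete, here is a small sketch of what such a table could look like for the Pac-Man example. The event names and the numbers are assumptions picked for illustration and are not taken from DeepMind's work.

```python
# Hypothetical per-event rewards for a Pac-Man-like game.
PACMAN_REWARDS = {
    "eat_bean": 1.0,     # reward for every bean eaten
    "hit_ghost": -10.0,  # punishment for running into a ghost
    "move": 0.0,         # plain movement is neutral
}


def episode_return(events, rewards=PACMAN_REWARDS):
    """Sum the rewards collected over one game."""
    return sum(rewards[event] for event in events)


game = ["eat_bean", "move", "eat_bean", "hit_ghost"]
print(episode_return(game))
```

A different game would swap in a different table, while the learning algorithm around it can stay the same.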
How to train an AI driver
To explain more conveniently how this is achieved, we borrow a prop here: Amazon DeepRacer from Amazon Cloud Technology.
It is a small, concept-style car at 1/18 the scale of a real car. To drive autonomously, the car is equipped with a processor, cameras and even lidar. Of course, the premise is that we first deploy a trained reinforcement learning algorithm (the AI driver) on the car. The algorithm has to be trained in a virtual environment, so Amazon DeepRacer comes with a management console that contains a 3D racing simulator. This lets people see the training effect of the model more intuitively.
With this kit, we can try to train an AI driver from scratch by ourselves.
How to do it? Here comes the point:
Let’s say this is a completely straight track in the simulator and an Amazon DeepRacer car in the virtual environment.
Our goal is to get the car to the finish line in the shortest time possible. For this track, the best option is to keep the car as close to the centre line as possible, avoiding the extra time lost to detours or to going out of bounds. To do this, we can slice the track into grids and assign different scores to those grids:
Grids close to the middle get higher scores, while those on the sides get lower scores. The areas beyond the edge of the track are invalid: if the car enters them, it has to start all over again. At the beginning of the race, the car does not know which route is best. It just rams around like a headless fly and often dashes off the track.
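The scoring scheme described above can be sketched as a reward function. This is an illustrative sketch rather than the article's own code: the function name and the input keys all_wheels_on_track, track_width and distance_from_center follow the DeepRacer reward function interface as we understand it, and the band widths and scores are arbitrary choices. Check the official documentation before relying on any of them.

```python
def reward_function(params):
    """Score the car's current position: higher near the centre line."""
    # Outside the track is the invalid area, so give almost nothing.
    if not params["all_wheels_on_track"]:
        return 1e-3

    track_width = params["track_width"]
    distance_from_center = params["distance_from_center"]

    # Three bands around the centre line (widths are assumptions).
    if distance_from_center <= 0.1 * track_width:
        return 1.0   # close to the middle: highest score
    if distance_from_center <= 0.25 * track_width:
        return 0.5   # drifting towards the side: lower score
    if distance_from_center <= 0.5 * track_width:
        return 0.1   # near the edge: lowest valid score
    return 1e-3      # effectively off the track


print(reward_function({
    "all_wheels_on_track": True,
    "track_width": 1.0,
    "distance_from_center": 0.05,
}))
```

The car never sees these rules directly. It only receives the number returned at each step and has to work out for itself which positions pay best.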
But later, with more and more trial and error and under the “command” of the reward function, the car gradually discovers the route that earns the highest cumulative score. Ideally, after a period of training and iteration, the algorithm learns that a straight line is the fastest route.
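The trial-and-error process can be imitated on a toy version of the straight track. The sketch below uses tabular Q-learning on a grid three lanes wide and six rows long. Everything in it (grid size, scores, hyperparameter values, function names) is an assumption made for illustration, and Q-learning is a swapped-in textbook method rather than the algorithm DeepRacer runs.

```python
import random

LANE_SCORE = [0.1, 1.0, 0.1]   # left edge, centre line, right edge
TRACK_LENGTH = 6               # grid rows between start and finish
ACTIONS = [-1, 0, 1]           # steer left, go straight, steer right
START_LANE = 1                 # the car starts on the centre line


def step(row, lane, action):
    """Advance one row; return (next_row, next_lane, reward, done)."""
    next_lane = lane + action
    if next_lane < 0 or next_lane >= len(LANE_SCORE):
        return row, lane, -1.0, True   # left the track: start over
    next_row = row + 1
    return next_row, next_lane, LANE_SCORE[next_lane], next_row == TRACK_LENGTH


def train(episodes=5000, alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    """Learn a table of action values by repeated trial runs."""
    rng = random.Random(seed)
    q = {(r, l, a): 0.0
         for r in range(TRACK_LENGTH)
         for l in range(len(LANE_SCORE))
         for a in ACTIONS}
    for _ in range(episodes):
        row, lane, done = 0, START_LANE, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)   # explore
            else:
                action = max(ACTIONS, key=lambda a: q[(row, lane, a)])
            next_row, next_lane, reward, done = step(row, lane, action)
            best_next = 0.0 if done else max(
                q[(next_row, next_lane, a)] for a in ACTIONS)
            target = reward + gamma * best_next
            q[(row, lane, action)] += alpha * (target - q[(row, lane, action)])
            row, lane = next_row, next_lane
    return q


def greedy_route(q):
    """Follow the best-known action from the start; return lanes visited."""
    row, lane, done, lanes = 0, START_LANE, False, []
    while not done:
        action = max(ACTIONS, key=lambda a: q[(row, lane, a)])
        row, lane, _, done = step(row, lane, action)
        lanes.append(lane)
    return lanes


q_table = train()
print(greedy_route(q_table))
```

Early runs wander and fall off the edge. Once the values have settled, the greedy route should stay in the centre lane all the way to the finish, which is the toy equivalent of learning that the straight line is fastest.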
Deploy algorithm to car
Then, by deploying the algorithm to the car, we get a racing car that can run in a straight line. Of course, running in a straight line is only the simplest case. Real tracks are generally more complex, and in many cases running along the centre line is not the fastest route. For this reason, we need to adjust the training strategy and the design of the reward function.
In practice, the reward function is also written in the Amazon DeepRacer management console. Before writing the function, we can use the console to adjust the hyperparameters of the model, define its action space by specifying the speeds and steering angles available to the car, and even choose the skin of the car.
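One possible adjustment is to stop rewarding the centre line and reward fast progress instead. The sketch below is a hedged illustration only: the keys all_wheels_on_track, progress, steps and speed are assumed to match the DeepRacer reward function inputs, and the weights are arbitrary values that would need tuning on a real track.

```python
def reward_function(params):
    """Reward covering more of the track in fewer steps, at higher speed."""
    # Leaving the track still earns almost nothing.
    if not params["all_wheels_on_track"]:
        return 1e-3

    steps = max(params["steps"], 1)       # avoid dividing by zero
    progress = params["progress"]         # assumed: percent of lap done
    speed = params["speed"]               # assumed: metres per second

    reward = 1.0                          # base reward for staying on track
    reward += progress / steps            # more progress per step is better
    reward += 0.5 * speed                 # mild bonus for driving faster
    return float(reward)


slow = {"all_wheels_on_track": True, "steps": 100, "progress": 20.0, "speed": 1.0}
fast = {"all_wheels_on_track": True, "steps": 100, "progress": 40.0, "speed": 2.0}
print(reward_function(slow), reward_function(fast))
```

Because nothing here mentions the centre line, the car is free to cut corners as long as it stays on the track, at the cost of a reward signal that is harder to learn from.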
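To picture what "defining the action space" means, the sketch below builds a discrete list of steering and speed combinations. The console does this through its own forms, so the helper build_action_space, its default values and the hyperparameter names in the dictionary are hypothetical stand-ins and not the console's actual settings.

```python
def build_action_space(max_steering_deg=30.0, steering_levels=5,
                       max_speed=1.0, speed_levels=2):
    """Return every steering angle / speed pair the car may choose from.

    steering_levels must be at least 2 so that both extremes are included.
    """
    actions = []
    for i in range(steering_levels):
        # Spread the angles evenly from full left to full right.
        angle = -max_steering_deg + i * (2 * max_steering_deg) / (steering_levels - 1)
        for j in range(1, speed_levels + 1):
            speed = max_speed * j / speed_levels
            actions.append({"steering_angle": round(angle, 2),
                            "speed": round(speed, 2)})
    return actions


# Illustrative hyperparameter values only; names and numbers are assumptions.
hyperparameters = {
    "learning_rate": 0.0003,
    "discount_factor": 0.99,
    "batch_size": 64,
}

action_space = build_action_space()
print(len(action_space), action_space[0], action_space[-1])
```

A larger action space gives the car finer control but usually takes longer to train, since there are more choices to try at every step.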
Amazon DeepRacer is a complete set of services, quite like a set of visual teaching tools for introductory reinforcement learning. Novices can follow the prompts step by step. If you are interested, you may wish to try it yourself.
Challenge Guinness?
Of course, since it is a racing car, it is natural to pursue speed, the faster the better. And if you want to test whether the AI driver you “trained” is fast enough…
Amazon Cloud Technology also holds an official competition that brings together the AI drivers everyone has trained and compares them to see which is the fastest.
This league is a serious competition on a global scale. The first season was held in 2018, and more than 100,000 people have participated so far, in events ranging from online simulations to offline physical races. The competition is already well known among machine learning developers.
The China region has also established a special Amazon DeepRacer league for Chinese developers. This year’s China League is divided into two seasons. The monthly competition of each season is divided into the public group and the professional group according to the difficulty of the track and the difficulty of model training. The top-ranked players in the monthly competition group will have the opportunity to advance to the next group or participate in offline competitions.
Prizes
Of course, the competition has prizes. Headphones, keyboards, speakers…
And if you happen to win a season championship, congratulations: you get a free trip to Las Vegas, with hotel and convention tickets included.
Entry to the Amazon DeepRacer League is free and has no career requirements. The only condition is that entrants under the age of 16 need the permission of a guardian. This year’s competition is still in progress. If you register an account on the official website, you automatically get 10 hours of training time on Amazon’s cloud service, and you can apply for a $30 “point card”.
At the same time, Amazon Cloud Technology is running an official “challenge the Guinness World Records” activity. The goal is to exceed 4,387 participants and apply for recognition as the “largest machine learning competition” in the world.
Every contestant this year will be part of the record attempt, and everyone has the opportunity to receive a Guinness World Records challenge certificate. The final result of the challenge will be announced in October, when this year’s Amazon Cloud Technology Online China Summit opens. In addition to the results of the Amazon DeepRacer Guinness challenge, the summit will feature many big names in cloud computing sharing and displaying related technical achievements.