Monthly Newsletter | 1st January 2017



Reinforcement Learning to solve AI problems

Imagine a robot that can act in a world, receiving rewards and punishments and determining from these what it should do. This is the problem of reinforcement learning.

What is Reinforcement Learning?
Reinforcement Learning in AI consists of a collection of computational methods that, although inspired by animal learning principles, are primarily motivated by their potential to solve practical problems. Some of the most impressive achievements of artificial learning systems have come from Reinforcement Learning, including robot control, elevator scheduling, telecommunications, backgammon, checkers and Go (AlphaGo).

Reinforcement Learning and other Machine Learning approaches:

In Supervised Learning, a dataset of inputs paired with the desired outputs is provided and used to train the machine, whereas in Unsupervised Learning no desired outputs are provided; instead, the data is clustered into different classes. Reinforcement Learning differs from standard Supervised Learning in that correct input/output pairs are never presented, nor are sub-optimal actions explicitly corrected. Instead, the algorithm is presented with a state that depends on the input data, takes an action, and is rewarded or punished for the action it took; this cycle continues over time.

The Reinforcement Learning problem:
Here is the model of the problem that many AI researchers have adopted in their approaches to Reinforcement Learning. This model is based on the Markov Decision Process (MDP), which has been widely studied in the field of decision making.

Fig. Reinforcement Learning model

Fig. Reinforcement Learning model representation

This model of Reinforcement Learning consists of:
  1. A set of environment and agent states S;
  2. A set of actions that the agent can perform;
  3. Policies that map states to actions;
  4. Rules that determine the scalar immediate reward of a transition; and
  5. Rules that describe what the agent observes.
At each step t the agent:
  • Executes action a_t
  • Receives observation o_t
  • Receives scalar reward r_t
The environment:
  • Receives action a_t
  • Emits observation o_{t+1}
  • Emits scalar reward r_{t+1}

The agent and the environment interact over a sequence of discrete time steps. The goal of a Reinforcement Learning agent is to collect as much cumulative reward as possible.
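
The interaction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the corridor world, the class name CorridorEnv, the reward values (+1.0 at the goal, -0.1 per other step) and the random policy are all hypothetical and do not come from any particular library.

```python
import random


class CorridorEnv:
    """Hypothetical toy world: cells 0..4 in a row; reaching the last cell ends the episode."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state  # initial observation

    def step(self, action):
        # action is -1 (move left) or +1 (move right); walls clamp the position
        self.state = min(max(self.state + action, 0), self.length - 1)
        done = self.state == self.length - 1
        reward = 1.0 if done else -0.1  # assumed reward rule for this sketch
        return self.state, reward, done  # o_{t+1}, r_{t+1}, episode finished?


def random_policy(observation, rng):
    """A policy maps what the agent observes to an action; this one ignores the observation."""
    return rng.choice([-1, 1])


def run_episode(env, policy, rng, max_steps=100):
    observation = env.reset()
    total_reward = 0.0
    steps = 0
    for _ in range(max_steps):
        action = policy(observation, rng)              # agent executes a_t
        observation, reward, done = env.step(action)   # environment emits o_{t+1}, r_{t+1}
        total_reward += reward                         # the agent tries to make this sum large
        steps += 1
        if done:
            break
    return total_reward, steps


rng = random.Random(0)
total, steps = run_episode(CorridorEnv(), random_policy, rng)
print("steps:", steps, "cumulative reward:", round(total, 2))
```

A random policy wanders before it reaches the goal, so its cumulative reward is usually lower than that of a policy that always moves right. Closing that gap is what learning is for.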

Two components make Reinforcement Learning powerful: the use of samples to optimize performance and the use of function approximation to deal with large environments. Thanks to these two key components, Reinforcement Learning can be used in large environments in any of the following situations:
  • A model of the environment is known, but an analytic solution is not available.
  • Only a simulation model of the environment is given.
  • The only way to collect information about the environment is by interacting with it.
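
As one illustration of using samples to optimize performance, the sketch below applies tabular Q-learning to a small hypothetical corridor world. Q-learning is not named in this article; it is used here only as a well-known example of a sample-based method. The environment, the reward values and the parameters alpha, gamma and epsilon are all assumptions chosen for the example. The table of values stands in for the function approximator that a large environment would need.

```python
import random

N_STATES = 5          # hypothetical corridor: cells 0..4, cell 4 is the goal
ACTIONS = (-1, 1)     # move left or move right


def step(state, action):
    """Simulated environment: the learner only ever sees samples from this function."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else -0.1  # assumed reward rule
    return next_state, reward, done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    # One value estimate per (state, action) pair, all starting at zero
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: mostly exploit current estimates, sometimes explore
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            next_state, reward, done = step(state, action)
            # Move the estimate toward the sampled reward plus discounted best next value
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            target = reward + gamma * best_next
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = next_state
    return q


q = q_learning()
greedy_policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)}
print(greedy_policy)
```

The learner never inspects the rules inside step; it improves only from the sampled transitions. That is why the same approach applies when only a simulator, or only the real environment, is available.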


Zuckerberg builds artificial intelligence assistant to run house, entertain toddler
Mark Zuckerberg has a new housemate: Jarvis, an artificial intelligence assistant he created this year that can control appliances, play music, recognize faces and, perhaps most impressively, entertain his toddler.
Apple Publishes Its First Artificial Intelligence Paper
Apple published its first AI paper on December 22. (The paper was submitted for publication on November 15.) It describes a technique for training an image-recognition algorithm on computer-generated images rather than real-world images.
Could online tutors and artificial intelligence be the future of teaching?
An online maths company has partnered with scientists to identify what makes lessons successful, and to see whether AI can be used to improve teaching.
World’s largest hedge fund to replace managers with artificial intelligence
The world’s largest hedge fund is building a piece of software to automate the day-to-day management of the firm, including hiring, firing and other strategic decision-making.
In the race to build the best AI, there’s already one clear winner
As Google, Facebook, Microsoft, and Baidu take turns leapfrogging each other in artificial intelligence innovation, one company stands to profit from any outcome: Nvidia.

“The world faces a set of increasingly complex, interdependent and urgent challenges that require ever more sophisticated response. We’d like to think that successful work in artificial intelligence can contribute by augmenting our collective capacity to extract meaningful insight from data and by helping us to innovate new technologies and processes to address some of our toughest global challenges.”

― Demis Hassabis, Shane Legg, and Mustafa Suleyman, Google DeepMind

Previous Issues

Making A Robot Walk

Dated: 1st December 2016

Decoding Artificial Intelligence, Machine Learning, Deep Learning

Dated: 1st November 2016

While you were away - Where the machine is the artist
The first-ever exhibition of artwork made by artificial intelligence is changing the way art is experienced.
Amazon Go makes retail checkout a thing of the past
Now you can walk into a shop, pick up whatever you want and leave, with no need to stand in a long checkout queue. Artificial Intelligence and Machine Learning have made it possible.
Alien hunting gets HUGE boost as Artificial Intelligence joins search for extraterrestrial
Artificial Intelligence will now help uncover the scary secrets of extraterrestrial (E.T.) life.
Contact Us

Your reviews are valuable. Help us to improve.
Give your feedback on

Address: 10, Jamanadas Industrial Estate, Phase-1, Dr. R.P. Road, Mulund (West), Mumbai, Maharashtra 400080

© Copyright Artificial Intelligence Mumbai. All Rights Reserved.