Free Reinforcement Learning Research Proposal Sample
Visual Robot Homing Using Artificial Neural Networks and Reinforcement Learning
Homing through visual navigation is a complex problem in robotics. The algorithms involved become more complex still when efficiency and effectiveness are improved by applying reinforcement learning to neural networks and aggregating the results of several reinforcement learning methods. Traditional approaches to homing depend on pattern recognition and image processing, which work well in certain conditions but fail in many others. The problem lies primarily in the fixed behaviour of such systems: they perform in the same way in almost every type of environment.
Reinforcement learning is an effective technique in robotics for enabling a system to learn about its environment by constantly monitoring its surroundings and recording and comparing the readings. The system may observe the environment visually, which requires image processing. Two distinct reinforcement learning methods can be combined in one system by aggregating their results. Reinforcement learning is used primarily in environments where the underlying dynamics and operational aspects of the system are not known.
A neural network in a robot works somewhat like a network of neurons in a brain, but at a much lower level of complexity. A robot's learning system uses various sensors and memory stores. The sensors send data to the main processor, which processes it and stores the resulting information in memory for later use. Neural networks are heavily used in designing learning systems for robots. Reinforcement learning also relies on neural networks: the information obtained by processing sensor data is stored in the system and used to improve its efficiency and effectiveness.
Methods of Reinforcement Learning
A large number of reinforcement learning methods exist; for simplicity, those used in robotics can be grouped into two major categories. The first category comprises methods that search the space of value functions, exemplified primarily by the Temporal Difference (TD) approach. The second category comprises methods that search the space of policies, which rely primarily on Evolutionary Algorithms (EA). The main goal of all reinforcement learning methods is to solve sequential decision-making tasks by trial and error, through interaction of the system with its environment.
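To make the value-function category concrete, the following is a minimal sketch of a TD(0) update, not the system proposed in this study. It assumes a hypothetical one-dimensional corridor in which the rightmost cell is the home position, a reward of 1 on reaching home, and a robot that moves left or right at random; the function name, the parameters and the task itself are illustrative assumptions.

```python
import random


def td0_corridor(n_states=5, episodes=500, alpha=0.1, gamma=0.9, seed=0):
    """Estimate state values with TD(0) on a hypothetical 1-D corridor.

    States are numbered 0..n_states-1. The rightmost state is the assumed
    home position and is terminal. The robot follows a fixed random policy.
    """
    rng = random.Random(seed)
    values = [0.0] * n_states      # V(s); the terminal state stays at 0
    home = n_states - 1
    for _ in range(episodes):
        state = 0
        while state != home:
            step = rng.choice((-1, 1))                     # random move
            next_state = min(max(state + step, 0), home)   # walls clamp the move
            reward = 1.0 if next_state == home else 0.0
            # TD(0): move V(s) toward the target r + gamma * V(s')
            target = reward + gamma * values[next_state]
            values[state] += alpha * (target - values[state])
            state = next_state
    return values


if __name__ == "__main__":
    print(td0_corridor())
```

Under these assumptions the learned values grow as the state gets closer to home, which is the signal a value-based method would use to choose actions. A policy-search method such as an EA would instead evaluate and mutate whole policies without estimating these values.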
Methods of Aggregation
A number of aggregation methods are used in robotics to combine data and reach optimal results. The major aggregation methods are briefly described below:
Mean of grades
The mean of grades is the sum of all the grades divided by the total number of grades.
Each grade can also be assigned a specific weight so that some grades contribute more than others, and the weighted mean is then taken. This method is commonly used where one grade has less influence than another on the information received through various means. The trickiest part of this method is assigning a proper weight to each grade in order to reach optimum efficiency.
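The plain and weighted means can be sketched as follows. The function names and the sample numbers are illustrative assumptions; the weights are normalised by their sum, so they do not need to add up to 1.

```python
def mean(grades):
    """Sum of all grades divided by the number of grades."""
    if not grades:
        raise ValueError("grades must not be empty")
    return sum(grades) / len(grades)


def weighted_mean(grades, weights):
    """Mean in which each grade contributes in proportion to its weight."""
    if len(grades) != len(weights):
        raise ValueError("grades and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(g * w for g, w in zip(grades, weights)) / total_weight


if __name__ == "__main__":
    print(mean([2, 4, 6]))                    # plain mean
    print(weighted_mean([1.0, 3.0], [3, 1]))  # first grade counts three times as much
```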
Median of grades
The median of grades is the middle grade, or the average of the two middle grades when the number of grades is even. This type of aggregation requires the grades to be arranged in order of size.
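A minimal sketch of the median, with the ordering step made explicit; the function name and sample values are assumptions made for illustration.

```python
def median(grades):
    """Middle grade after sorting, or the average of the two middle grades."""
    if not grades:
        raise ValueError("grades must not be empty")
    ordered = sorted(grades)           # the median needs the grades in order of size
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]            # odd count: single middle grade
    return (ordered[mid - 1] + ordered[mid]) / 2   # even count: average of two


if __name__ == "__main__":
    print(median([7, 1, 3]))
    print(median([4, 1, 3, 2]))
```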
Mode of grades
The mode of grades is the grade that occurs most frequently. This type of aggregation is most useful where the grades are not measured as numerical values. It is sometimes considered better than the mean because it is not affected by outlier grades that lie far from the mean value.
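The mode can be sketched with the standard library's `collections.Counter`. The action labels and numbers below are hypothetical; when several grades tie, this sketch simply returns the one that was seen first.

```python
from collections import Counter


def mode(grades):
    """Return the most frequent grade (first seen wins on ties)."""
    if not grades:
        raise ValueError("grades must not be empty")
    return Counter(grades).most_common(1)[0][0]


if __name__ == "__main__":
    # Works on non-numerical grades, e.g. hypothetical action labels.
    print(mode(["left", "right", "left"]))
    # An outlier (50) shifts the mean but leaves the mode unchanged.
    readings = [1, 1, 50]
    print(mode(readings), sum(readings) / len(readings))
```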
Sum of grades
A simple method of aggregation is summing all the grades. In this method, scaling of the grades can be ignored. This type of aggregation does not convert the grades to percentages; their raw values are kept.
Research Statement
The research statement for the proposed study is as follows:
“To find new ways of homing in robots using reinforcement learning and neural networks”
The methodology adopted in this study includes the development of an Aggregated Multiple Reinforcement Learning System based on two distinct reinforcement learning methods. The final result will be obtained by combining the results of the two methods through an appropriate aggregation method. The suitability of each aggregation method will be analyzed in order to arrive at the aggregation method that gives optimum results for the project. Probability maps and convergence methods will be used to improve the efficiency of the new method formulated in the proposed study.
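One way the combination step might look is sketched below. This is an assumption made for illustration, not the design of the proposed system: each learner is assumed to output a score per action, the scores are combined with a weighted mean, and the highest combined score is selected. The function names, action labels and scores are all hypothetical.

```python
def aggregate_action_scores(estimates, weights=None):
    """Combine per-action scores from several learners with a weighted mean.

    estimates: list of dicts mapping action -> score, one dict per learner.
    weights:   optional list with one weight per learner (equal if omitted).
    """
    if weights is None:
        weights = [1.0] * len(estimates)
    total_weight = sum(weights)
    actions = estimates[0].keys()
    return {
        action: sum(w * est[action] for est, w in zip(estimates, weights)) / total_weight
        for action in actions
    }


def select_action(estimates, weights=None):
    """Pick the action with the highest aggregated score."""
    combined = aggregate_action_scores(estimates, weights)
    return max(combined, key=combined.get)


if __name__ == "__main__":
    # Hypothetical outputs of a value-based learner and a policy-search learner.
    td_scores = {"left": 0.2, "forward": 0.7, "right": 0.1}
    ea_scores = {"left": 0.6, "forward": 0.3, "right": 0.1}
    print(select_action([td_scores, ea_scores]))          # equal weights
    print(select_action([td_scores, ea_scores], [1, 3]))  # trust the second learner more
```

Swapping the weighted mean for the median, mode or sum would change which learner dominates the decision, which is the kind of comparison the study proposes to carry out.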
References
Furmston, Thomas and David Barber. "Variational methods for Reinforcement Learning." JMLR Proceedings 9 (2010): 241-248.
http://pyrorobotics.com. "Robot Learning Using Neural Networks." n.d. 25 April 2015. <http://pyrorobotics.com/?page=Robot_20Learning_20using_20Neural_20Networks>.
Moodle. "Category aggregation." 16 September 2013. 25 April 2015. <https://docs.moodle.org/24/en/Category_aggregation>.
Moriarty, David E., Alan C. Schultz and John J. Grefenstette. "Evolutionary Algorithms for Reinforcement Learning." Journal of Artificial Intelligence Research 11 (1999): 241-276.
Pomerleau, Dean A. "Neural Network Vision for Robot Driving." n.d.