Reinforcement Learning

Reinforcement learning is a machine learning approach inspired by behaviorism. It is concerned with which actions an agent should take in an environment to obtain the highest total reward. Because the problem is so general, it is also studied in many other fields, such as game theory, control theory, operations research, information theory, simulation-based optimization and statistics.

Although it is not completely different from supervised and unsupervised learning methods, reinforcement learning mimics the way people learn.

People can learn by drawing on knowledge they have already acquired or by comparing a situation with what normally happens. In real life, from the moment we are born, we also learn by interacting with our environment, both on our own and through those around us, and by observing the results of these interactions. Returning to machine learning, its purpose is to produce intelligent programs, often referred to as agents, through a process of learning and change. Reinforcement learning (RL) is one approach that can be considered for this learning process.

In machine learning, the environment is often modeled as a Markov decision process (MDP). Reinforcement learning differs from supervised learning in that correct input/output pairs are not given and suboptimal actions are not explicitly corrected.
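As a rough illustration of what modeling an environment as a Markov decision process can look like, the sketch below writes one down as a plain Python table. The states, actions, probabilities and rewards are all hypothetical values chosen for the example (a battery-powered robot); they do not come from any particular source.

```python
import random

# Hypothetical two-state MDP for a battery-powered robot.
# TRANSITIONS[state][action] is a list of (probability, next_state, reward).
TRANSITIONS = {
    "high": {
        "search": [(0.7, "high", 1.0), (0.3, "low", 1.0)],
        "wait": [(1.0, "high", 0.2)],
    },
    "low": {
        # Searching on a low battery may drain it; the robot is then
        # carried back to the charger and penalized (illustrative numbers).
        "search": [(0.6, "low", 1.0), (0.4, "high", -3.0)],
        "recharge": [(1.0, "high", 0.0)],
    },
}


def step(state, action, rng=random):
    """Sample (next_state, reward) for taking `action` in `state`."""
    outcomes = TRANSITIONS[state][action]
    weights = [p for p, _, _ in outcomes]
    _, next_state, reward = rng.choices(outcomes, weights=weights, k=1)[0]
    return next_state, reward


print(step("high", "search"))
```

The key property is that the outcome of `step` depends only on the current state and the chosen action, not on how the robot got there.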

Unlike supervised machine learning and deep learning setups, reinforcement learning works with data that has no labels. Instead, there is an agent and an environment. The agent observes the environment and takes actions in it. The environment, in turn, gives the agent a reward as feedback. The agent's main goal in this process is to maximize the reward it receives.
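One common way to make "maximize the reward" precise is the discounted return, the sum of future rewards weighted by a discount factor. The symbols below ($G_t$, $R_t$, $\gamma$) follow a widely used textbook convention and are assumptions here, since the text above does not introduce any notation:

```latex
G_t = \sum_{k=0}^{\infty} \gamma^{k} R_{t+k+1}, \qquad 0 \le \gamma < 1
```

Here $R_{t+k+1}$ is the scalar reward received $k+1$ steps after time $t$. A smaller $\gamma$ makes the agent care less about rewards that are far in the future.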

The basic steps are as follows:

  • The agent observes an initial state.
  • The action to be taken is determined by a decision-making function. This function is called the policy.
  • The action is performed.
  • The agent receives a scalar reward (reinforcement) from the environment.
  • The state-action pair and the reward received for it are recorded.
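The steps above can be sketched as a short interaction loop. This is a minimal sketch under stated assumptions: `CorridorEnv`, `random_policy` and `run_episode` are hypothetical names invented for the example, and the environment is a toy corridor rather than anything described in the text.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: cells 0..4, reaching cell 4 pays reward 1."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position  # the initial state

    def step(self, action):
        # action is -1 (left) or +1 (right); the position is clipped to the corridor
        self.position = min(max(self.position + action, 0), self.length - 1)
        done = self.position == self.length - 1
        reward = 1.0 if done else 0.0
        return self.position, reward, done


def random_policy(state, rng):
    """A policy maps a state to an action; this one ignores the state and acts at random."""
    return rng.choice([-1, 1])


def run_episode(env, policy, rng, max_steps=1000):
    history = []                                      # recorded (state, action, reward) triples
    state = env.reset()                               # observe the initial state
    for _ in range(max_steps):
        action = policy(state, rng)                   # the policy determines the action
        next_state, reward, done = env.step(action)   # perform it, receive a scalar reward
        history.append((state, action, reward))       # record the state-action pair and reward
        state = next_state
        if done:
            break
    return history


history = run_episode(CorridorEnv(), random_policy, random.Random(0))
print(len(history), "steps, total reward", sum(r for _, _, r in history))
```

Nothing is learned in this loop yet. It only shows how observation, action, reward and recording fit together in one episode.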

During the training phase, the agent acts in the environment across a large number of scenarios. Based on the rewards and penalties these actions earn, it improves its own behavior and chooses actions that lead to higher rewards.
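To make this training process concrete, here is a minimal sketch using tabular Q-learning. Q-learning is one well-known reinforcement learning algorithm, chosen here only for illustration, since the text does not name a specific method. The corridor environment, the function names and the constants `ALPHA`, `GAMMA` and `EPSILON` are all assumptions made for the example.

```python
import random

N_STATES = 5       # hypothetical corridor with cells 0..4; cell 4 is the goal
ACTIONS = (-1, 1)  # move left or move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1  # illustrative learning rate, discount, exploration rate


def env_step(state, action):
    """Toy dynamics: a small penalty per move, reward 1 on reaching the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else -0.01
    return next_state, reward, done


def train(episodes=500, seed=0):
    rng = random.Random(seed)
    # q[(state, action)] estimates the long-run reward of taking `action` in `state`
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # epsilon-greedy: mostly use the current estimates, sometimes explore
            if rng.random() < EPSILON:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            next_state, reward, done = env_step(state, action)
            # Q-learning update: move the estimate toward reward + discounted best next value
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
            state = next_state
    return q


q = train()
greedy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
print(greedy)  # learned action for cells 0..3
```

After training, the greedy action in each cell should point toward the goal: penalties lower the value of wasted moves, while the reward at the goal propagates backward through the table.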

Reinforcement learning relies heavily on the concept of state. The state is used as input to the policy and the value function, and as both input and output of the model.

Reinforcement learning largely emerged from game-based scenarios. Today, however, reinforcement learning models are used to solve many other kinds of problems.




Reinforcement Learning Examples
Example 1:
When a chess player decides on a move, they plan possible moves and counter-moves, and they assess particular positions and moves with intuitive judgments.
Example 2:
A mobile robot decides whether to enter a new room to collect more garbage. This decision depends on the current charge level of its battery and on how quickly and easily it has been able to find a charger in the past.
Example 3:
In psychology, reinforcement learning is used to study how we make decisions and whether the consequences of those decisions enable us to learn.
Example 4:
In neuroscience, it is used to study which regions of the brain are involved and how these regions relate to each other.

