R. Sutton and A. Barto explained reinforcement learning in very simple terms [1]:

“Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them. In the most interesting and challenging cases, actions may affect not only the immediate reward but also the next situation and, through that, all subsequent rewards. These two characteristics – trial-and-error search and delayed reward – are the two most important distinguishing features of reinforcement learning.”

Q-learning, a reinforcement learning technique, learns an action-value function. This function gives the expected utility of taking an action in a given state and following a fixed policy afterward. In simpler terms, an agent using Q-learning learns a mapping that tells it which action to take in each state of the environment. This mapping can be viewed as a table, called a Q-table, with a row for each state of the agent and a column for each action the agent can perform in its environment. The value of each cell in the Q-table signifies how favorable an action is given that the agent is in a particular state. The agent therefore selects the best known action for its current state: arg max_a Q(s, a).

Every action taken by an agent affects the environment, which may result in a change of the agent's current state. Based on its action, the agent receives a reward (a real number) or a punishment (a negative reward). The agent uses these rewards to learn; its goal is to maximize the total reward it accumulates, by learning which actions are optimal in each state. Hence, the function that calculates the quality of a state-action combination is given by:

Q : S × A → ℝ

Initially, random values are set in the Q-table. Thereafter, each time the agent takes an action, it receives a reward, which in turn is used to update the values in the Q-table. The formula for updating the Q-table is given by:

Q(s, a) ← Q(s, a) + α · [ r + γ · max_a' Q(s', a') − Q(s, a) ]

where s is the current state, a the action taken, r the reward received, s' the resulting state, α the learning rate, and γ the discount factor.

The major advantages of Q-learning are that it is simple and that it supports dynamic online learning.
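To make this concrete, here is a minimal Python sketch of the tabular Q-learning loop described above. The environment interface (reset and step), the state and action counts, and the parameter values are illustrative assumptions, not tied to any particular game or library:

import random

def q_learning(env, n_states, n_actions, episodes=500,
               alpha=0.1, gamma=0.9, epsilon=0.1):
    # Start from arbitrary (here random) Q-values, as described above.
    Q = [[random.random() for _ in range(n_actions)] for _ in range(n_states)]
    for _ in range(episodes):
        s = env.reset()                      # hypothetical environment API
        done = False
        while not done:
            # Epsilon-greedy: usually exploit arg max_a Q(s, a), sometimes explore.
            if random.random() < epsilon:
                a = random.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda i: Q[s][i])
            s_next, r, done = env.step(a)    # hypothetical environment API
            # The update formula above: move Q(s, a) toward r + γ · max_a' Q(s', a').
            Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])
            s = s_next
    return Q

Calling q_learning(env, n_states=9, n_actions=4) would return the learned Q-table for a nine-state, four-action world, from which the agent simply picks arg max_a Q(s, a) at play time.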


[1] R. Sutton and A. Barto. Reinforcement Learning: An Introduction. The MIT Press, Cambridge, Massachusetts; London, England.


CWJess: Implementation of an Expert System Shell for Computing with Words

The human mind has a limited capability for processing the huge amount of detailed information in its environment; to compensate, the brain groups together the information it perceives by similarity, proximity, or functionality and assigns to each group a name, or a “word”, in natural language. This classification of information allows humans to perform complex tasks and make intelligent decisions in an inherently vague and imprecise environment, without any measurements or computation. Inspired by this human capability, Zadeh introduced the machinery of Computing with Words (CW) as a tool to formulate human reasoning with perceptions drawn from natural language, and argued that adding CW theory to the existing tools gives rise to theories with enhanced capabilities to deal with real-world problems and makes it possible to design systems with a higher level of machine intelligence [1][2]. To this end, CW offers two principal components: (1) a language for representing the meaning of words taken from natural language, called the Generalized Constraint Language (GCL), and (2) a set of deduction rules for computing and reasoning with words instead of numbers. CW is rooted in fuzzy logic; however, it offers a much more general methodology for fusing natural language propositions and computing with fuzzy variables. CW inference rules are drawn from various fuzzy domains, such as fuzzy logic, fuzzy arithmetic, fuzzy probability, and fuzzy syllogism. This paper reports preliminary work on the implementation of a CW inference system on top of the JESS expert system shell (CWJess). The CW reasoning is fully integrated with JESS facts and the JESS inference engine, and allows knowledge to be specified in terms of GCL assertions.
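CWJess itself expresses such knowledge as GCL assertions over JESS facts; as a rough, language-neutral illustration of the underlying idea (a word from natural language acting as a fuzzy constraint on a variable), consider this small Python sketch, in which the word "tall" and all numeric values are invented for illustration:

def tall(height_cm):
    # Fuzzy membership function for the word "tall" (illustrative numbers).
    if height_cm <= 160:
        return 0.0
    if height_cm >= 190:
        return 1.0
    return (height_cm - 160) / 30.0

# The generalized constraint "Height(John) is tall" then holds to a degree,
# rather than being crisply True or False:
print(tall(175))        # 0.5
# Zadeh's hedge "very" is commonly modeled as squaring the membership degree,
# so "John is very tall" holds to degree 0.25 here:
print(tall(175) ** 2)   # 0.25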


Tuning Computer Gaming Agents using Q-Learning

The aim of the intelligent techniques used in computer video games, termed game AI, is to provide interesting and challenging game play to the player. Being highly sophisticated, these games present game developers with the same kinds of requirements and challenges as those faced by the academic AI community. Game companies claim to use sophisticated game AI to model artificial characters such as computer game bots: intelligent, realistic AI agents. However, these bots work via simple routines pre-programmed to suit the game map, game rules, game type, and other parameters unique to each game. Mostly, the illusion of intelligent behavior is programmed using simple conditional statements and is hard-coded into the bots' logic. Moreover, a game programmer has to spend considerable time configuring crisp inputs for these conditional statements. We therefore see a need for machine learning techniques that dynamically improve bots' behavior and save precious programmer man-hours. We selected Q-learning, a reinforcement learning technique, to evolve dynamic intelligent bots, as it is a simple, efficient, online learning algorithm. Machine learning techniques such as reinforcement learning are known to be intractable if they use a detailed model of the world, and they also require tuning of various parameters to give satisfactory performance. Therefore, this paper examines Q-learning for evolving a few basic behaviors for computer game bots, viz. learning to fight and planting the bomb. Furthermore, we experimented with how bots can use knowledge learned from abstract models to evolve their behavior in a more detailed model of the world.
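As an illustration of what an "abstract model of the world" can mean here, the following Python fragment sketches one hypothetical way to discretize a bot's situation and shape its reward for the fighting behavior; the features, thresholds, and reward scheme are invented for illustration, not taken from the paper:

# Coarse state abstraction for the "learning to fight" behavior: bucket the
# bot's health and its distance to the enemy into three levels each, so the
# Q-table stays small (9 states) and learning remains tractable.
def fight_state(health, distance):
    health_level = 0 if health < 34 else (1 if health < 67 else 2)
    range_level = 0 if distance < 5 else (1 if distance < 15 else 2)
    return health_level * 3 + range_level   # row index into the Q-table

# Hypothetical reward shaping: reward damage dealt, punish damage taken.
def fight_reward(damage_dealt, damage_taken):
    return damage_dealt - damage_taken

A more detailed world model would simply use finer buckets (or more features), at the cost of a larger Q-table and slower learning, which is the trade-off the experiments above explore.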


Designing BOTs with BDI Agents

In modern computer games, ‘bots’, i.e. intelligent, realistic agents, play a prominent role in the success of a game in the market. Typically, bots are modeled using finite-state machines and then programmed via simple conditional statements that are hard-coded into the bots’ logic. Since these bots have become quite predictable to an experienced game player, she might lose her interest in the game. We present a model of bots using BDI agents, which show more human-like, more believable behavior and provide a more realistic feel to the game. These bots use inputs from actual game players to specify their Beliefs, Desires, and Intentions during game play.
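As a rough sketch of how a BDI-style bot is structured differently from hard-coded conditionals, the following Python fragment runs a minimal belief-revision and deliberation cycle; the specific beliefs, desires, and plans are invented placeholders rather than the authors' actual model:

class BDIBot:
    def __init__(self):
        # Beliefs: the bot's current picture of the world, revised each step.
        self.beliefs = {"enemy_visible": False, "low_health": False}
        # Plans: how to pursue each desire once it becomes an intention.
        self.plans = {
            "survive": ["retreat", "find_medkit"],
            "eliminate_enemy": ["aim", "shoot"],
        }

    def deliberate(self):
        # Choose a desire consistent with the current beliefs.
        if self.beliefs["low_health"]:
            return "survive"
        if self.beliefs["enemy_visible"]:
            return "eliminate_enemy"
        return None

    def step(self, percepts):
        self.beliefs.update(percepts)          # revise beliefs from percepts
        intention = self.deliberate()          # commit to a desire
        return self.plans.get(intention, ["patrol"])  # act on its plan

bot = BDIBot()
print(bot.step({"enemy_visible": True}))       # ['aim', 'shoot']

The point of the structure is that beliefs, desires, and plans are explicit data that can be filled in from actual players' inputs, rather than logic buried in conditionals.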


Enable URL Rewrite and SEO-Friendly URLs in Joomla

Although simple, URL rewriting in Joomla can be a pain in the neck. For Joomla URL rewriting to work, Apache must be configured to allow URL rewriting via .htaccess (the mod_rewrite module). If you are not sure whether it is enabled, contact your web server administrator. You can even test it yourself; see http://www.addedbytes.com/for-beginners/url-rewriting-for-beginners/ .

If you manage the server yourself, enable it manually. If you are not sure how, follow the instructions at http://www.jarrodoberto.com/articles/2011/11/enabling-mod-rewrite-on-ubuntu .
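On Ubuntu and other Debian-based systems, enabling the module usually comes down to the following two commands (this assumes the standard a2enmod helper; adjust for your distribution):

sudo a2enmod rewrite
sudo service apache2 restart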

Now go to the Joomla installation folder and rename htaccess.txt to .htaccess:

mv htaccess.txt .htaccess

Open .htaccess and uncomment the following line (find it and remove the # in front of it):

RewriteBase /

If Joomla is not installed in your root folder, use instead:

RewriteBase /<Joomla Directory>

Now go to Global Configuration in the admin panel and enable:

Search Engine Friendly URLs
Use URL Rewriting

If necessary, restart the Apache server. Then try navigating your website; the URLs should no longer show index.php.
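For example, a page that previously appeared as http://example.com/index.php/about-us should now be reachable at http://example.com/about-us (with example.com standing in for your own domain).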
