How it works

The Artificial Neural Network

Aika is a neural network simulation algorithm that uses neurons to represent a wide variety of linguistic concepts and connects them via synapses. In contrast to most other neural network architectures, Aika does not employ layers to structure its network. Synapses can connect arbitrary neurons with each other, but the network is not fully connected either.
As in other artificial neural networks (ANNs), the synapses are weighted. By choosing the weights and the threshold (i.e. the bias) accordingly, neurons can take on the characteristics of boolean logic gates such as an AND gate or an OR gate. To compute the activation value of a neuron, the weighted sum over its input synapses is computed. Then the bias value \(b\) is added to this sum and the result is passed through an activation function \(\varphi\).

$$net_j = {b_j + \sum\limits_{i=0}^N{x_i w_{ij}}}$$
$$y_j = \varphi (net_j)$$

Depending on the type of neuron, different activation functions are used. One commonly used activation function in Aika is the rectified hyperbolic tangent function, which is basically the positive half of the \(\tanh()\) function.

\[\varphi(x) = \begin{cases} 0 & : x \leq 0 \\ \tanh(x) & : x > 0 \end{cases}\]
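The two formulas above can be sketched in a few lines of Java. This is a minimal illustration, not Aika's implementation: the class `NeuronSketch`, its method names, and the concrete weights and bias values are assumptions chosen to show how the bias turns the same pair of weights into an AND-like or an OR-like neuron.

```java
// Illustrative sketch; class and method names are hypothetical, not Aika's API.
class NeuronSketch {

    // Rectified hyperbolic tangent: 0 for x <= 0, tanh(x) for x > 0.
    static double rectifiedTanh(double x) {
        return x <= 0.0 ? 0.0 : Math.tanh(x);
    }

    // net_j = b_j + sum_i (x_i * w_ij)
    static double net(double bias, double[] inputs, double[] weights) {
        double sum = bias;
        for (int i = 0; i < inputs.length; i++) {
            sum += inputs[i] * weights[i];
        }
        return sum;
    }

    // y_j = phi(net_j)
    static double activate(double bias, double[] inputs, double[] weights) {
        return rectifiedTanh(net(bias, inputs, weights));
    }

    public static void main(String[] args) {
        double[] weights = {10.0, 10.0};
        double andBias = -15.0; // assumed value: both inputs are needed to lift net above 0
        double orBias = -5.0;   // assumed value: a single active input is enough

        double[] oneActive = {1.0, 0.0};
        double[] bothActive = {1.0, 1.0};

        System.out.println("AND, one input active:   " + activate(andBias, oneActive, weights));
        System.out.println("AND, both inputs active: " + activate(andBias, bothActive, weights));
        System.out.println("OR,  one input active:   " + activate(orBias, oneActive, weights));
    }
}
```

With these assumed values the AND-like neuron stays at 0 when only one input is active, because its net input is negative, while the OR-like neuron already fires.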

The activation functions are chosen in a way that they clearly distinguish between active and inactive neurons. Only activated neurons are processed. These activations are expressed not only by a real-valued number but also by an activation object.


The advantage of having activation objects is that, through them, Aika is able to cope with the relational structure of natural language text by making the activation relate to a specific segment of text. In a way these activations can be seen as text annotations that specify their start and end character.
Words, phrases and sentences are related to each other through their sequential order. The assignment of text ranges and word positions to activations is a simple yet powerful representation of the relational structure of text and avoids some of the shortcomings of other representations such as bag of words or sliding window. Since the activations are propagated through the network, synapses need to be able to manipulate the text range and the word position while the activations are passed on to the next neuron.
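To make the idea of an activation as a text annotation concrete, here is a minimal sketch. The class `ActivationSketch` and its fields are hypothetical and not taken from Aika; in particular, treating `begin` as inclusive and `end` as exclusive is an assumption made so that the range maps directly onto `String.substring`.

```java
// Hypothetical type for illustration; not Aika's actual activation class.
class ActivationSketch {
    final String neuronLabel; // which neuron fired
    final int begin;          // first character of the annotated segment (assumed inclusive)
    final int end;            // position after the last character (assumed exclusive)
    final double value;       // real-valued activation

    ActivationSketch(String neuronLabel, int begin, int end, double value) {
        this.neuronLabel = neuronLabel;
        this.begin = begin;
        this.end = end;
        this.value = value;
    }

    // Returns the segment of the text this activation refers to.
    String coveredText(String text) {
        return text.substring(begin, end);
    }

    public static void main(String[] args) {
        String text = "jackson cook";
        ActivationSketch forename = new ActivationSketch("forename", 0, 7, 0.9);
        ActivationSketch surname = new ActivationSketch("surname", 8, 12, 0.9);

        System.out.println(forename.neuronLabel + ": " + forename.coveredText(text));
        System.out.println(surname.neuronLabel + ": " + surname.coveredText(text));
    }
}
```

Because each activation carries its own range, two activations of the same neuron at different positions in the text remain distinguishable, which a bag-of-words representation cannot express.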


One common problem when processing text is that of cyclic dependencies. In the example 'jackson cook' it is impossible to decide which word has to be resolved first, the forename or the surname, since both depend on each other. The word jackson can be recognized as a forename when the next word is a surname and the word cook can be recognized as a surname when the previous word is a forename. To tackle this problem Aika employs non-monotonic logic and is therefore able to derive multiple interpretations that are mutually exclusive. These interpretations are then weighted and only the strongest interpretation is returned as the result.
Consider the following network, which is able to determine whether a word that has been recognized in a text is a forename, surname, city name, or profession. If, for instance, the word "jackson" has been recognized in a text, it will trigger further activations in the two jackson entity neurons. Since both are connected through the inhibiting neuron, only one of them can be active in the end. Aika will therefore generate two interpretations. But these interpretations are not limited to a single neuron. For instance, if the word neuron cook gets activated too, the jackson forename entity and the cook surname entity will be part of the same interpretation. The forename and surname category links in this example form a positive feedback loop which reinforces this interpretation.

New interpretations are spawned if both the input and the output of a negative recurrent synapse get activated. In this case a conflict is generated. An interpretation is a conflict-free set of activations. Therefore, if there are no conflicts during the processing of an input data set, only one interpretation will exist and the search for the best interpretation will end immediately. On the other hand, if there are conflicts between activations, a search needs to be performed which selects or excludes individual activations and tests how these changes affect the overall weight sum of all activations. This sum is also called the objective function \(f\) and can be stated in the following way:

$$f = \sum\limits_{j \in Acts}{\min (-g_j, net_j)}$$
$$g_j = \sum\limits_{i = 0, w_{ij} < 0, w_{ij} \in Recurrent}^N{w_{ij}}$$

The value \(g_j\) is simply the sum of the weights of all negative feedback synapses of neuron \(j\). The intention behind this objective function is to measure a neuron's ability to overcome the inhibiting input signals of other neurons.
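The objective function can be evaluated directly from its definition. The following sketch is an illustration under stated assumptions: the classes `Synapse` and `Activation` are hypothetical stand-ins, the numeric values are made up, and the search itself is reduced to comparing two already given candidate interpretations.

```java
// Illustrative sketch of the objective function; all types and values are hypothetical.
class ObjectiveSketch {

    static final class Synapse {
        final double weight;
        final boolean recurrent;

        Synapse(double weight, boolean recurrent) {
            this.weight = weight;
            this.recurrent = recurrent;
        }
    }

    static final class Activation {
        final double net;              // net_j of the activated neuron
        final Synapse[] inputSynapses; // input synapses of that neuron

        Activation(double net, Synapse[] inputSynapses) {
            this.net = net;
            this.inputSynapses = inputSynapses;
        }
    }

    // g_j: sum of the weights of all negative recurrent input synapses.
    static double negativeFeedbackSum(Synapse[] synapses) {
        double g = 0.0;
        for (Synapse s : synapses) {
            if (s.recurrent && s.weight < 0.0) {
                g += s.weight;
            }
        }
        return g;
    }

    // f = sum over all activations j of min(-g_j, net_j)
    static double objective(Activation[] interpretation) {
        double f = 0.0;
        for (Activation act : interpretation) {
            f += Math.min(-negativeFeedbackSum(act.inputSynapses), act.net);
        }
        return f;
    }

    public static void main(String[] args) {
        Synapse excitation = new Synapse(10.0, false);
        Synapse inhibition = new Synapse(-20.0, true);
        Synapse[] inputs = {excitation, inhibition};

        Activation[] first = {new Activation(3.0, inputs), new Activation(25.0, inputs)};
        Activation[] second = {new Activation(1.5, inputs)};

        double fFirst = objective(first);
        double fSecond = objective(second);
        System.out.println("f(first)  = " + fFirst);
        System.out.println("f(second) = " + fSecond);
        System.out.println("stronger interpretation: " + (fFirst >= fSecond ? "first" : "second"));
    }
}
```

Note how the `min` caps each activation's contribution at `-g_j`: a net input larger than the total inhibition it has to overcome adds nothing further to the score.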


Synapses in Aika consist not just of a weight value but also of properties that specify relations between synapses. These relations can be used to impose a constraint either on the matching input text ranges or on the dependency structure of the input activations. Biological neurons seem to achieve such relation matching through the timing of the firing patterns of their action potentials.
The complete list of synapse properties looks as follows:

                new Synapse.Builder()
                new Synapse.Builder()
                new Relation.Builder()
                        .setRelation(new Equals(END, BEGIN)),
                new Relation.Builder()
                        .setRelation(new Equals(BEGIN, BEGIN)),
                new Relation.Builder()
                        .setRelation(new Equals(END, END))

Unlike other ANNs, Aika allows a bias value to be specified per synapse. These bias values are simply summed up and added to the neuron's bias value.
The property recurrent states whether this synapse is a feedback loop or not. Depending on the weight of this synapse, such a feedback loop might be either positive or negative. This property is an important indicator for the interpretation search.
Range relations define a relation either to another synapse or to the output range of this activation. The range relations consist of four comparison operations between the begin and end of the current synapse's input activation and the begin and end of the linked synapse's input activation.

The next property is the range output which consists of two boolean values. If these are set to true, the range begin and the range end are propagated to the output activation.
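The range relation and range output properties can be pictured with a small sketch. Everything here is hypothetical: the class `RangeRelationSketch`, the `holds` and `applyRangeOutput` helpers, and the example ranges are illustrations that only mirror the idea behind `new Equals(END, BEGIN)`, not Aika's actual classes. The example also assumes that adjacent words share a boundary position, for instance because the separator is counted as part of the first range.

```java
// Illustrative sketch; types and helpers are hypothetical, not Aika's API.
class RangeRelationSketch {

    enum Position { BEGIN, END }

    static final class Range {
        final int begin;
        final int end;

        Range(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }

        int get(Position p) {
            return p == Position.BEGIN ? begin : end;
        }
    }

    // Equality relation: position `from` of range a must equal position `to` of range b.
    static boolean holds(Position from, Position to, Range a, Range b) {
        return a.get(from) == b.get(to);
    }

    // Range output: the two boolean flags of a synapse decide whether the begin
    // and/or the end of its input range is copied to the output range.
    static Range applyRangeOutput(Range output, Range input, boolean beginFlag, boolean endFlag) {
        int begin = beginFlag ? input.begin : output.begin;
        int end = endFlag ? input.end : output.end;
        return new Range(begin, end);
    }

    public static void main(String[] args) {
        Range first = new Range(0, 8);   // assumed range of the first word
        Range second = new Range(8, 12); // assumed range of the second word

        // The end of the first input must coincide with the begin of the second.
        System.out.println("adjacent: " + holds(Position.END, Position.BEGIN, first, second));

        // The first synapse propagates its begin, the second synapse its end.
        Range output = new Range(-1, -1); // -1 marks a position that is not set yet
        output = applyRangeOutput(output, first, true, false);
        output = applyRangeOutput(output, second, false, true);
        System.out.println("output range: " + output.begin + ".." + output.end);
    }
}
```

In this sketch the output activation ends up spanning both input ranges, which is how a phrase-level activation could cover the words it was derived from.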
The last two properties are used to establish a dependency structure among activations. The instance relation here defines whether this synapse and the linked synapse have a common ancestor or depend on each other in either direction. During the evaluation of this relation only those synapses are followed which are marked with the identity flag.
Depending on the weight of the synapse and the bias sum of the outgoing neuron, Aika distinguishes two types of synapses: conjunctive synapses and disjunctive synapses. Conjunctive synapses are stored in the output neuron as inputs and disjunctive synapses are stored in the input neuron as outputs. The reason for this is that disjunctive neurons like the inhibitory neuron or the category neurons may have a huge number of input synapses, which makes it expensive to load them from disk. By storing those synapses on the input neuron side, only those synapses that are needed by a currently activated input neuron have to stay in memory.

Neuron Types

There are two main types of neurons in Aika: excitatory neurons and inhibitory neurons. The biological models for those neurons are the spiny pyramidal cell and the aspiny stellate cell in the cerebral cortex. The pyramidal cells usually exhibit an excitatory characteristic and some of them possess long-ranging axons that connect to other parts of the brain. The stellate cells on the other hand are usually inhibitory interneurons with short axons which form circuits with nearby neurons.
Those two types of neurons also have a different electrical signature. Stellate cells usually react to a constant depolarising current by firing action potentials. This occurs with a relatively constant frequency during the entire stimulus. In contrast, most pyramidal cells are unable to maintain a constant firing rate. Instead, they fire quickly at the beginning of the stimulus and then reduce the frequency even if the stimulus stays strong. This slowdown over time is called adaptation.
Aika tries to mimic this behaviour by using different activation functions for the different types of neurons. Since Aika is not a spiking neural network like its biological counterpart, we only have the neuron's activation value, which can roughly be interpreted as the firing frequency of a spiking neuron. In a sense, the activation function based on the rectified tanh function described earlier captures the adaptation behaviour of a pyramidal cell quite nicely. An increase of a weak signal has a strong effect on the neuron's output, while an increase of an already strong signal has almost no effect. Furthermore, if the input of the neuron does not surpass a certain threshold, the neuron will not fire at all. For inhibitory neurons Aika uses the rectified linear unit function (ReLU).

$$y = \max(0, x)$$

Especially for strongly disjunctive neurons like the inhibitory neuron, ReLU has the advantage of propagating its input signal exactly as it is, without distortion or loss of information.
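The difference between the two activation functions can be checked numerically. The sketch below is only an illustration; the class name `ActivationFunctionSketch` and the sample inputs are assumptions. It compares how much the output grows when a weak and a strong input are each increased.

```java
// Illustrative comparison of the two activation functions; names are hypothetical.
class ActivationFunctionSketch {

    // Rectified tanh, as described for excitatory neurons.
    static double rectifiedTanh(double x) {
        return x <= 0.0 ? 0.0 : Math.tanh(x);
    }

    // ReLU, as described for inhibitory neurons: y = max(0, x).
    static double relu(double x) {
        return Math.max(0.0, x);
    }

    public static void main(String[] args) {
        // Increase of a weak signal versus increase of an already strong signal.
        double weakGain = rectifiedTanh(0.5) - rectifiedTanh(0.1);
        double strongGain = rectifiedTanh(5.0) - rectifiedTanh(4.0);
        System.out.println("rectified tanh, weak signal gain:   " + weakGain);
        System.out.println("rectified tanh, strong signal gain: " + strongGain);

        // ReLU passes positive inputs through unchanged and blocks negative ones.
        System.out.println("relu(3.5)  = " + relu(3.5));
        System.out.println("relu(-2.0) = " + relu(-2.0));
    }
}
```

The rectified tanh saturates near 1, so the strong-signal gain is tiny compared with the weak-signal gain, whereas ReLU keeps the magnitude of any positive input intact.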