October 09, 2020
Work in progress

How information is propagated through the network

General Architecture

In the Aika neural network, neurons and their activations are represented separately. Thus each neuron may have an arbitrary number of activations. For example, if the input data set is a text and a neuron represents a specific word, then there may be several occurrences of this word and therefore several activations of this neuron. The human brain probably achieves something similar through the timing of activation spikes.
These activations represent the information that the neural network was able to infer about the input data set. Every activation is directly or indirectly grounded in the input data set: by following the input links of an activation down to the input layer of the network, one can determine the atomic input information this activation refers to, similar to an annotation.
Since we have a one-to-many relation between neurons and activations, we also need a one-to-many relation between the synapses and the links between the activations. Therefore we need a linking process to determine which activations are going to be connected with which others. Roughly speaking, activations can be linked if they are grounded in the same input data.
A consequence of separating the neurons and their activations is that we cannot rely on the network topology to give us the chronological sequence in which each activation is processed. Therefore each activation needs a fired timestamp that describes the point in time at which this activation becomes visible to other neurons. In contrast to conventional neural networks, with their predefined layered architecture, the Aika network starts out empty. Neurons and synapses are added during training. There is an underlying set of rules which determine when certain types of neurons or synapses are induced.
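To make this separation more concrete, here is a minimal Java sketch using hypothetical class names (not the actual Aika API): a neuron holds an arbitrary number of activation objects, each grounded in a span of the input and carrying its own fired timestamp.

```java
// Minimal sketch with hypothetical class names, not the actual Aika API:
// a neuron can have any number of activations, each grounded in a region
// of the input and carrying its own "fired" timestamp.
import java.util.ArrayList;
import java.util.List;

class Neuron {
    String label;                                      // e.g. the word this neuron represents
    List<Activation> activations = new ArrayList<>();

    Neuron(String label) { this.label = label; }

    Activation activate(int inputBegin, int inputEnd, long firedTimestamp) {
        Activation act = new Activation(this, inputBegin, inputEnd, firedTimestamp);
        activations.add(act);                          // one neuron, many activations
        return act;
    }
}

class Activation {
    Neuron neuron;
    int inputBegin, inputEnd;                          // grounding in the input data set
    long fired;                                        // when this activation becomes visible to other neurons
    List<Activation> inputLinks = new ArrayList<>();   // per-activation links mirroring the synapses

    Activation(Neuron n, int begin, int end, long fired) {
        this.neuron = n;
        this.inputBegin = begin;
        this.inputEnd = end;
        this.fired = fired;
    }
}
```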
For the network to be able to scale to a sufficient number of neurons and still be computed efficiently, we need to restrict the number of synapses that are induced or even considered to be induced.

The weighted sum and the activation function

As in other artificial neural networks, the synapses are weighted. To compute the activation value of a neuron, the weighted sum over its input synapses is computed. Then the bias value \(b\) is added to this sum and the result is sent through an activation function \(\varphi\).

$$net_j = b_j + \sum\limits_{i=0}^N{x_i w_{ij}}$$

$$y_j = \varphi (net_j)$$

Depending on the type of neuron, different activation functions are used. One commonly used activation function in the Aika network is the rectified hyperbolic tangent function, which is basically the positive half of the \(\tanh()\) function.

$$\varphi(x) = \begin{cases} 0 & : x \leq 0 \\ \tanh(x) & : x > 0 \end{cases}$$
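The following small Java sketch (illustrative only, not the library code) shows this computation: the weighted sum over the input values plus the bias, passed through the rectified tanh function.

```java
// Sketch of the forward computation described above (illustrative only):
// net_j = b_j + sum_i x_i * w_ij, then y_j = phi(net_j) with the
// rectified tanh as activation function.
public class WeightedSum {

    // Rectified hyperbolic tangent: 0 for x <= 0, tanh(x) otherwise.
    static double rectifiedTanh(double x) {
        return x <= 0.0 ? 0.0 : Math.tanh(x);
    }

    static double activationValue(double bias, double[] inputs, double[] weights) {
        double net = bias;
        for (int i = 0; i < inputs.length; i++) {
            net += inputs[i] * weights[i];             // weighted sum over the input synapses
        }
        return rectifiedTanh(net);
    }

    public static void main(String[] args) {
        double[] x = {1.0, 0.8};
        double[] w = {0.6, 0.5};
        // With a negative bias acting as a threshold, weak inputs stay below zero
        // and the neuron does not fire at all.
        System.out.println(activationValue(-0.7, x, w)); // ~tanh(0.3) ~ 0.29
    }
}
```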

The activation functions are chosen such that they clearly distinguish between active and inactive neurons. Only activated neurons are processed and are thus visible to other neurons. These activations are expressed not only as a real-valued number but also as an activation object.

Neuron Types

By choosing the weights and the threshold (i.e. the bias) accordingly, neurons can take on the characteristics of boolean logic gates such as an and-gate or an or-gate. Neurons also need to be able to act more like formal logic. Their ability to integrate weak input signals is certainly important, and it is something that classical AI has great difficulties with, but there is also a need for strongly conjunctive or disjunctive behavior. These characteristics can be achieved simply by setting the synapse weights and the bias value appropriately, but the resulting neurons are sufficiently different to justify the introduction of distinct neuron types. I believe that the human brain does something very similar with its excitatory pyramidal neurons and its inhibitory stellate neurons.

This is also where I see a great similarity to capsule networks. Basically, there are three types of neurons: pattern neurons, pattern part neurons, and inhibitory neurons. The pattern neurons and the pattern part neurons are both conjunctive in nature, while the inhibitory neuron is disjunctive. The pattern part neurons describe which lower-level patterns are part of the current pattern. Each pattern part neuron receives a positive feedback synapse from the pattern to which it belongs. The pattern neuron, on the other hand, is only activated if the pattern as a whole is detected. For example, if we consider a word consisting of individual letters as lower-level patterns, then we have a corresponding pattern part neuron for each letter, whose meaning is that this letter occurred as part of this word. The pattern part neurons also receive negative feedback synapses from the inhibitory neurons, so that competing patterns are able to suppress each other.
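As a rough illustration of this wiring, here is a hedged Java sketch with made-up class names: a word pattern neuron, one pattern part neuron per letter, positive feedback synapses from the pattern back to its parts, and negative feedback synapses from an inhibitory neuron.

```java
// Hedged sketch of the three neuron types and their feedback wiring
// (names are illustrative, not the actual Aika classes).
import java.util.ArrayList;
import java.util.List;

enum NeuronType { PATTERN, PATTERN_PART, INHIBITORY }

class TypedNeuron {
    String label;
    NeuronType type;
    List<Synapse> inputs = new ArrayList<>();

    TypedNeuron(String label, NeuronType type) { this.label = label; this.type = type; }
}

class Synapse {
    TypedNeuron input, output;
    double weight;                 // > 0: excitatory, < 0: negative feedback
    boolean feedback;              // feedback synapses ignore the causal firing order

    Synapse(TypedNeuron in, TypedNeuron out, double weight, boolean feedback) {
        this.input = in; this.output = out;
        this.weight = weight; this.feedback = feedback;
        out.inputs.add(this);
    }
}

class WordExample {
    public static void main(String[] args) {
        // Word pattern "the" with one pattern part neuron per letter.
        TypedNeuron word = new TypedNeuron("W-the", NeuronType.PATTERN);
        TypedNeuron inhib = new TypedNeuron("I-words", NeuronType.INHIBITORY);
        for (String letter : new String[]{"t", "h", "e"}) {
            TypedNeuron part = new TypedNeuron("PP-the-" + letter, NeuronType.PATTERN_PART);
            new Synapse(part, word, 1.0, false);   // the letter part feeds the word pattern
            new Synapse(word, part, 1.0, true);    // positive feedback from the pattern to its part
            new Synapse(inhib, part, -1.0, true);  // negative feedback: competing patterns suppress each other
        }
        new Synapse(word, inhib, 1.0, false);      // the inhibitory neuron collects competing patterns
    }
}
```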

Network Topology

Positive and Negative Feedback Loops

Another crucial insight is the need for positive and negative feedback loops. These are synapses that ignore the causal sequence of fired activations. The negative feedback synapses are especially interesting because they require the introduction of mutually shielded branches for the following activations. They create independent interpretations of parts of the input data set, and only later is it decided which of these interpretations gets selected. This is very similar to what a parse tree does, except that a parse tree is limited to syntactic information. Another way to relate it to classical logic is to consider it as non-monotonic inference, which classical AI could not solve properly, since classical logic is missing the weak influences that neural networks are able to capture.
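The following sketch is purely illustrative and greatly simplified compared to the actual branch handling: competing interpretations of the same span of input are evaluated independently, and afterwards the branch with the highest accumulated activation value is selected.

```java
// Very rough sketch of interpretation selection under negative feedback
// (purely illustrative; the real branch handling is more involved):
// mutually exclusive branches are evaluated independently and the one with
// the highest accumulated activation value is selected afterwards.
import java.util.Comparator;
import java.util.List;

record Branch(String interpretation, double accumulatedValue) {}

class BranchSelection {
    static Branch select(List<Branch> branches) {
        return branches.stream()
                .max(Comparator.comparingDouble(Branch::accumulatedValue))
                .orElseThrow();
    }

    public static void main(String[] args) {
        // Two competing readings of the same span of input text.
        List<Branch> branches = List.of(
                new Branch("proper noun", 1.4),
                new Branch("common noun", 0.9));
        System.out.println(select(branches).interpretation()); // proper noun
    }
}
```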

Neuron Types

There are two main types of neurons in Aika: excitatory neurons and inhibitory neurons. The biological role models for those neurons are the spiny pyramidal cell and the aspiny stellate cell in the cerebral cortex. The pyramidal cells usually exhibit an excitatory characteristic and some of them possess long-ranging axons that connect to other parts of the brain. The stellate cells, on the other hand, are usually inhibitory interneurons with short axons which form circuits with nearby neurons.
Those two types of neurons also have a different electrical signature. Stellate cells usually react to a constant depolarising current by firing action potentials. This occurs with a relatively constant frequency during the entire stimulus. In contrast, most pyramidal cells are unable to maintain a constant firing rate. Instead, they fire quickly at the beginning of the stimulus and then reduce their frequency even if the stimulus stays strong. This slowdown over time is called adaptation.
Aika tries to mimic this behaviour by using different activation functions for the different types of neurons. Since Aika is not a spiking neural network like its biological counterpart, we only have the neuron's activation value, which can roughly be interpreted as the firing frequency of a spiking neuron. In a sense, the activation function described earlier, based on the rectified tanh function, quite nicely captures the adaptation behaviour of a pyramidal cell. An increase in a weak signal has a strong effect on the neuron's output, while an increase in an already strong signal has almost no effect. Furthermore, if the input of the neuron does not surpass a certain threshold, then the neuron will not fire at all. For inhibitory neurons Aika uses the rectified linear unit function (ReLU).

$$y = \max(0, x)$$

Especially for strongly disjunctive neurons like the inhibitory neuron, ReLU has the advantage of propagating its input signal exactly as it is, without distortion or loss of information.
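To summarise the per-type choice of activation function, here is a small illustrative Java snippet (the boolean type flag is an assumption made for the sake of the example): excitatory neurons use the rectified tanh, inhibitory neurons use ReLU, which forwards a positive input value unchanged.

```java
// Sketch of the per-type choice of activation function described above
// (illustrative; the boolean flag stands in for the neuron type).
class ActivationFunctions {
    static double rectifiedTanh(double x) { return x <= 0.0 ? 0.0 : Math.tanh(x); }
    static double relu(double x)          { return Math.max(0.0, x); }

    static double apply(boolean inhibitory, double net) {
        return inhibitory ? relu(net) : rectifiedTanh(net);
    }

    public static void main(String[] args) {
        // The excitatory function saturates (adaptation-like behaviour),
        // while the inhibitory ReLU forwards a strong input undistorted.
        System.out.println(apply(false, 2.0)); // ~0.964 (saturating)
        System.out.println(apply(true, 2.0));  // 2.0 (unchanged)
    }
}
```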