[libann-users] Proposed libann directions


From: John Darrington
Subject: [libann-users] Proposed libann directions
Date: Sat, 5 Oct 2002 12:13:46 +0800
User-agent: Mutt/1.3.28i

This is a cross-posting to libann-users, libann-dev and a few
interested individuals.  Please forgive me if you receive this
message more than once; I think it has relevance to both forums.

Attached is a UML diagram of a proposed redesign/extension of
Libann.  I'm not a UML guru, so there may be gross misuse of the
language.  (That said, I think the merits of UML are greatly
overstated, but that's a topic for a different mailing list.)
Anyway, the diagram should give you an insight into what I have in
mind.

In making this re-design, I have tried to concentrate on the
following:

* Applications --- The library should provide a simple means for C++
  programmers to solve problems using Neural Networks, without them
  having to know details about the network's structure or behaviour.


* Developing, Testing and Designing Neural Networks --- The library
  should be useful to Computer Scientists who are familiar with the
  theory and practice of Neural Nets, for prototyping and testing new
  (and old) network topologies and algorithms.  Successful ones can
  be included in the library for use by application programmers (see
  above).  Testing includes the investigation of behaviour under
  partial failures (see below).

* Parallel Execution --- Where available, the library must
  (eventually) take advantage of parallel processors, and must be
  tolerant of failures in one or more processors.


The structure builds upon the more successful parts of the existing
Libann architecture, which provides a very user-oriented interface,
and upon ideas from Dan Pemstein's ANN++ implementation, which has a
lower level interface suited to the Neural Net developer.  Combining
these ideas gives a three-tier structure, as explained below.


Some notable omissions from the design:

* There is no provision for high order networks.  However, I believe
  this would be a generalisation of the existing design, involving
  the replacement of the Matrix class with a Tensor class.

* The ability to use higher precision floating point variables.
  This wouldn't be difficult to introduce (another template
  parameter), but I have yet to be convinced it would be of any
  benefit.  If anyone thinks they can persuade me otherwise, let me
  know.


A final note:  design is an iterative process --- you start by
summarising your ideas, implement them to a particular level of
detail, whereupon your understanding has improved, and so you go back
and refine that summary --- and the loop continues.  As such, not
everything here is set in stone.  Some things might not even work.
Essentially, it doesn't matter if in six months' time the design is
completely different, so long as it provides a better understanding
of the problem.


Feedback on this design is appreciated.  That applies to potential
developers as well as users of the library.


THE DETAILS
===========


The Application Level
=====================

As this level is aimed at application programmers with only a
minimal knowledge of NNs, the objects in this level are presented in
terms of the `problem' rather than the `solution'.  I.e. there are
objects such as

   Unsupervised Classifier
   Supervised Classifier
   Associative Memory
    .
    .
   etc


which are templated upon the types of the lower (development) level.
For example:

    SupervisedClassifier<MultiLayerPerceptron> myClassifier;

Since a Multi-Layer Perceptron would be the normal choice for a
Supervised Classifier, a few typedefs might be warranted, e.g.:
   
    typedef SupervisedClassifier<MultiLayerPerceptron> SuperClassifier;

If, however, somebody wants to make use of a new type of network,
they can do so with:

     SupervisedClassifier<RevolutionaryNetworkType>  sc;

Clearly there are templates which make no sense, such as
SupervisedClassifier<Kohonen>, and these will produce errors at
compile time.
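
As a rough sketch of why this happens (the member names below are
hypothetical, not the actual libann interface), the classifier
template simply calls members which only supervised networks provide,
so an unsupported instantiation fails as soon as those members are
used:

    #include <string>
    #include <vector>

    // Sketch, not the real interface: SupervisedClassifier compiles
    // only for network types providing a supervised train() member.
    template <class Network>
    class SupervisedClassifier
    {
    public:
      void train (const std::vector<float> &input, const std::string &target)
      {
        net.train (input, target);   // a Kohonen network has no such
      }                              // member, so compilation fails here
    private:
      Network net;
    };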

ClassType
---------

One detail concerning classifiers is how to identify the classes.
The current Libann solution is to assume that all classes will be
identified by unique std::strings.  This may not be the most
appropriate choice in all cases; for example, the application
programmer might prefer to use enums.  The future implementation will
therefore have all its classifiers templated on a ClassType:

     enum Gender { Male, Female, Hermaphrodite };

     SupervisedClassifier<MultiLayerPerceptron, Gender> sc;

Should there be a default ClassType?
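
One possible answer, assuming std::string remains the common case, is
to supply it as a default template argument, as in this sketch:

    // Sketch: std::string as the default ClassType, so existing code
    // such as SupervisedClassifier<MultiLayerPerceptron> still works.
    template <class Network, class ClassType = std::string>
    class SupervisedClassifier { /* ... */ };

    SupervisedClassifier<MultiLayerPerceptron>         sc1;  // string classes
    SupervisedClassifier<MultiLayerPerceptron, Gender> sc2;  // enum classes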


FeatureMap
----------

One of the more successful concepts in Libann is the FeatureMap, and
this will be retained in future versions.  However, in accordance
with the above, it needs to have its std::string variables replaced
by a templated ClassType.  One limitation stems from the fact that
FeatureMap is implemented in terms of std::map, which requires its
key type (here, the ClassType) to provide an operator<.
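
In outline, the templated FeatureMap might look like the sketch
below (the member names are illustrative); the std::map member is
where the operator< requirement comes from:

    #include <map>
    #include <vector>

    // Illustrative sketch only --- the real FeatureMap does more.
    template <class ClassType>
    class FeatureMap
    {
    public:
      void addFeature (const ClassType &cls, const std::vector<float> &f)
      {
        features[cls].push_back (f);
      }
    private:
      // std::map requires ClassType to provide an operator< .
      std::map<ClassType, std::vector<std::vector<float> > > features;
    };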



The Development Level
=====================

Anyone wishing to create a new type of neural network typically
needs to specify three things: 1) the topology of the network, 2) the
activation function(s) of the network's nodes, and 3) the training
algorithm.

To make the developer's life easier, there are objects available in
the base level which s/he can use, typically through inheritance.
Note there is no *requirement* that such inheritance be used, but the
intention is that it will be easier to do so.

The library will initially have class definitions for the common
network types (multi-layer perceptrons, Kohonen networks, Boltzmann
machines, etc.); hopefully more will become available as people
contribute.  Some judgement needs to be exercised in considering what
is a new type of network and what is a variant on an existing one.
Three options are available to developers, and the most appropriate
one needs to be considered:

      1) Modify an existing network class.  Best for making
         generalisations of an existing class; probably most suitable
         when adding parameters which had previously been implicitly
         assumed.  Where the developer finds himself adding if
         statements or switches, this option is probably the wrong
         choice.

      2) Take an existing class and inherit from it (see the sketch
         after this list).  A good option when (1) is not suitable,
         but the normal inheritance pitfalls need to be avoided.
         Example: if the base class has 3 layers, you can't inherit
         it and decide that it now has 4 layers.

      3) Don't inherit any existing classes in the development level
         (inherit directly from ann::base::Network).  This option
         would be the choice for Nobel-prize-winning work which has
         little or no similarity to existing network designs.
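
To illustrate option (2), here is a sketch of a variant network
obtained by inheritance (all the class names are hypothetical):

    // Sketch of option (2): a variant of an existing network type.
    // The topology (number of layers etc.) is inherited unchanged;
    // only the new behaviour is added.
    class MomentumPerceptron : public ann::dev::MultiLayerPerceptron
    {
    public:
      explicit MomentumPerceptron (float momentum)
        : momentum_ (momentum) {}
    private:
      float momentum_;   // the parameter this variant adds
    };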

The Topology
------------

Methods in the ann::base::Network class allow the developer to build
up the network topology as s/he desires.  The weight matrix is built
automatically as nodes are added.
There is a dilemma which the designer of a network must address:

      Since the dimensions of a network depend on both the number of
      classes and the dimensionality of the input data, when does the
      network get its topology decided?

One option is to wait until the data has been presented (at training
time); another is to allow the user to pre-declare the dimensions.
The latter would mean an interface is needed in the application level
to make such declarations, and it places extra onus on the
application programmer.
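
In code, the two options might look roughly like this sketch
(setTopology is a hypothetical name, not an existing libann call):

    // Option A: the topology is fixed lazily, when the training data
    // (and hence dimensionality and class count) first appears.
    SupervisedClassifier<MultiLayerPerceptron> sc1;
    sc1.train (data);           // dimensions inferred from the data

    // Option B: the user pre-declares the dimensions up front.
    SupervisedClassifier<MultiLayerPerceptron> sc2;
    sc2.setTopology (16, 3);    // 16 inputs, 3 classes (hypothetical)
    sc2.train (data);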

Activation Function
-------------------

The activation function used by the network should inherit from
ann::base::ActivationFunction, and it must implement the
float operator()(float) method.
No other stipulations are placed on the activation function.
However, since it will be replicated for every neuron, care should be
taken with its size, and the copy constructor must be implemented
where appropriate.
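
By way of example, a conventional sigmoid might be written as in the
following sketch (only the operator() signature is mandated by the
design; the rest is assumption):

    #include <cmath>

    // Sketch: a stateless activation function.  Being small and
    // trivially copyable, it is cheap to replicate for every neuron.
    class Sigmoid : public ann::base::ActivationFunction
    {
    public:
      float operator() (float x) { return 1.0f / (1.0f + std::exp (-x)); }
    };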


Training Algorithm
------------------

The TrainingAlgorithm is specific to the design of the network.
Consequently no stipulation is imposed other than that it must
implement void operator()(void).
Since the purpose of a training algorithm is to mutate the weights of
the network, the training algorithm contains a pointer to the weight
matrix.
Training algorithms need to know about training data, and often about
internal parameters of the network.  These data, or references to
them, should be passed into the constructor as required.
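
Putting those requirements together, a concrete training algorithm
might look roughly like this sketch (everything apart from the
operator() signature is an assumption on my part):

    // Sketch: a development-level training algorithm.  The weight
    // matrix pointer lives in the base class (see the base level,
    // below); data and parameters arrive through the constructor.
    class BackPropagation : public ann::base::TrainingAlgorithm
    {
    public:
      BackPropagation (Matrix *weights,
                       const FeatureMap<std::string> &data,
                       float learningRate)
        : ann::base::TrainingAlgorithm (weights),
          data_ (data), rate_ (learningRate) {}

      void operator() (void)
      {
        // ... iterate over data_, updating the weight matrix through
        // the base class' pointer, scaled by rate_ ...
      }

    private:
      const FeatureMap<std::string> &data_;
      float rate_;
    };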


The Base Level
==============

ActivationFunction
------------------

The base level has an abstract type called ActivationFunction.  This
has the abstract method float operator()(float).  Implementing the
function as an object allows state to be contained within the
function.  Example:  if (x < 0) return 0; if (x > 0) return 1; else
return previous_value;
In most cases such state needs to be kept on a per-neuron basis,
therefore the layers of the net have one ActivationFunction object
per neuron.
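
Written out as a class, that example becomes the following sketch:

    // Sketch: an activation function with per-object state.  At
    // x == 0 the output depends on the value previously returned,
    // which is why each neuron gets its own instance.
    class Hysteresis : public ann::base::ActivationFunction
    {
    public:
      Hysteresis () : previous_value (0.0f) {}
      float operator() (float x)
      {
        if (x < 0)      previous_value = 0.0f;
        else if (x > 0) previous_value = 1.0f;
        return previous_value;
      }
    private:
      float previous_value;
    };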

Training Algorithm
------------------

The purpose of the training algorithm is to mutate the weights of the 
network.  Thus, the base class for TrainingAlgorithm contains a
pointer to the (non-constant) weight matrix.
There is not much more generalisation that can be made about the
training algorithm (it is so dependent on the network type), except
that it needs a method by which it can be invoked --- void
operator()(void).
Any parameters needed should be passed through the concrete class's
constructor.
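
In outline, the base class might then amount to little more than
this sketch:

    // Sketch of the abstract base: it holds nothing but a pointer to
    // the weight matrix it is permitted to mutate.
    class TrainingAlgorithm
    {
    public:
      explicit TrainingAlgorithm (Matrix *w) : weights (w) {}
      virtual ~TrainingAlgorithm () {}
      virtual void operator() (void) = 0;   // run the training
    protected:
      Matrix *weights;    // non-constant: training mutates it
    };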


Persistence
-----------

Most neural networks have a training phase and an operational phase.
Since the training phase is typically an expensive process, it is
desirable to be able to save the trained network in non-volatile
storage.  An abstract class `Persistable' provides for this.
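
Presumably something along the lines of this sketch (the
stream-based interface is my assumption):

    #include <iostream>

    // Sketch: anything which can be saved and later restored
    // inherits from Persistable.
    class Persistable
    {
    public:
      virtual ~Persistable () {}
      virtual void save (std::ostream &os) const = 0;
      virtual void load (std::istream &is) = 0;
    };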


Exceptions
----------

There are plenty of operations on neural nets whose validity cannot
be easily checked at compile time (invalid dimensions to matrices
spring to mind).  Therefore, exceptions will be used extensively.
The cited example will have its own exception class, and others can
be added as appropriate.
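
For the cited example, the exception class might look like this
sketch (deriving from the standard hierarchy is my assumption):

    #include <stdexcept>
    #include <string>

    // Sketch: thrown when matrix dimensions disagree, e.g. when an
    // m x n matrix is multiplied by anything other than an n x p one.
    class DimensionMismatch : public std::logic_error
    {
    public:
      explicit DimensionMismatch (const std::string &what)
        : std::logic_error (what) {}
    };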




NAMESPACES
==========

My idea is to distinguish the levels by namespaces; all objects in
the library will live in the ann:: namespace.  Application level
objects will not be further qualified; development level objects will
have the namespace ann::dev::, and the base level ann::base::.
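
So fully qualified names would read as in this sketch:

    namespace ann
    {
      // Application level: not further qualified.
      template <class Network> class SupervisedClassifier { /* ... */ };

      // Development level.
      namespace dev  { class MultiLayerPerceptron { /* ... */ }; }

      // Base level.
      namespace base { class Network { /* ... */ }; }
    }

    ann::SupervisedClassifier<ann::dev::MultiLayerPerceptron> sc;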




-- 
PGP Public key ID: 1024D/2DE827B3 
fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
Refer to http://pgp.ai.mit.edu  or any PGP keyserver for public key.


Attachment: ann.eps.gz
Description: Binary data


