Class Details

This section describes the main classes in the project.

ModuleBase

ModuleBase serves as the base class for all user-implemented modules. Input and output slots on a module can be retrieved by name using the GetInputSlot() and GetOutputSlot() methods, or enumerated through the 'Inputs' and 'Outputs' properties. When implementing a module, input and output slots are added using the AddInputSlot() and AddOutputSlot() methods. Classes derived from ModuleBase should override the ImplementProcess() method to implement the actual functionality of the module. The Samples.Modules project within SimpleML provides numerous sample implementations of modules. Logging and metrics capture are available through the protected 'logger' and 'metricLogger' members, which is also demonstrated in SimpleML.

ModuleGraph

ModuleGraph is a directed graph of classes implementing the IModule interface. Public methods are available to add and remove modules from the graph, and to create and remove slot links from the modules within the graph. The public property 'EndPoints' provides an enumeration of all the modules considered as end points in the graph.

ModuleGraphProcessor

As well as performing the basic function of processing all of the modules in a graph in their dependency order, ModuleGraphProcessor contains public methods to validate a module graph (e.g. to check for circular references or incorrect data type assignments in slot links), and to make a copy of a module graph. Overloaded constructors accept implementations of the IApplicationLogger and IMetricLogger interfaces to allow logging and metric information to be captured during processing (demonstrated in SimpleML). Processing can also be cancelled via the CancelProcessing() method.
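
To make "dependency order" and the circular reference check concrete, the sketch below orders a small graph so that every module comes after the modules feeding it, and fails if a cycle makes that impossible. It is a minimal, self-contained illustration using Kahn's algorithm, not the ModuleGraphProcessor implementation. The DependencyOrderSketch class, its Order() method and the module names are hypothetical, and modules are represented as plain strings rather than IModule instances.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for illustration only; not part of MMF.
public static class DependencyOrderSketch
{
    // 'links' maps each module name to the names of the modules that consume its output.
    // Returns the modules ordered so that each one follows everything it depends on,
    // or throws if the graph contains a circular reference.
    public static List<String> Order(Dictionary<String, List<String>> links)
    {
        // Count the incoming links (unmet dependencies) of every module
        var unmet = new Dictionary<String, Int32>();
        foreach (var pair in links)
        {
            if (!unmet.ContainsKey(pair.Key)) unmet[pair.Key] = 0;
            foreach (var downstream in pair.Value)
            {
                unmet[downstream] = unmet.TryGetValue(downstream, out var count) ? count + 1 : 1;
            }
        }

        // Start with the modules that depend on nothing
        var ready = new Queue<String>(unmet.Where(p => p.Value == 0).Select(p => p.Key));
        var ordered = new List<String>();
        while (ready.Count > 0)
        {
            String current = ready.Dequeue();
            ordered.Add(current);
            if (!links.TryGetValue(current, out var consumers)) continue;
            foreach (var downstream in consumers)
            {
                // A consumer becomes ready once all of its inputs have been processed
                if (--unmet[downstream] == 0) ready.Enqueue(downstream);
            }
        }

        // Any module left unprocessed is part of, or downstream of, a cycle
        if (ordered.Count != unmet.Count)
            throw new InvalidOperationException("The graph contains a circular reference.");
        return ordered;
    }
}

public static class Program
{
    public static void Main()
    {
        var links = new Dictionary<String, List<String>>
        {
            { "Train", new List<String>() },
            { "Load", new List<String> { "Scale" } },
            { "Scale", new List<String> { "Train" } }
        };
        // Prints "Load, Scale, Train"
        Console.WriteLine(String.Join(", ", DependencyOrderSketch.Order(links)));
    }
}
```

Adding a link from "Train" back to "Load" in the example would leave no module without unmet dependencies, so Order() would throw instead of returning a list.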

ModuleGraphXmlSerializer

The ModuleGraphXmlSerializer class serializes module graphs to XML documents and deserializes them back again. An example of the use of this class is provided in SimpleML.

XmlDataSerializer

The XmlDataSerializer class is used to serialize and deserialize data attached to input slots on modules within a module graph. The AddDataTypeSupport() method can be used to add the ability for the class to serialize and deserialize custom data types. Again, this is demonstrated in SimpleML.
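
The general idea of adding support for a custom data type can be pictured as registering a pair of conversion routines keyed by type. The sketch below is a self-contained illustration of that idea only. DataTypeRegistrySketch and its AddSupport(), Serialize() and Deserialize() members are hypothetical names, and the actual signature of XmlDataSerializer.AddDataTypeSupport() is not given in this document and may well differ, so refer to SimpleML for real usage.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical registry for illustration only; not the MMF XmlDataSerializer API.
public class DataTypeRegistrySketch
{
    private readonly Dictionary<Type, Func<Object, String>> serializers = new Dictionary<Type, Func<Object, String>>();
    private readonly Dictionary<Type, Func<String, Object>> deserializers = new Dictionary<Type, Func<String, Object>>();

    // Registers the routines that convert one data type to and from text
    public void AddSupport(Type dataType, Func<Object, String> serialize, Func<String, Object> deserialize)
    {
        serializers[dataType] = serialize;
        deserializers[dataType] = deserialize;
    }

    public String Serialize(Object value)
    {
        return Lookup(serializers, value.GetType())(value);
    }

    public Object Deserialize(Type dataType, String text)
    {
        return Lookup(deserializers, dataType)(text);
    }

    // Finds the routine registered for a type, failing clearly for unsupported types
    private static T Lookup<T>(Dictionary<Type, T> table, Type dataType)
    {
        if (!table.TryGetValue(dataType, out var routine))
            throw new NotSupportedException("No support registered for type " + dataType.FullName);
        return routine;
    }
}

public static class Program
{
    public static void Main()
    {
        var registry = new DataTypeRegistrySketch();
        // Store a TimeSpan as its tick count, formatted independently of the current culture
        registry.AddSupport(
            typeof(TimeSpan),
            value => ((TimeSpan)value).Ticks.ToString(CultureInfo.InvariantCulture),
            text => TimeSpan.FromTicks(Int64.Parse(text, CultureInfo.InvariantCulture)));

        String text1 = registry.Serialize(TimeSpan.FromSeconds(1));
        Console.WriteLine(text1);                                       // Prints "10000000"
        Console.WriteLine(registry.Deserialize(typeof(TimeSpan), text1)); // Prints "00:00:01"
    }
}
```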

Implementing Modules

Several examples of module implementation are provided in the SimpleML.Samples.Modules project. Generally speaking, a module implementation should include the following...

  • A parameterless constructor which defines a description for the module, plus the input and output slots.
  • An override of the ImplementProcess() method, which should include...
    1. Code to retrieve the input data from the input slots
    2. Instantiation and execution of composed objects to perform the necessary processing
    3. Code to set the results on the output slots
  • Logging and metrics logging

The code sample below, taken from the SimpleML PolynomialFeatureGenerator class, gives an example of this:

    public class PolynomialFeatureGenerator : ModuleBase
    {
        private const String dataSeriesInputSlotName = "DataSeries";
        private const String polynomialDegreeInputSlotName = "PolynomialDegree";
        private const String outputMatrixOutputSlotName = "OutputMatrix";

        /// <summary>
        /// Initialises a new instance of the SimpleML.Samples.Modules.PolynomialFeatureGenerator class.
        /// </summary>
        public PolynomialFeatureGenerator()
            : base()
        {
            Description = "Generates polynomial features for data series stored column-wise in a SimpleML.Containers.Matrix class";
            AddInputSlot(dataSeriesInputSlotName, "The data series to generate the polynomial features for, stored column-wise in a matrix", typeof(Matrix));
            AddInputSlot(polynomialDegreeInputSlotName, "The degree of polynomial features to generate", typeof(Int32));
            AddOutputSlot(outputMatrixOutputSlotName, "The data series with polynomial features added (stored column-wise)", typeof(Matrix));
        }

        protected override void ImplementProcess()
        {
            Matrix dataSeries = (Matrix)GetInputSlot(dataSeriesInputSlotName).DataValue;
            Int32 polynomialDegree = (Int32)GetInputSlot(polynomialDegreeInputSlotName).DataValue;
            Matrix outputMatrix = null;

            try
            {
                SimpleML.PolynomialFeatureGenerator polynomialFeatureGenerator = new SimpleML.PolynomialFeatureGenerator();
                outputMatrix = polynomialFeatureGenerator.GenerateFeatures(dataSeries, polynomialDegree);
                GetOutputSlot(outputMatrixOutputSlotName).DataValue = outputMatrix;
            }
            catch (Exception e)
            {
                logger.Log(this, LogLevel.Critical, "Error occurred whilst generating polynomial features for data series.", e);
                throw;
            }
            logger.Log(this, LogLevel.Information, "Generated polynomial features for data series, producing a matrix with " + outputMatrix.NDimension + " columns.");
        }
    }

Guidelines / Recommendations

  • Module classes must always have a public parameterless constructor. Modules in a module graph are instantiated during deserialization using the .NET Activator.CreateInstance() method, which is called without constructor arguments and so requires one.
  • When designing modules, a balance must be struck between flexibility (i.e. many small, granular modules), and simplicity (fewer but more complex modules). It's recommended to err towards a greater number of small modules, as this allows greater flexibility in using MMF to arrange the modules into workflows.
  • Encapsulate the code that performs processing inside modules in separate, composed classes, rather than writing lengthy processing code inside the ImplementProcess() method. This allows a module's functionality to be reused outside the context of MMF. SimpleML provides an example of this by implementing the actual machine learning functionality in classes in the core SimpleML namespace, and then 'wrapping' these classes with modules in the SimpleML.Samples.Modules namespace.
  • Utilize logging and metric logging.
  • Support cancellation in modules with long running operations.
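
The parameterless constructor guideline above can be seen in isolation with the standard .NET Activator class. The sketch below is self-contained and does not use MMF. The two module-like classes and the ActivatorSketch helper are hypothetical names introduced for illustration. Activator.CreateInstance(Type) succeeds for the first class and throws a MissingMethodException for the second, because it has no public parameterless constructor to call.

```csharp
using System;

// Hypothetical types for illustration only; not part of MMF.
public class WithParameterlessConstructor
{
    public String Description { get; private set; }

    public WithParameterlessConstructor()
    {
        Description = "Created without arguments";
    }
}

public class WithoutParameterlessConstructor
{
    public WithoutParameterlessConstructor(Int32 polynomialDegree)
    {
    }
}

public static class ActivatorSketch
{
    // Returns true if an instance can be created from the Type alone
    public static Boolean CanCreate(Type moduleType)
    {
        try
        {
            Activator.CreateInstance(moduleType);
            return true;
        }
        catch (MissingMethodException)
        {
            // Thrown when no public parameterless constructor exists
            return false;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(ActivatorSketch.CanCreate(typeof(WithParameterlessConstructor)));    // Prints "True"
        Console.WriteLine(ActivatorSketch.CanCreate(typeof(WithoutParameterlessConstructor))); // Prints "False"
    }
}
```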

Unit Tests

Complete NUnit unit tests for all classes are provided in the '*.UnitTests' projects. Test class ModuleGraphProcessorScenarioTests contains a number of 'scenario' unit tests, which test the ModuleGraphProcessor class's handling of complex module graphs and of anomalous cases such as graphs with circular references. To aid understanding and debugging of these unit tests, a diagram of the various module graphs used in the tests is included in file 'Resources\ModuleGraphProcessorScenarioTests.gif'.