Creating A Simple Neural Network With TensorFlow.js

TensorFlow.js is a JavaScript port of the extremely powerful machine learning framework of the same name, developed by Google. The demonstration apps for TensorFlow.js tend to be bogged down with a lot of UI. This makes for great presentation, but also makes the code a little intimidating for newcomers. I wanted to make a straightforward example of a model that demonstrates the basics of TensorFlow.js. In this article we’ll walk through a simple classification model which solves for XOR.

This article assumes some basic knowledge of machine learning concepts. All code examples are presented in ES6 syntax.

The rules for XOR are simple: Given two boolean values (TRUE / FALSE), if exactly one of them is true, return true; otherwise, return false. In the context of the neural network, there will be two input nodes that each receive a value of either 1 (TRUE) or 0 (FALSE), and two output nodes, one for true and one for false. The network will be trained to light up one of the output nodes, based on the true/false values of the input nodes.

XOR Table

Inputs     Output
x1    x2   y
0     0    0
0     1    1
1     0    1
1     1    0
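The truth table can be sanity-checked with an ordinary function (just a plain-JS reference for the table above, not part of the model):

```javascript
// XOR as a plain function: true (1) when exactly one input is true
const xor = (x1, x2) => (x1 !== x2 ? 1 : 0);

console.log(xor(0, 0)); // 0
console.log(xor(0, 1)); // 1
console.log(xor(1, 0)); // 1
console.log(xor(1, 1)); // 0
```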

In JavaScript, the training data is expressed as two arrays, one for inputs (xs) and one for outputs (ys).

const xs = [[0,0],[0,1],[1,0],[1,1]];
const ys = [0,1,1,0];

The arrays are then converted to tensors for use in the neural network. Think of a tensor as a wrapper object for the array: it provides a ton of utility functions for manipulating the data, but at its core it's still just an array. xs is an array of 4 values, each an array of 2 values, so the shape of xs is expressed as [4,2]. Because this example passes a two-dimensional array, TensorFlow can infer its shape; however, it's good practice to define the shape explicitly.

let xTrain = tf.tensor2d(xs, [4,2]);
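To make the shape idea concrete, here's how [4,2] falls out of the nested array itself (`shapeOf` is an illustrative helper, not a TensorFlow.js function, and assumes a rectangular nested array):

```javascript
// Walk down the first element at each nesting level,
// collecting the length of each level as a dimension.
const shapeOf = (arr) =>
  Array.isArray(arr) ? [arr.length, ...shapeOf(arr[0])] : [];

console.log(shapeOf([[0,0],[0,1],[1,0],[1,1]])); // → [4, 2]
console.log(shapeOf([0,1,1,0]));                 // → [4]
```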

The output data is instantiated as a “one-hot” tensor. This means that each output neuron represents a unique value; in this case the two possible values are TRUE and FALSE. The tensor therefore has a declared depth of 2, where FALSE is represented by index 0 and TRUE by index 1.

let yTrain = tf.oneHot(tf.tensor1d(ys).toInt(), 2);
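What tf.oneHot produces can be sketched in plain JavaScript: each label becomes a row with a 1 at the label's index and 0 everywhere else (`oneHot` here is an illustrative helper, not the tfjs call):

```javascript
// Turn an array of integer labels into one-hot rows of the given depth.
const oneHot = (labels, depth) =>
  labels.map(label => Array.from({length: depth}, (_, i) => (i === label ? 1 : 0)));

console.log(oneHot([0, 1, 1, 0], 2)); // → [[1,0],[0,1],[0,1],[1,0]]
```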

With the inputs and outputs defined, it’s time to set up the neural network model. This example uses a sequential model, meaning it consists of layers of neurons that feed forward in the order they’re defined. Here the input, consisting of 2 values, is passed to a hidden layer of 5 neurons, which in turn is passed to an output layer of 2 neurons. The output layer expresses a percent certainty (a value between 0 and 1) for each of the two possible outputs (TRUE or FALSE).

const model = tf.sequential();

model.add(tf.layers.dense({units: 5, activation: 'sigmoid', inputShape: [2]}));
model.add(tf.layers.dense({units: 2, activation: 'softmax'}));
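The softmax activation on the output layer is what turns the raw layer outputs into certainties that sum to 1. A minimal plain-JS sketch of the idea (illustrative only, not the tfjs implementation):

```javascript
// Exponentiate each value, then normalise so the results sum to 1.
const softmax = (values) => {
  const exps = values.map(Math.exp);
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map(e => e / sum);
};

// The larger raw output gets the larger probability; the pair sums to 1.
console.log(softmax([2, 1]));
```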

The model is trained with the Adam optimiser and the categoricalCrossentropy loss function. Categorical crossentropy allows the model to learn an association between input values and output categories. In other words, you train it to recognise that the input [0,0] is associated with the output FALSE, the input [1,0] is associated with the output TRUE, and so on.

const optimizer = tf.train.adam(LEARNING_RATE);

model.compile({
    optimizer: optimizer,
    loss: 'categoricalCrossentropy',
    metrics: ['accuracy'],
});
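Categorical crossentropy itself is simple enough to sketch for a single example: the loss is small when the predicted probability for the true class is high, and large when the model is confidently wrong (`crossentropy` is an illustrative helper, not the tfjs loss):

```javascript
// Loss = -sum(target_i * ln(predicted_i)); with a one-hot target,
// only the true class's predicted probability contributes.
const crossentropy = (oneHotTarget, predicted) =>
  -oneHotTarget.reduce((sum, t, i) => sum + t * Math.log(predicted[i]), 0);

console.log(crossentropy([0, 1], [0.1, 0.9])); // small loss: confident and correct
console.log(crossentropy([0, 1], [0.9, 0.1])); // large loss: confident and wrong
```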

The training is performed by the .fit() method of the model object. This method receives the x and y training data and an optional config object. The configuration can specify the number of epochs (how many times training iterates through the data) as well as some validation data. In this example, the training data set covers so few possibilities that it’s simply passed back in for validation. In a more practical implementation, a portion of the training data would be set aside before training and used to validate the model.

model.fit(xTrain, yTrain, {
  epochs: EPOCHS,
  validationData: [xTrain, yTrain],
})
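For a larger dataset, setting aside a validation portion could look something like this (`holdout` is a hypothetical helper, not part of TensorFlow.js; the XOR example simply reuses its training data):

```javascript
// Keep the first (1 - fraction) of the data for training,
// hold out the rest for validation.
const holdout = (data, fraction) => {
  const cut = Math.floor(data.length * (1 - fraction));
  return [data.slice(0, cut), data.slice(cut)];
};

const [train, val] = holdout([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 0.2);
console.log(train.length, val.length); // 8 2
```

In practice the data would also be shuffled before splitting so the holdout set is representative.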

The .fit() method returns a promise that resolves once the model is trained. In this example, the promise callback is used to test the trained model. An input tensor is instantiated with 2 input values and passed to the model’s .predict() method, which generates the model’s output tensor. The output data is retrieved by calling the dataSync() method (alternatively, you can make an asynchronous call to the data() method) and is read into an array representing the percent certainty of [FALSE, TRUE]. In this example, for the input of [0,1], the predicted output would look something like [0.001049950486049056, 0.9989500641822815], which represents a 0.0% certainty of FALSE and a 99.9% certainty of TRUE.
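Picking the winning class from that certainty array is just an argmax, the plain-JS equivalent of the argMax call in the complete source below (`argMax` here is an illustrative helper):

```javascript
// Return the index of the largest value: 0 means FALSE, 1 means TRUE.
const argMax = (arr) => arr.reduce((best, v, i) => (v > arr[best] ? i : best), 0);

console.log(argMax([0.001, 0.999])); // 1 → TRUE
console.log(argMax([0.98, 0.02]));  // 0 → FALSE
```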

Complete Source (also available on CodePen)

// Solve for XOR
const LEARNING_RATE = 0.1;
const EPOCHS = 200;

// Define the training data
const xs = [[0,0],[0,1],[1,0],[1,1]];
const ys = [0,1,1,0];

// Instantiate the training tensors
let xTrain = tf.tensor2d(xs, [4,2]);
let yTrain = tf.oneHot(tf.tensor1d(ys).toInt(), 2);

// Define the model.
const model = tf.sequential();
// Set up the network layers
model.add(tf.layers.dense({units: 5, activation: 'sigmoid', inputShape: [2]}));
model.add(tf.layers.dense({units: 2, activation: 'softmax'}));
// Define the optimizer
const optimizer = tf.train.adam(LEARNING_RATE);
// Init the model
model.compile({
    optimizer: optimizer,
    loss: 'categoricalCrossentropy',
    metrics: ['accuracy'],
});
// Train the model
model.fit(xTrain, yTrain, {
  epochs: EPOCHS,
  validationData: [xTrain, yTrain],
}).then(() => {
  // Try the model on a value
  const input = tf.tensor2d([0, 1], [1, 2]);
  const predictOut = model.predict(input);
  const probs = Array.from(predictOut.dataSync());
  console.log('prediction', probs, predictOut.argMax(-1).dataSync()[0]);
});