GubboIT - Image
HELP
Get started

This app demonstrates image classification using Deep Learning. In other words, the app tells you (predicts) what an image or a frame of a video shows: a dog, a cat, and so on.

The Slide and Video modes show examples of the app making good or bad predictions.

How does it work?

The app is simple: an image is fed into a pretrained model and the top 3 predictions are displayed. The model (a convnet) has 1000 classes (cat, dog, ...) and is not described here. See the Digit app for more info on models, training, etc.
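In code, the prediction step can be sketched roughly like this with TensorFlow.js and the tfjs-models MobileNet wrapper. The element id and function name below are assumptions for illustration, not the app's actual code:

```javascript
import * as tf from '@tensorflow/tfjs';               // runtime/backend for the model
import * as mobilenet from '@tensorflow-models/mobilenet';

async function predictTop3() {
  // Load the pretrained 1000-class MobileNet (downloaded to the browser).
  const model = await mobilenet.load();
  // The current image or video frame, here assumed to be an <img id="image">.
  const img = document.getElementById('image');
  // classify() returns the top-k classes with their probabilities.
  const predictions = await model.classify(img, 3);
  predictions.forEach(p =>
    console.log(`${p.className}: ${(p.probability * 100).toFixed(1)}%`));
}
```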

Input and output

  • Predict button: Get a prediction for the current image. You only need to use this button when you are using Video without Auto.
  • Video/Slide button: Switch to video or slide.
  • Auto: Makes one prediction per second (video only). Toggles between Off, 25%, and 50%; only predictions greater than or equal to the chosen threshold (25% or 50%) are displayed. The 25% (or 50%) indicator shows when there is a new prediction at or above the threshold. See the sketch after this list.
  • Forward or backward button on slide: Get another slide.
  • Top 3 predictions: A yellow background if the prediction is greater than or equal to 25%, a green background if the prediction is greater than or equal to 50%.
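Very roughly, Auto mode amounts to a timer that classifies the current video frame once per second and colours the results by threshold. The sketch below assumes a <video id="video"> element and a list of three result elements; those names, and the colour values, are illustrative only:

```javascript
// model is the loaded MobileNet from the sketch above.
function startAuto(model, threshold /* 0.25 or 0.50, from the Auto toggle */) {
  return setInterval(async () => {
    const video = document.getElementById('video');
    const top3 = await model.classify(video, 3);
    const items = document.querySelectorAll('#predictions li');
    top3.forEach((p, i) => {
      // Only show predictions at or above the chosen threshold.
      if (p.probability < threshold) {
        items[i].textContent = '';
        items[i].style.background = '';
        return;
      }
      items[i].textContent = `${p.className}: ${(p.probability * 100).toFixed(0)}%`;
      // Yellow background at >= 25%, green at >= 50%.
      items[i].style.background = p.probability >= 0.5 ? 'lightgreen' : 'yellow';
    });
  }, 1000); // one prediction per second
}
```

Toggling Auto off would then simply clear the timer, e.g. const timer = startAuto(model, 0.25); ... clearInterval(timer);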
MORE INFO
What can we expect?

Probably not too much. The app cannot make good predictions on something the model has not been trained on. In the best case you will get a prediction for something that looks like the real thing; otherwise you will just get a very stupid prediction.

Our model handles 1000 classes (cat, dog, ...). Around 400 are animals and around 130 are breeds of dog. The remaining 600 classes cover various things, but a lot is of course missing. Making a general app/model that can classify almost anything is very hard (or impossible) - it requires a lot of images.

What can we do if we want to add new classes to our model? In principle we have to collect a number of images (a few hundred per class) for these new classes, label the images according to class, modify the model (only slightly), and train again but only for the new classes. So it means some work...
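One common way to do this with TensorFlow.js is transfer learning: reuse the pretrained MobileNet as a fixed feature extractor and train only a small new classification head on the new images. The sketch below is just an outline under that assumption; image loading and labelling are left out, and all names are invented for illustration:

```javascript
import * as tf from '@tensorflow/tfjs';
import * as mobilenet from '@tensorflow-models/mobilenet';

async function trainNewClasses(images, labels, numNewClasses) {
  const base = await mobilenet.load();
  // infer(img, true) returns the internal embedding instead of class scores,
  // so the pretrained convnet acts as a fixed feature extractor.
  const xs = tf.concat(images.map(img => base.infer(img, true)));
  const ys = tf.oneHot(labels, numNewClasses).cast('float32');

  // Only this small head is trained; the pretrained weights stay untouched.
  const head = tf.sequential();
  head.add(tf.layers.dense({
    inputShape: [xs.shape[1]], units: numNewClasses, activation: 'softmax'
  }));
  head.compile({optimizer: 'adam', loss: 'categoricalCrossentropy'});
  await head.fit(xs, ys, {epochs: 20});
  return head; // classify a new image with head.predict(base.infer(img, true))
}
```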

70% accuracy?

The accuracy of our model is around 70%. Does this mean that we can expect 70% of our predictions to be correct? No, we can't. It could be better, but probably not. The 70% is measured on the test set: a part of the original images that the model is NOT trained on. The test set is only used to measure how well the model performs on data it has never seen during training. But our images are probably different from the images in the test set. One reason is that we may use images that do not belong to any class - how should we know whether our image belongs to one of the 1000 classes? Another reason is that we may see the animals in the video from other angles, e.g. from above or from behind.

Implementation

The app is written in JavaScript, using TensorFlow.js for the "Machine Learning" and Bootstrap for the UI. All code related to the app runs in the browser. The web server only serves the files of the app; the files (including the model) are downloaded to the browser, so the predictions are done in the browser.

The app uses a pretrained MobileNet model and supporting software. For more info, see tfjs-models mobilenet at GitHub. "MobileNets are small, low-latency, low-power models parameterized to meet the resource constraints of a variety of use cases". The number of parameters is around 4.2 million. Accuracy is around 70%.
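The "parameterized" part means you can trade accuracy for size and speed when loading the model; the tfjs-models wrapper exposes this through a version and a width multiplier (alpha). The values below are just examples, not necessarily the configuration this app uses:

```javascript
async function loadModels() {
  // Default-sized MobileNet v1 (roughly the 4.2 million parameters mentioned above).
  const fullModel = await mobilenet.load({version: 1, alpha: 1.0});
  // A much smaller and faster variant, at the cost of some accuracy.
  const smallModel = await mobilenet.load({version: 1, alpha: 0.25});
  return {fullModel, smallModel};
}
```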

Copyright 2019 GubboIT

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
Acknowledgments

This app was inspired by the excellent book "Deep Learning with JavaScript" from Manning Publications.

Video clips from Animals at Skansen, Stockholm.
