GubboIT - Object
HELP
Get started

This app demonstrates object detection using Deep Learning. The app detects the objects (car, person, and so on) in an image and also tells you where each object is located by drawing a bounding box around it. Object detection is used, for example, in self-driving cars.

How does it work?

The app picks up images (frames) from a video. Each image is fed into an algorithm/model: first possible objects are located, and then each object is classified. As output you get a list of objects; for each object you get the class and the bounding-box coordinates. The app draws the bounding boxes on the image.
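
As a rough sketch (not the app's actual source), one detection pass with the TensorFlow.js coco-ssd API could look like the code below. The element ids, the canvas drawing, and the timing are illustrative assumptions.

    // Sketch of a single detection pass. Assumes a <video id="video"> and a
    // <canvas id="overlay"> element; the ids are illustrative, not the app's real ones.
    async function detectFrame(model) {
      const video = document.getElementById('video');
      const canvas = document.getElementById('overlay');
      const ctx = canvas.getContext('2d');

      // Match the canvas to the video's intrinsic size so the boxes line up.
      canvas.width = video.videoWidth;
      canvas.height = video.videoHeight;

      const t0 = performance.now();
      // model.detect() returns [{bbox: [x, y, width, height], class, score}, ...]
      const predictions = await model.detect(video);
      const elapsedMs = Math.round(performance.now() - t0);

      // Draw the current frame and a bounding box plus label for each object.
      ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
      for (const p of predictions) {
        const [x, y, w, h] = p.bbox;
        ctx.strokeRect(x, y, w, h);
        ctx.fillText(p.class + ' (' + p.score.toFixed(2) + ')', x, y > 10 ? y - 4 : 10);
      }
      return { predictions, elapsedMs };
    }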

Input and output

  • Detect button: Detect objects in the current image. A badge indicates the number of detected objects and the detection time in milliseconds.
  • ▶/⏸ button: Start or stop the video. The images (frames) are picked up from this video.
  • Auto button: Run at most 10 detections per second on the running video (see the sketch after this list). Stop the detections by selecting Auto once more.
  • Object table: Lists the detected objects with number, class, and score (class probability). The table is not used in Auto mode.
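
The Auto rate limit of at most 10 detections per second can be pictured as a loop that waits out the remainder of each 100 ms slot. This is only an illustrative sketch, reusing the detectFrame() function assumed above:

    // Illustrative Auto loop: at most one detection per 100 ms slot.
    let autoRunning = false;

    async function toggleAuto(model) {
      autoRunning = !autoRunning;          // the Auto button toggles the loop
      while (autoRunning) {
        const start = performance.now();
        await detectFrame(model);          // detect and draw (sketch above)
        const spent = performance.now() - start;
        // If the detection was faster than 100 ms, wait out the rest of the slot.
        await new Promise(resolve => setTimeout(resolve, Math.max(0, 100 - spent)));
      }
    }
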
MORE INFO
About object detection

The app uses a pretrained COCO-SSD model - a CNN (Convolutional Neural Network) for object detection. COCO (Common Objects in Context) is the dataset on which the model was trained; it contains over 200k images covering 90 classes. lite_mobilenet_v2 is the base network of the CNN; it is small and fast, a little less accurate than larger networks, and usable on mobile devices. SSD (Single Shot Detector) is a CNN that predicts both the class and the position of each object in the same pass - a single shot - which also makes it fast.

So (simplified) the whole network is composed of the base network, extra layers for object detection, and an NMS (Non-Maximum Suppression) layer. NMS removes overlapping boxes and keeps only one box (the best) for each object.
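
The idea behind the NMS step can be illustrated with the standalone non-max suppression op in TensorFlow.js. The coco-ssd model applies its own NMS internally, so this is only a demonstration of the concept, and the thresholds are arbitrary:

    // Stand-alone illustration of non-maximum suppression in TensorFlow.js.
    // boxes:  [numBoxes, 4] tensor of [y1, x1, y2, x2] corners
    // scores: [numBoxes] tensor of class probabilities
    async function keepBestBoxes(boxes, scores) {
      const maxBoxes = 20;        // keep at most 20 boxes (arbitrary)
      const iouThreshold = 0.5;   // boxes overlapping more than 50% count as duplicates
      const scoreThreshold = 0.3; // ignore low-confidence boxes
      const keptIndices = await tf.image.nonMaxSuppressionAsync(
          boxes, scores, maxBoxes, iouThreshold, scoreThreshold);
      return keptIndices;         // indices of the boxes that survive
    }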

Problems

  • The app is slow: To look good, the 'video' of images with bounding boxes needs a detection rate of close to 10 detections/second, which means each detection may take at most about 100ms. An iPhone 5 needs close to 300ms and my PC (a few years old) around 125ms.
  • Video is not displayed: Select the ▶/⏸ button and then select it again.
  • Video stops: Can happen when the video is about to loop. Select Auto and then Auto again.
  • Loading model takes a long time: Normal for slow devices. Just wait.
  • MS Edge - the app is not working: Can be fixed but I don't know how performance is affected.

Implementation

The app is written in JavaScript, using TensorFlow.js for the machine learning and Bootstrap for the UI. All code related to the app runs in the browser; the web server only serves the app's files. The files (including the model) are downloaded to the browser, so the predictions are done in the browser.

The app uses a pretrained COCO-SSD model and a supporting API. For more info, see tfjs-models coco-ssd on GitHub.
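
Loading the pretrained model with that API boils down to something like the sketch below. The imports match the published npm packages; the element id and the rest of the code are assumptions, not the app's actual source.

    // Sketch of loading and using the pretrained COCO-SSD model in the browser.
    // With a bundler the packages are imported; the app may instead use <script> tags.
    import '@tensorflow/tfjs';
    import * as cocoSsd from '@tensorflow-models/coco-ssd';

    async function main() {
      // Downloads the model files to the browser; 'lite_mobilenet_v2' is the default base.
      const model = await cocoSsd.load({ base: 'lite_mobilenet_v2' });

      const img = document.getElementById('someImage');   // illustrative element id
      const predictions = await model.detect(img);
      // Each prediction: {bbox: [x, y, width, height], class: 'person', score: 0.9, ...}
      console.log(predictions);
    }

    main();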

Copyright 2019 GubboIT

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

Acknowledgments

This app was inspired by the excellent book "Deep Learning with JavaScript" from Manning Publications.

Video clips from "Lane Splitting an insane traffic jam in New York City".
