Quick Look/Synopsis
Who Is It For: People with basic programming skills (ideally Python) who are interested in sampling a variety of topics related to OpenCV, from fundamentals to advanced ML algorithms for computer vision.
What Does the Course Cover: OpenCV for Beginners is a crash course in the fundamentals of OpenCV from the folks at OpenCV.org. It walks through the basics of computer vision and works its way up to a study of several Neural Net algorithms for advanced computer vision.
Where Do I Find the Course: You can find the course on the OpenCV Courses Site.
How Much Does the Course Cost: $117, unless you buy it as part of a bundle or during a promotion.
Rating: I’d rate my journey with the course a 9/10. It met my expectations, and I learned a lot of different things with a fairly modest time investment. Details below.
Official Intro Video
The OpenCV course video is linked below for your convenience – it explains the course in a fair amount of detail. Likewise, the course syllabus is available here. I’m going to avoid publishing specific non-public information from the course as per the OpenCV T&C’s, but between the syllabus and video, the high level topics are pretty well published, so I think we can discuss those freely in the review.
Understanding the Intent and Level of Detail of the Course
I was coming to the course with a pretty detailed knowledge of Computer Graphics and low-level 3D Rendering from my previous career endeavours. However, I had only a cursory knowledge of OpenCV and the feature set provided by the library. I had previously run into OpenCV in very specific cases and demos provided by other vendors – some Neural Net stuff for Object Detection and Tracking, and a couple of run-ins with ArUco Markers during some AR work that I had been involved in. Likewise, I had certainly written Python before, but not for a while, and not with all the latest language features in Python 3.
Accordingly, the OpenCV for Beginners course seemed like a great entry point to be able to ramp up quickly, at a high level. And, I’m pleased to say this is exactly what you get with the course – each module takes about an hour or less to complete and explores a feature or fundamental of OpenCV. You won’t come out of each topic as an expert, but all of the links to the OpenCV API and background information are presented so that you can go dig deep if you wish.
You don’t go into OpenCV for Beginners expecting a deep, deep dive, but you do get a sample of a variety of contemporary topics and enough information to get out there and learn more about the content that is specifically interesting to you.
Two Courses in One – Conventional Computer Vision and Deep Neural Network Applications
Something that I didn’t really appreciate about the current state of Computer Vision before I took the course is that we’re really at a crossroads these days in terms of how vision techniques are implemented.
Old school “fundamental techniques” still underpin a lot of what is happening in OpenCV – being able to load and process images and videos, depth and format conversions, annotation, and presentation of data and processes are all common underpinnings to both conventional applications and some of the magic performed by Deep Neural Networks. However, the approach and how processing occurs between the more conventional techniques and DNN algorithm approaches are quite distinct. (Spoiler – The Deep Learning models do a lot of work in a black box, and you can’t see exactly how the magic happens).
The course topics break down roughly as follows into the two broad categories of Fundamentals / Conventional Computer Vision and the Deep Neural Network topics:
CV Fundamentals and Conventional Computer Vision | Deep Neural Nets and Machine Learning |
Module 1: Getting Started With Images | Module 13: Deep Learning with OpenCV |
Module 2: Basic Image Operations | Module 14: Face and Landmark Detection |
Module 3: Histograms and Colour Segmentation | Module 15: Object Detection |
Module 4: Video Processing and Analysis | Module 16: Object Tracking |
Module 5: Contour and Shape Analysis | Module 17: Human Pose Estimation |
Module 6: Playing Games Using CV | Module 18: Person Segmentation |
Module 7: Building and Deploying Apps with Streamlit | Module 19: Text Detection and OCR |
Module 8: Image Filtering and Enhancement | Module 20: Super Resolution |
Module 9: Lane Detection using Hough Transforms | |
Module 10: Image Restoration Techniques | |
Module 11: Image Registration Techniques | |
Module 12: ArUco Markers for Augmented Reality | |
Module 21: Deploying Applications on the Cloud | |
Course Structure
Each module of OpenCV for Beginners follows a core structure:
- Video Presentation – Multiple videos covering the topics introduce each module. They are well produced and cover the information in a reasonable amount of detail. Toward the end, I did end up speeding up the video a bit as I found that once I was familiar with the content, I could digest it a bit faster. The course video player allows this.
- Jupyter Notebook and Sample Code – Each module contains multiple Jupyter Notebooks and Python scripts with examples covering the topics in the module. You can experiment with this code, and tweak and adjust the scripts to see how things change in the CV outputs.
- Exam Questions – Each module finishes with a multiple choice exam. You get two chances to answer each question. Some of the questions are a little tricky, or stray from the syllabus a bit, but I took this as an opportunity to go back and look up the answers as opposed to just staying stumped (it didn’t say it was a “closed book” exam or anything, after all).
Some modules feature Application sections where the instructors also take you through building a small web app using the Streamlit framework to publish the results to the web. I enjoyed learning about Streamlit – it’s a quick way to publish Python Apps to the Web with a nice UI and easy deployment. These Applications were generally interesting, even though some overhead is required in terms of UI, publishing, etc. that is not directly OpenCV related.
Highlights
I would say I was pretty satisfied with the content in all the modules, but there were a few stand-out examples that really got the gears turning for me. I’m going to separate the CV Fundamentals and conventional modules from the Deep Learning ones as they were such a different flavour that it’s hard to compare.
Conventional Modules
- Colour Segmentation – I hate to admit it, but I never really understood what the HSV colour space was for. This module was great for understanding how to perform colour segmentation (being able to isolate specific colours in an image or video) using the alternate colour space.
- Lane Detection using Hough Transforms – As far as the “conventional” side of the course went, I thought this was probably one of the most practical and interesting applications. It was really interesting to learn how a combination of conventional image processing techniques and the Hough Transform could be used to isolate lane markers from a video stream.
- Image Registration Techniques – From a practical perspective, this module was extremely interesting and covered how to mask out portions of images/videos and replace them with the contents of other images or videos, all via OpenCV. Detecting feature points and computing a Homography was the basis for applications like replacements, stitching images together, and more, which was all really interesting stuff.
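To make the colour segmentation point concrete, here is a minimal sketch of why HSV helps. It is not code from the course: it uses Python's standard `colorsys` module on a tiny hand-made image, and the function name `hue_mask` and its thresholds are made up for illustration. In OpenCV itself the same idea is usually expressed with `cv2.cvtColor(img, cv2.COLOR_BGR2HSV)` followed by `cv2.inRange`, keeping in mind that OpenCV stores hue for 8-bit images as 0 to 179 (degrees halved).

```python
import colorsys


def hue_mask(pixels, hue_lo, hue_hi, min_sat=0.4, min_val=0.2):
    """Mark pixels whose hue falls inside [hue_lo, hue_hi] degrees.

    pixels: rows of (r, g, b) tuples with 0-255 components.
    min_sat / min_val are illustrative thresholds that reject
    greys and near-black pixels, whose hue is not meaningful.
    """
    mask = []
    for row in pixels:
        mask_row = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            hue_deg = h * 360.0  # colorsys returns hue in the range 0..1
            keep = hue_lo <= hue_deg <= hue_hi and s >= min_sat and v >= min_val
            mask_row.append(keep)
        mask.append(mask_row)
    return mask


# A 2 x 3 test image: bright red, dark red, bright green,
# blue, light grey, dark green.
image = [
    [(255, 0, 0), (120, 0, 0), (0, 200, 0)],
    [(0, 0, 255), (200, 200, 200), (20, 90, 20)],
]

# Select "green" as a band of hues around 120 degrees.
green = hue_mask(image, 90, 150)
for row in green:
    print(row)
```

The bright and dark greens have very different RGB values but the same hue, so a single hue band picks up both. That is the property that makes HSV convenient for isolating a colour under uneven lighting.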
Deep Neural Networks
Ok, so I don’t want to sound like I am opposed to the DNN modules, or that they weren’t truly amazing, but to talk about what’s interesting in these modules, I need to separate how OpenCV handles Deep Neural Networks from the DNN’s themselves.
For better or for worse (I think better?), Deep Neural Nets are pretty opaque in terms of their functionality in an application. Once you’ve formatted your input image or video to the same specification that the Neural Net was originally trained on, most of the exercise is just to feed frames to the DNN and interpret the results coming out the other side. It’s quite magical when it works.
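As a rough illustration of what "formatting your input" tends to involve, the sketch below converts a tiny BGR image into the batch x channels x height x width layout that many networks expect. It is a plain-Python stand-in, not course code: `to_blob` is a hypothetical helper, resizing is left out, and applying the mean in output channel order is a convention chosen for this sketch. OpenCV's `cv2.dnn.blobFromImage` performs this kind of scaling, mean subtraction, channel swap, and reshaping in a single call.

```python
def to_blob(image_bgr, scale=1.0 / 255.0, mean=(0.0, 0.0, 0.0), swap_rb=True):
    """Convert an H x W x 3 nested list (BGR, 0-255) to 1 x 3 x H x W.

    Each output value is (pixel - mean) * scale. The mean is indexed
    in output channel order, which is an assumption of this sketch.
    """
    height = len(image_bgr)
    width = len(image_bgr[0])
    # Output channel i reads from source channel channel_order[i].
    channel_order = (2, 1, 0) if swap_rb else (0, 1, 2)
    planes = []
    for out_idx, src_idx in enumerate(channel_order):
        plane = [
            [(image_bgr[y][x][src_idx] - mean[out_idx]) * scale
             for x in range(width)]
            for y in range(height)
        ]
        planes.append(plane)
    return [planes]  # wrap in a batch dimension of size 1


# A 2 x 2 BGR frame: blue, green / red, white.
frame = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]

blob = to_blob(frame)
shape = (len(blob), len(blob[0]), len(blob[0][0]), len(blob[0][0][0]))
print(shape)       # batch, channels, height, width
print(blob[0][0])  # first plane is red after the channel swap
```

The specific scale, mean, and channel order depend on how a given model was trained, so those values have to come from the model's documentation rather than from OpenCV.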
In short, here’s what I thought was cool about how OpenCV handles Deep Neural Networks:
- Multiple Framework Support – OpenCV can load models trained and formatted for a variety of Deep Learning frameworks. TensorFlow, PyTorch, Caffe, and DarkNet models are all supported by the API, and the course demonstrates how to load and configure forward inference for a variety of different algorithms and model types. This is pretty cool, as it can be a bit overwhelming to try to parse what the differences between the frameworks are – the short answer in OpenCV is “It doesn’t matter”.
- Common Underlying Architecture and Process – Basically, all of the fundamentals in OpenCV from the “conventional” side of things carry over to the DNN’s as well. So, the process for loading an image, resampling, doing colour conversions, and so forth applies to the Deep Learning side as well. There’s nothing else to learn – once you know what format the Neural Net was trained on, OpenCV makes it easy to convert your image and video data into a compatible format. The course covers how to do all this.
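One way to picture the "it doesn’t matter" point is as a lookup from a weight file's extension to the matching loader in the `cv2.dnn` module. The sketch below is only an illustration under stated assumptions: `pick_loader` and the file names are hypothetical, and it returns loader names as strings instead of calling OpenCV. In practice `cv2.dnn.readNet` does this kind of detection for you, and as far as I understand, PyTorch models usually reach OpenCV through an ONNX export.

```python
from pathlib import Path

# Weight-file extension -> (framework, loader function name in cv2.dnn).
LOADERS = {
    ".caffemodel": ("Caffe", "readNetFromCaffe"),
    ".pb": ("TensorFlow", "readNetFromTensorflow"),
    ".weights": ("Darknet", "readNetFromDarknet"),
    ".onnx": ("ONNX", "readNetFromONNX"),
    ".t7": ("Torch7", "readNetFromTorch"),
}


def pick_loader(model_path):
    """Return (framework, loader name) for a model file, based on its extension."""
    suffix = Path(model_path).suffix.lower()
    if suffix not in LOADERS:
        raise ValueError(f"Unrecognised model format: {suffix!r}")
    return LOADERS[suffix]


# Hypothetical file names, used only to exercise the lookup.
for name in ("detector.weights", "landmarks.caffemodel", "exported_model.onnx"):
    framework, loader = pick_loader(name)
    print(f"{name}: {framework} via cv2.dnn.{loader}")
```

Whichever loader is used, the result is the same kind of network object, which is why the rest of the pipeline (prepare the input, run forward inference, read the outputs) looks the same across frameworks.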
As for the DNN’s themselves, of course there’s a bunch of magical stuff:
- Facial Landmarks – It was possible to imagine how things like SnapChat filters and other AR face applications were brought to life using Deep Neural Networks with the examples provided. I had never really thought this through before, and didn’t understand how effective a DNN would be at solving for these applications.
- YOLO Object Detection – Having seen a lot of YOLO demos in videos, I didn’t find the actual functionality as novel, since I had encountered the outputs before. However, walking through the architecture of how the detector works, and the performance implications of the different sized YOLOv5 models (tiny, medium, large and XL) in terms of speed and accuracy, was quite an interesting study.
- Text Detection and OCR – I also felt this module provided a bunch of food for thought on how to detect text in images and had a very cool application for how to do automatic translation from different languages to English. The foundations were laid for a very cool “auto image translator” app that I would definitely like to explore sometime in the future.
Conclusions
Overall, I thought the course was a fantastic intro to OpenCV for someone who was maybe already a pretty seasoned programmer and had pretty good familiarity with Computer Graphics and how images and videos worked. It hit the sweet spot for me, not having to waste too much time on remedial stuff sifting through the details looking for new information, and at the same time, getting a lot of information and food for thought in a condensed manner. There are a lot of topics I’ll go back and dig into further, and I actually am motivated to go back to OpenCV.org as well for some of their additional training programs.
Rating: 9/10
Loved: Great overview of a wide variety of OpenCV technology and techniques for someone looking to get a high level refresher on the current state of OpenCV.
Would Improve: Some odd exam questions and confusing moments, a fair amount of time dedicated to how to deploy Application demos via the Streamlit framework, which may or may not be necessary for all.
Thanks OpenCV for pulling together a course like this.