Computer vision has received a lot of attention recently due to the popularity of deep learning. Traditional machine learning algorithms are still comparatively simple. Training them requires a lot of domain expertise, as well as human intervention when errors occur. OpenCV is one of the most popular libraries for computer vision.
Every machine learning algorithm takes a dataset as input and learns from it: the algorithm goes through the data and identifies patterns in it. Deep learning algorithms, on the other hand, learn about the task at hand through a network of neurons that maps the task as a hierarchy of concepts.
A gray-scale image buffer stores an image in which each pixel's brightness is represented by a single 8-bit number ranging from 0 (black) to 255 (white).
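As a rough illustration of such a buffer, the sketch below stores a tiny image as one byte per pixel in a Python bytearray. The image size and the helper names (set_pixel, get_pixel, rgb_to_gray) are made up for this example. The RGB conversion uses the ITU-R 601-2 luma weights that Pillow documents for its 'L' mode, so the values should be close to, but not necessarily identical with, what a library returns.

```python
# Hypothetical 4x3 gray-scale image stored as one byte per pixel, row by row.
WIDTH, HEIGHT = 4, 3
buffer = bytearray(WIDTH * HEIGHT)  # every pixel starts at 0 (black)


def set_pixel(buf, x, y, value):
    """Store a brightness value (0-255) at column x, row y."""
    buf[y * WIDTH + x] = value  # a bytearray rejects values outside 0..255


def get_pixel(buf, x, y):
    """Read the brightness value at column x, row y."""
    return buf[y * WIDTH + x]


def rgb_to_gray(r, g, b):
    """Approximate one RGB pixel as a single brightness value (ITU-R 601-2 luma)."""
    return (r * 299 + g * 587 + b * 114) // 1000


set_pixel(buffer, 0, 0, 255)                       # top-left pixel: white
set_pixel(buffer, 1, 0, rgb_to_gray(255, 0, 0))    # pure red becomes a dark gray
```

Because each pixel is a single byte, the whole image needs exactly width x height bytes, and trying to store a value such as 256 raises a ValueError.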
Deep learning algorithms also tend to keep improving as more data is given, which is less typical of traditional machine learning algorithms. The best applications of Google's TensorFlow are the ones that suit deep learning in general. Deep learning is great at pattern recognition and machine perception. It helps classify and cluster data with greater accuracy. A certain group of pixels may signify an edge in an image or some other pattern. Convolutions use this to help identify images.
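To make the idea of a convolution picking out an edge more concrete, here is a minimal sketch in plain Python. The synthetic image, the Sobel-style kernel and the function name convolve2d are choices made for this illustration, not something a particular library prescribes. As in most deep learning code, the kernel is slid over the image without being flipped.

```python
def convolve2d(image, kernel):
    """Slide kernel over image (no padding, no kernel flip) and return the responses."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# Synthetic 4x6 image: dark left half (0), bright right half (255)
image = [[0, 0, 0, 255, 255, 255] for _ in range(4)]

# Sobel-style kernel that responds to vertical edges (dark on the left, bright on the right)
kernel = [[-1, 0, 1],
          [-2, 0, 2],
          [-1, 0, 1]]

response = convolve2d(image, kernel)
for row in response:
    print(row)
```

The response is zero where the window covers a flat region and large where it straddles the dark-to-bright boundary, which is how a group of pixels comes to "signify an edge".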
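OpenCV is one of the most popular libraries for computer vision. It can be installed with pip:
C:\> python -m pip install opencv-python
The cv2.face module used by the training script below is part of the opencv-contrib-python package, so install that package instead if you want to run the script:
C:\> python -m pip install opencv-contrib-python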
The idea is to provide different faces and allow the machine to learn from them.
Machine learning trains algorithms to learn patterns and make predictions from data, so that computers can automate decision-making processes.
A classifier is trained with a few hundred sample views of a particular object (e.g., a face or a car), called positive examples, that are scaled to the same size (say, 20x20), and with negative examples, which are arbitrary images of the same size.
A Haar cascade is a classifier used to detect particular objects in a source image. The haarcascade_frontalface_default.xml file is a Haar cascade shipped with OpenCV to detect frontal faces.
The Haar cascade is trained by superimposing the positive image over a set of negative images. Haar features are good at detecting edges and lines.
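A Haar feature is essentially the difference between the pixel sums of adjacent rectangles, and those sums are usually read from an integral image so that each rectangle costs only four lookups. The sketch below shows one two-rectangle edge feature in plain Python. The function names and the tiny test image are invented for this illustration, and it does not reproduce how OpenCV implements its cascades internally.

```python
def integral_image(image):
    """Return an (h+1) x (w+1) table where ii[y][x] is the sum of image rows < y, columns < x."""
    h, w = len(image), len(image[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += image[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the rectangle with top-left corner (x, y), width w, height h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_edge_feature(ii, x, y, w, h):
    """Right half minus left half of a w x h window (w is assumed to be even)."""
    half = w // 2
    left = rect_sum(ii, x, y, half, h)
    right = rect_sum(ii, x + half, y, half, h)
    return right - left


# 4x4 image with a vertical edge: dark columns (10) on the left, bright (200) on the right
edge_image = [[10, 10, 200, 200] for _ in range(4)]
flat_image = [[50, 50, 50, 50] for _ in range(4)]

edge_value = two_rect_edge_feature(integral_image(edge_image), 0, 0, 4, 4)
flat_value = two_rect_edge_feature(integral_image(flat_image), 0, 0, 4, 4)
print(edge_value, flat_value)
```

A window that straddles an edge gives a large value, while a flat region gives zero. A cascade combines many such features, at different positions and scales, into a sequence of increasingly strict tests.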
# training.py
# Ver 1.02
# 3rd Nov 2018

# Import os for file paths
import os

# Import OpenCV for image processing (cv2.face needs opencv-contrib-python)
import cv2
# Import numpy for matrix calculation
import numpy as np
# Import Python Imaging Library (PIL / Pillow)
from PIL import Image


def assure_path_exists(path):
    # Create the directory part of path if it does not exist yet
    directory = os.path.dirname(path)
    if not os.path.exists(directory):
        os.makedirs(directory)


# Create Local Binary Patterns Histograms recognizer for face recognition
recognizer = cv2.face.LBPHFaceRecognizer_create()
# Use the prebuilt frontal face model for face detection
# (the XML file is expected in the current working directory)
detector = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")


# Create method to get the images and label data
def getImagesAndLabels(path):
    # Get all file paths
    imagePaths = [os.path.join(path, f) for f in os.listdir(path)]
    # Initialize empty face samples
    faceSamples = []
    # Initialize empty IDs
    ids = []
    print("\nTraining started")
    # Loop over all the file paths
    for imagePath in imagePaths:
        # Get the image and convert it to grayscale
        PIL_img = Image.open(imagePath).convert('L')
        # PIL image to numpy array
        img_numpy = np.array(PIL_img, 'uint8')
        # Get the image ID from the file name (second dot-separated field)
        face_id = int(os.path.split(imagePath)[-1].split(".")[1])
        print(face_id)
        # Get the faces from the training image
        faces = detector.detectMultiScale(img_numpy)
        # Loop over each face, append to its respective ID
        for (x, y, w, h) in faces:
            # Add the face region to the face samples
            faceSamples.append(img_numpy[y:y+h, x:x+w])
            # Add the ID to IDs
            ids.append(face_id)
    # Pass the face array and IDs array
    return faceSamples, ids


# Get the faces and IDs
faces, ids = getImagesAndLabels('dataset')
# Train the model using the faces and IDs
recognizer.train(faces, np.array(ids))
# Save the model into trainer/trainer.yml
assure_path_exists('trainer/')
recognizer.save('trainer/trainer.yml')
print("\nTraining completed")
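The script takes each label from the second dot-separated field of the image file name, so the files in the dataset folder have to follow that pattern. The snippet below isolates that one step so it can be tried without OpenCV. The "User" prefix and the example file names are hypothetical; only the position of the ID matters.

```python
import os


def label_from_path(image_path):
    """Return the integer ID stored in the second dot-separated field of the file name."""
    file_name = os.path.split(image_path)[-1]
    return int(file_name.split(".")[1])


# Hypothetical file names of the form <prefix>.<id>.<sample number>.jpg
print(label_from_path("dataset/User.1.7.jpg"))
print(label_from_path("dataset/User.12.3.jpg"))
```

A file that does not follow the pattern, for example photo.jpg, makes int() raise a ValueError, so stray files in the dataset folder will stop the training run.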
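Accuracy can be misleading. Sometimes it may be desirable to select a model with lower accuracy because it has greater predictive power on the problem. For example, when one class is rare, a model that always predicts the common class scores high accuracy yet never finds the cases of interest.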
Accuracy is one metric for evaluating classification models. For the above code, I used only 30 images captured with my webcam. Precision and recall are two further model evaluation metrics. While precision refers to the percentage of results that are relevant, recall refers to the percentage of all relevant results that the algorithm correctly identifies. Unfortunately, it is usually not possible to maximize both metrics at the same time, as improving one tends to come at the cost of the other.
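These metrics are easy to compute by hand from the counts of true and false positives and negatives. The sketch below uses made-up labels (1 for the target face, 0 for anyone else), not results from the training run above, and the function names are chosen for this example. Returning 0.0 when a denominator is zero is a convention assumed here.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false positives, false negatives and true negatives."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + fp + fn + tn)


def precision(tp, fp):
    # Share of predicted positives that are really positive
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    # Share of real positives that were found
    return tp / (tp + fn) if (tp + fn) else 0.0


# Made-up ground truth: 2 positives among 10 samples
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_negative = [0] * 10                     # never predicts the target
some_model = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]    # one hit, one miss, one false alarm

for name, y_pred in [("always negative", always_negative), ("some model", some_model)]:
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    print(name, accuracy(tp, fp, fn, tn), precision(tp, fp), recall(tp, fn))
```

Both sets of predictions reach the same accuracy of 0.8, but only the second one ever finds a positive, which is why accuracy alone says little on imbalanced data.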
And a final touch on image enhancement:
# Enhancing an image in Pillow using ImageEnhance
from PIL import Image, ImageEnhance
# Read image
im = Image.open('old.jpg')
# Display image
im.show()
# Create a contrast enhancer; a factor of 1.0 returns the original image
enh = ImageEnhance.Contrast(im)
# Each factor matches the title of the window it is shown in
enh.enhance(1.1).show("10% more contrast")
enh.enhance(1.2).show("20% more contrast")
enh.enhance(1.25).show("25% more contrast")
enh.enhance(1.3).show("30% more contrast")
enh.enhance(1.35).show("35% more contrast")
Google's AutoML service dramatically reduces the steps involved in training and tuning a machine learning model. AutoML on Google Cloud is available for translation, natural language, and vision.
We are still not even close to solving computer vision. However, multiple enterprises have already found ways to apply CV systems powered by convolutional neural networks (CNNs) to real-world problems, and this trend is not likely to stop anytime soon.
Generative adversarial networks (GANs) can be used to generate photo-realistic images, reconstruct damaged images, and remove blurring. With these new techniques and rapidly improving capabilities, computer vision (CV) is progressing toward solving certain security challenges. AI could watch cameras, monitor people and patterns, and look for indicators of security concern. AI could systematically search a field of view for objects of interest and look for anomalies. With more active researchers in the field, we can expect to see far more accurate and reliable computer vision in the near future.