
Computer vision


Computer vision has received a lot of attention recently due to the popularity of deep learning. Traditional machine learning algorithms are comparatively simple: their training requires a lot of domain expertise and human intervention when errors occur. OpenCV is one of the most popular libraries for computer vision.

Every machine learning algorithm takes a dataset as input and learns from this data: it goes through the data and identifies patterns in it. Deep learning algorithms, on the other hand, learn about the task at hand through a network of neurons that map the task as a hierarchy of concepts.

A gray-scale image is stored as a buffer in which each pixel's brightness is represented by a single 8-bit number, ranging from 0 (black) to 255 (white).
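To see this concretely, here is a minimal sketch that loads an image in gray-scale with OpenCV and inspects the pixel values. The file name sample.jpg is just a placeholder, not part of the project below.

# A minimal sketch: load an image as gray-scale and inspect the 8-bit pixel values
# 'sample.jpg' is a placeholder file name
import cv2

gray = cv2.imread('sample.jpg', cv2.IMREAD_GRAYSCALE)
print(gray.shape)               # (height, width) - a single channel
print(gray.dtype)               # uint8 - each pixel is one 8-bit number
print(gray.min(), gray.max())   # values lie between 0 (black) and 255 (white)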


Deep learning algorithms also keep improving as they are given more data, which is not typical of traditional machine learning algorithms. Frameworks such as Google's TensorFlow have made deep learning practical to apply. Deep learning is great at pattern recognition and machine perception, and it helps classify and cluster data with greater accuracy. A certain group of pixels may signify an edge in an image or some other pattern, and convolutions use this to help identify images.
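To make the convolution idea concrete, here is a minimal sketch that applies a simple edge-detecting kernel to a gray-scale image with OpenCV. The kernel values and the file names are illustration choices, not part of the project below.

# A minimal sketch: apply an edge-detecting convolution kernel to a gray-scale image
# 'sample.jpg' and 'edges.jpg' are placeholder file names
import cv2
import numpy as np

gray = cv2.imread('sample.jpg', cv2.IMREAD_GRAYSCALE)

# A Laplacian-style kernel: it responds strongly where neighbouring
# pixel values change sharply, i.e. at edges
kernel = np.array([[0,  1, 0],
                   [1, -4, 1],
                   [0,  1, 0]], dtype=np.float32)

edges = cv2.filter2D(gray, -1, kernel)
cv2.imwrite('edges.jpg', edges)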

OpenCV is one of the most popular libraries for computer vision. It can be installed with pip:

C:\> python -m pip install opencv-python
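A quick way to check that the installation worked is to import the module and print its version. Note that the cv2.face module used in the training script further below ships with the opencv-contrib-python package rather than the base opencv-python package.

# Quick check that OpenCV is installed
import cv2
print(cv2.__version__)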







The idea is to provide different faces and allow the machine to learn.

Machine learning means training algorithms to learn patterns and make predictions from data, letting computers automate decision-making processes.

A classifier is trained with a few hundred sample views of a particular object (e.g., a face), called positive examples, that are scaled to the same size (say, 20x20), and with negative examples - arbitrary images of the same size.

A Haar Cascade is a classifier used to detect particular objects in a source image. The haarcascade_frontalface_default.xml file is a Haar cascade provided by OpenCV to detect frontal faces.
The Haar Cascade is trained by superimposing the positive images over a set of negative images. Haar features are good at detecting edges and lines.
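As a minimal sketch of how the cascade is used on its own, assuming haarcascade_frontalface_default.xml sits in the working directory and people.jpg is a placeholder image:

# A minimal sketch: detect frontal faces in a single image with the Haar cascade
# 'people.jpg' and 'faces_detected.jpg' are placeholder file names
import cv2

detector = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")
img = cv2.imread('people.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Returns one (x, y, width, height) rectangle per detected face
faces = detector.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5)

for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)

cv2.imwrite('faces_detected.jpg', img)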


# training.py
# Ver 1.02
# 3rd Nov 2018
# Import OpenCV2 for image processing
# Import os for file path
#

import cv2, os

# Import numpy for matrix calculation
import numpy as np

# Import Python Image Library (PIL)
from PIL import Image


def assure_path_exists(path):
    directory = os.path.dirname(path)
    if not os.path.exists(directory):
        os.makedirs(directory)

# Create Local Binary Patterns Histograms for face recognition
recognizer = cv2.face.LBPHFaceRecognizer_create()

# Using prebuilt frontal face training model, for face detection
detector = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")

# Create method to get the images and label data
def getImagesAndLabels(path):

    # Get all file path
    imagePaths = [os.path.join(path,f) for f in os.listdir(path)]
 
    # Initialize empty face sample
    faceSamples=[]
 
    # Initialize empty id
    ids = []
    print("\nTraining started")
    # Loop all the file path
    for imagePath in imagePaths:

        # Get the image and convert it to grayscale
        PIL_img = Image.open(imagePath).convert('L')

        # PIL image to numpy array
        img_numpy = np.array(PIL_img,'uint8')

        # Get the user id from the file name (assumes names like User.<id>.<sample>.jpg)
        id = int(os.path.split(imagePath)[-1].split(".")[1])

        print(id)
     
        # Get the face from the training images
        faces = detector.detectMultiScale(img_numpy)

        # Loop for each face, append to their respective ID
        for (x,y,w,h) in faces:

            # Add the image to face samples
            faceSamples.append(img_numpy[y:y+h,x:x+w])

            # Add the ID to IDs
            ids.append(id)

    # Pass the face array and IDs array
    return faceSamples,ids


# Get the faces and IDs
faces,ids = getImagesAndLabels('dataset')

# Train the model using the faces and IDs
recognizer.train(faces, np.array(ids))

# Save the model into trainer.yml
assure_path_exists('trainer/')
recognizer.save('trainer/trainer.yml')
print("\nTraining completed")
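Once trainer.yml has been saved, it can be loaded back for recognition. The sketch below is an assumed usage example, not part of training.py: the webcam capture and the single-frame handling are illustration choices.

# A minimal sketch: load the trained model and predict the id of a detected face
# (assumed usage, not part of training.py)
import cv2

recognizer = cv2.face.LBPHFaceRecognizer_create()
recognizer.read('trainer/trainer.yml')
detector = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")

cam = cv2.VideoCapture(0)
ret, frame = cam.read()
if ret:
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in detector.detectMultiScale(gray):
        # predict() returns the id learned during training and a distance-like
        # confidence value (lower means a closer match)
        face_id, confidence = recognizer.predict(gray[y:y+h, x:x+w])
        print(face_id, confidence)
cam.release()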

As we can see from the above example, accuracy can be misleading. Sometimes it may be desirable to select a model with lower accuracy because it has greater predictive power on the problem.

Accuracy is one metric for evaluating classification models. For the above code, I used only 30 images captured with my webcam. Precision and recall are two further model evaluation metrics. While precision refers to the percentage of returned results that are relevant, recall refers to the percentage of the total relevant results that the algorithm correctly classifies. Unfortunately, it is usually not possible to maximize both of these metrics at the same time, as one comes at the cost of the other.
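To make the two metrics concrete, here is a minimal sketch that computes precision and recall for a toy set of labels. The values of y_true and y_pred are made-up illustration data, not results from the model above.

# A minimal sketch: precision and recall for a binary classifier
# y_true and y_pred are made-up illustration values
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

precision = tp / (tp + fp)   # of everything predicted positive, how much was relevant
recall = tp / (tp + fn)      # of everything actually positive, how much was found

print(precision, recall)     # 0.8 0.8 for the toy values above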

And a final touch on image enhancement:

# Enhancing an image in Pillow using ImageEnhance

from PIL import Image, ImageEnhance

# Read image
im = Image.open('old.jpg')

# Display image
im.show()

# Increase the contrast by progressively larger factors
enh = ImageEnhance.Contrast(im)
enh.enhance(1.1).show("10% more contrast")
enh.enhance(1.2).show("20% more contrast")
enh.enhance(1.25).show("25% more contrast")
enh.enhance(1.3).show("30% more contrast")
enh.enhance(1.35).show("35% more contrast")
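Pillow's ImageFilter module also offers ready-made filters; a minimal sketch applying a few of them to the same image:

# A minimal sketch: applying ready-made Pillow filters to the same image
from PIL import Image, ImageFilter

im = Image.open('old.jpg')
im.filter(ImageFilter.SHARPEN).show()       # sharpen details
im.filter(ImageFilter.SMOOTH).show()        # soften noise
im.filter(ImageFilter.EDGE_ENHANCE).show()  # emphasise edges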

Google's AutoML service dramatically reduces the steps involved in training and tuning a machine learning model. AutoML on Google Cloud is available for translation, natural language, and vision.

We are still not even close to solving computer vision. However, multiple enterprises have already found ways to apply CV systems, powered by CNNs, to real-world problems, and this trend is not likely to stop anytime soon.

GANs (generative adversarial networks) can be used to generate photo-realistic images, reconstruct damaged images, and remove blurring. With these new techniques and rapidly improving capabilities, computer vision (CV) is progressing toward solving certain security challenges. AI could watch cameras, monitor people and patterns, and look for indicators of security concern. It could systematically search a field of view for objects of interest and look for anomalies. With more active researchers in the field, we can expect far more accurate and reliable computer vision in the near future.






