Why do you need a face mask detector?

Challenges Under the Pandemic

The United States has recorded more COVID-19 cases than any other country. Even though the CDC recommends that the public wear a mask when in close contact with others to prevent further spread, many individuals are still not following this guidance. Many businesses are taking heavy losses during this time while also being the ones that must enforce temporary mask policies, both to keep employees and customers safe and to protect what business they have left. Wearing a mask is one thing, but it is also important to distinguish whether the mask is worn correctly, since correct use maximizes protection against catching or spreading the disease.

How do we resolve this challenge?

In response to this challenge, we created a face mask detector that not only detects the presence of a mask but also evaluates whether the person is wearing it correctly! This face mask detector could help business owners monitor and ensure that customers wear a mask at all times inside the store. More importantly, it could help protect the safety of you, your employees, and your customers. Please read on to learn more about our face mask detector.

Dataset: MaskedFace-Net

Our training dataset makes our model unique. Prior to developing our face mask detector, we researched many existing face mask detectors developed by other teams. We found that a common disadvantage among these models was a lack of sufficient training data. Moreover, many face mask detectors are restricted to binary classification because their datasets only contain "masked" and "not masked" images. To achieve our goal, we needed a more diverse dataset that contains images of people wearing a mask correctly, wearing a mask incorrectly, and not wearing a mask at all. We overcame this issue by using the MaskedFace-Net dataset, which was made available on Kaggle.

Click to access dataset

Sample Images

Correctly Masked

Incorrectly Masked

Missing Mask

A Reliable Dataset Makes a Good Detector

In this dataset, we have over 50,000 headshot images in three categories: people wearing a mask correctly, incorrectly, or not at all. These 50,000 images are split into three sets (training, validation, and test), as shown in the dataset breakdown below. The breakdown also shows that the categories are evenly distributed, each making up roughly one third of every set. A large, labelled dataset that matches the problem at hand and is fairly evenly distributed made this dataset stand out from the alternatives.
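
The kind of class-balanced split described above can be sketched as follows. This is only an illustration, not the project's actual preprocessing code: the 70/15/15 percentages, the label names, and the `stratified_split` helper are all assumptions made for the example.

```python
import random
from collections import Counter

def stratified_split(items, percents=(70, 15, 15), seed=0):
    """Split (path, label) pairs into train/val/test, class by class.

    The percentages are hypothetical; the source does not state the real ones.
    """
    rng = random.Random(seed)
    by_label = {}
    for path, label in items:
        by_label.setdefault(label, []).append((path, label))

    train, val, test = [], [], []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        n_train = len(group) * percents[0] // 100
        n_val = len(group) * percents[1] // 100
        train += group[:n_train]
        val += group[n_train:n_train + n_val]
        test += group[n_train + n_val:]  # remainder goes to the test set
    return train, val, test

# Hypothetical file names and labels, 100 images per class.
labels = ("correct", "incorrect", "no_mask")
items = [(f"{label}_{i}.jpg", label) for label in labels for i in range(100)]
train, val, test = stratified_split(items)
train_counts = Counter(label for _, label in train)
print(len(train), len(val), len(test), dict(train_counts))
```

Splitting each class separately is what keeps every set at roughly one third per category, instead of relying on a single random shuffle to balance them.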

Dataset Breakdown

Training Set Breakdown

Validation Set Breakdown

Testing Set Breakdown

Our Product

Convolutional Neural Network

Our face mask detector uses a pre-trained ResNet50 model, a convolutional neural network that is 50 layers deep. ResNet50 is a powerful neural network for classifying images into categories. Its structure is made up of two parts: the first part is feature extraction, where the model learns the features in an image, and the second part is classification, where the model makes a prediction based on the features extracted by the first part. In our case, we fine-tuned a pre-trained ResNet50 for our task. This means that during training we keep the pre-trained feature-extraction layers fixed, so we don't change how the model learns the features. Instead, we only train the part that makes the prediction. We believe this is a feasible approach to achieving high accuracy.
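
The idea of keeping the feature extractor fixed and training only the classifier can be illustrated with a toy example. This is a conceptual sketch in plain Python, not the project's ResNet50 code: `FROZEN_W` is a hypothetical stand-in for the pre-trained layers, and the three 2-D samples are made up.

```python
import math

# Hypothetical frozen "backbone": fixed weights standing in for the
# pre-trained convolutional layers. Training never modifies them.
FROZEN_W = ((1.0, 0.0), (0.0, 1.0), (-1.0, -1.0))

def extract_features(x):
    """Frozen feature extractor: ReLU(W x)."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in FROZEN_W]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(head, bias, x):
    feats = extract_features(x)
    return softmax([sum(w * f for w, f in zip(row, feats)) + b
                    for row, b in zip(head, bias)])

def mean_loss(head, bias, samples, labels):
    """Average cross-entropy loss over the samples."""
    losses = [-math.log(predict_proba(head, bias, x)[y])
              for x, y in zip(samples, labels)]
    return sum(losses) / len(losses)

def train_head(samples, labels, n_classes=3, lr=0.1, epochs=200):
    """Train only the classification head with plain SGD."""
    head = [[0.0] * len(FROZEN_W) for _ in range(n_classes)]
    bias = [0.0] * n_classes
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            feats = extract_features(x)
            probs = predict_proba(head, bias, x)
            for c in range(n_classes):
                grad = probs[c] - (1.0 if c == y else 0.0)  # d(loss)/d(logit_c)
                bias[c] -= lr * grad
                for j, f in enumerate(feats):
                    head[c][j] -= lr * grad * f
    return head, bias

# Made-up samples, one per class (0, 1, 2).
samples = [[2.0, 0.0], [0.0, 2.0], [-2.0, -2.0]]
labels = [0, 1, 2]
head, bias = train_head(samples, labels)
predictions = []
for x in samples:
    probs = predict_proba(head, bias, x)
    predictions.append(max(range(len(probs)), key=probs.__getitem__))
print(predictions, round(mean_loss(head, bias, samples, labels), 4))
```

Only `head` and `bias` receive gradient updates; the extractor weights stay exactly as they were, which mirrors training only the final classification layer of a pre-trained network.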

Grad-CAM

Grad-CAM is an Explainable AI (XAI) method that aims to identify which features a model relies on when predicting an output. The output of Grad-CAM is a heatmap that highlights the regions the model considers important. The heatmap is generated from the last convolutional layer, whose outputs are known as feature maps; these represent learned features that the model uses to predict outputs. To determine a weight for each feature map, the gradient of the predicted class score with respect to that feature map is averaged over all spatial positions. The importance is then visualized by computing the weighted sum of the feature maps and passing it through an activation function, ReLU.
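
The weighting and ReLU steps can be sketched in a few lines of plain Python. The sketch assumes the feature maps and the gradients of the class score with respect to them have already been extracted from the network; the `grad_cam` helper and the tiny 2x2 inputs are hypothetical and only meant to show the arithmetic.

```python
def grad_cam(feature_maps, gradients):
    """Combine K feature maps (each H x W) into one Grad-CAM heatmap.

    feature_maps[k][i][j]: activation of channel k at position (i, j)
    gradients[k][i][j]:    d(class score) / d(feature_maps[k][i][j])
    """
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])

    # One weight per channel: the spatial average of its gradients.
    weights = [sum(sum(row) for row in g) / (height * width) for g in gradients]

    # Weighted sum of the feature maps.
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, weights):
        for i in range(height):
            for j in range(width):
                cam[i][j] += weight * fmap[i][j]

    # ReLU keeps only regions that push the class score up.
    return [[max(0.0, value) for value in row] for row in cam]

# Made-up 2x2 example with two channels.
feature_maps = [[[1, 0], [0, 1]], [[0, 2], [2, 0]]]
gradients = [[[1, 1], [1, 1]], [[-1, -1], [-1, -1]]]
heatmap = grad_cam(feature_maps, gradients)
print(heatmap)
```

In practice the resulting heatmap is upsampled to the input image size and overlaid on the image; that step is omitted here.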

Integrated Gradient

We also use Integrated Gradients to check that our model is behaving the way we want it to. Integrated Gradients is another visualization method: given an input and a neural network function, it assigns an importance score to each input feature. Below is a formula for obtaining the integrated gradient:
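
In its standard form, the integrated gradient of feature i can be written as follows. The notation is assumed here for illustration: F is the network function, x the input, x' the baseline input, and α the interpolation coefficient between the two.

```latex
\mathrm{IG}_i(x) = (x_i - x'_i) \int_{0}^{1}
  \frac{\partial F\bigl(x' + \alpha\,(x - x')\bigr)}{\partial x_i}\, d\alpha
```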

We can divide the formula into 3 steps:
1. Compute the gradient of the function output with respect to feature i.
2. Integrate the gradients along the path from the baseline input to the original input to avoid the saturation problem (meaning some features might have small gradients even if they are important).
3. Multiply the result by the difference between the original input and the baseline input (a blank or black image used to represent the absence of features) to get the feature importance score, which is the integrated gradient.
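
The three steps above can be sketched numerically by replacing the integral with a finite sum. This is a minimal illustration under stated assumptions, not the project's implementation: the `integrated_gradients` helper, the toy function `f`, and its hand-written gradient `grad_f` are hypothetical stand-ins for a real network and its automatic differentiation.

```python
def integrated_gradients(grad_fn, x, baseline, steps=50):
    """Approximate Integrated Gradients with a midpoint Riemann sum."""
    totals = [0.0] * len(x)
    for k in range(1, steps + 1):
        alpha = (k - 0.5) / steps
        # Point on the straight line from the baseline to the input.
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        # Step 1: gradient of the output with respect to each feature.
        grads = grad_fn(point)
        # Step 2: accumulate the gradients along the path.
        for i, g in enumerate(grads):
            totals[i] += g
    # Step 3: scale the averaged gradient by (input - baseline).
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, totals)]

# Toy stand-in for a network: f(x) = x0^2 + 3*x1, with its exact gradient.
def f(x):
    return x[0] ** 2 + 3.0 * x[1]

def grad_f(x):
    return [2.0 * x[0], 3.0]

x = [2.0, 1.0]
baseline = [0.0, 0.0]
attributions = integrated_gradients(grad_f, x, baseline)
print(attributions, f(x) - f(baseline))
```

A useful sanity check is that the attributions add up to approximately the difference between the function output at the input and at the baseline.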

Result

Under CNN Model?

After training our model on the training set, we evaluated it on our validation set and measured an accuracy of 95%. We then evaluated the model on the training set and on the test set and measured accuracies of 88% and 95%, respectively. Although it is slightly concerning that our model performed worse on the data it was trained on (which we attribute primarily to overfitting and address in the report), our findings show that our model performs fairly well on unseen data (both the validation and test sets).

Under Grad-CAM?

Since Grad-CAM is purely for visualization purposes, we provide an example case from running the model on the validation set. We will briefly go over each image; if you're curious, check out our report, where we cover this algorithm in more detail. The first image is a heatmap generated by Grad-CAM. The mask is the class/object of interest in our model, and although Grad-CAM doesn't provide labels, this image is most likely to be classified as correctly worn because the mask fully covers the face. The second image is generated with Grad-CAM and Guided Backpropagation. It shows the highlighted pixels of the original image together with extra highlighting from Grad-CAM, which assigns some importance to the head but marks the pixels around the mask as the most important. The last image is generated solely with Guided Backpropagation. It reveals the areas of interest by highlighting pixels of the original image, which in this case outline a headshot.

Under Integrated Gradient?

Likewise, since Integrated Gradients is purely for visualization purposes, we provide an example case from running the model on the validation set. Again, we will briefly go over each image; if you're curious, check out our report, where we cover this algorithm in more detail. In this example, Integrated Gradients reveals two important individual features/objects, the first being a mask and the second being a face. The mask is relatively clearer in the second image, mostly because of the importance of highlighting the entire face when detecting it as the object.

Face Mask Detector and Visualization Tool Kit Demo

We prepared three demos for you in this section. Please click on the images below to see the associated outputs.

Input Image

Original Image

Grad-CAM Output

Classification: ???

Integrated Gradient Output

Conclusion

Our product addresses a concern shared by people across the world and points out a common bad habit among the general public: wearing a mask incorrectly, which has jeopardized many businesses. We attempted to mitigate this issue by creating a face mask detector that helps business owners comply with the law and keep their businesses running. Moreover, we created the face mask detector to protect the safety and health of business owners, employees, and customers during the pandemic. Our approach to building such a face mask detector was successful, and the detector works as intended. We hope that more face mask detectors can be deployed so that business owners can survive this difficult time and people can protect not only themselves but also others.

User Manual

If you're interested in our product, please visit our GitHub repository.
You can access the Face Mask Detector and Visualization Tool Kit by cloning this repository.
Please follow the instructions in the README to set up the working environment.
Please contact us if you encounter any issues with installation.
Thank you so much for your interest in our product! We look forward to hearing your feedback!

GitHub

About Us

Gavin Tran

UCSD 4th Year Data Science

Contact: gatran@ucsd.edu

Athena Liu

UCSD 4th Year Data Science

Contact: atl074@ucsd.edu

Jerry Lin

UCSD 4th Year Data Science

Contact: chl820@ucsd.edu