ResNet Winner of the 2015 ImageNet Large Scale Visual Recognition Challenge (ILSVRC)
Ivy | ResNet Prediction |
---|---|
![]() | Ivy is a Golden Retriever. I am 96.0% sure of this. I am ResNet. |
This pretrained deep learning network predicts dog breeds even better than AlexNet.
You can find the latest version of this work in my Git Repo on DagsHub.
I am writing this during my review of PyTorch, where I am mostly using Deep Learning with PyTorch while also exploring other educational materials for PyTorch.
I highly recommend Deep Learning with PyTorch. It is amazingly well written and well organized. While it is not my only source for reviewing PyTorch, it is my central source.
To benefit from PyTorch as quickly as possible, we want to use pretrained models. Predefined models are available in torchvision.models. See AlexNet.md to learn more about exploring the PyTorch models in torchvision.models.
Import stuff - nuff said.
import torch
from torchvision import models
from torchvision import transforms
import urllib.request
import os
from PIL import Image
We load the pretrained instance of ResNet that has 101 layers. Most of these layers are convolutional layers.
# That next command took 35 seconds for my little machine to load!
resnet = models.resnet101(pretrained=True)
If you'd like to see a ton of information about this instance of ResNet, just uncomment the print statement below in ResNet.py.
# To see details about the architecture ...
# print(resnet)
All machine learning problems are math problems. The numbers in the image file need to be conditioned in a consistent way before they are fed to resnet101. PyTorch is awesome in the way that it provides many helpful routines to condition image tensor data into the form we need. In fact, the method calls below are so clear about what they are doing that we don't even need to go over them. See AlexNet.md to learn more about this code block. It is explained in more detail there.
For those of you that don't understand the last method - transforms.Normalize, do take the time to learn what image and dataset normalization is and how you would use the mean and standard deviation in such processes.
# The preprocess function below returns a tensor
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225])])
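To make transforms.Normalize less of a black box, here is a minimal plain-Python sketch of the per-channel arithmetic it applies: subtract the channel mean, then divide by the channel standard deviation. The helper name `normalize_pixel` and the sample pixels are made up for illustration; the mean and std values are the ones used in the block above.

```python
# Per-channel normalization: (value - mean) / std, one channel at a time.
# normalize_pixel and the sample pixels are hypothetical, for illustration only.
MEAN = [0.485, 0.456, 0.406]  # RGB means from the preprocess block
STD = [0.229, 0.224, 0.225]   # RGB standard deviations from the preprocess block


def normalize_pixel(rgb, mean=MEAN, std=STD):
    """Normalize one RGB pixel whose values are already scaled to [0, 1]."""
    return [(value - m) / s for value, m, s in zip(rgb, mean, std)]


# A pixel equal to the channel means lands at zero in every channel.
print(normalize_pixel([0.485, 0.456, 0.406]))
# A pure white pixel ends up more than two standard deviations above zero.
print(normalize_pixel([1.0, 1.0, 1.0]))
```

The real transform does the same thing to every pixel of the tensor at once, which is why ToTensor (which scales values to the 0 to 1 range) has to come before Normalize.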
Let's go get an image for Bobby, the Golden Retriever, from the web ... ONE TIME!
# No need to reload the bobby dog image every time.
file_name = "bobby.jpg"
if not os.path.exists(file_name):
    print("\nSaving the bobby image from the net.")
    url = "https://drek4537l1klr.cloudfront.net/stevens2"
    url += "/v-11/Figures/p1ch2_bobby.jpg"
    # Python 3 only: urlretrieve lives in urllib.request
    # (needs "import urllib.request" at the top of the file).
    urllib.request.urlretrieve(url, file_name)
else:
    print("\nUsing the existing bobby.jpg image.")
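The download-once pattern above shows up more than once in this walkthrough, so it could be wrapped in a small helper. This is only a sketch: `fetch_once` and `fake_downloader` are hypothetical names that are not part of ResNet.py, and the example.com URL is a placeholder. The downloader is passed in as a parameter so the caching logic can be tried without touching the network.

```python
import os
import tempfile
import urllib.request


def fetch_once(url, file_name, downloader=urllib.request.urlretrieve):
    """Download url to file_name unless the file already exists.

    Returns True if a download happened, False if the cached file was used.
    fetch_once is a hypothetical helper, not part of ResNet.py.
    """
    if os.path.exists(file_name):
        return False
    downloader(url, file_name)
    return True


# Offline demonstration with a stand-in downloader that just writes a file.
def fake_downloader(url, file_name):
    with open(file_name, "w") as f:
        f.write("pretend image bytes from " + url)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "bobby.jpg")
    first = fetch_once("https://example.com/bobby.jpg", target, fake_downloader)
    second = fetch_once("https://example.com/bobby.jpg", target, fake_downloader)

print(first, second)  # True False
```

With the default downloader, a call such as `fetch_once(url, file_name)` would behave like the if/else block above, minus the print statements.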
Next, we simply use the Image class from the Python Imaging Library to open the image. That object can be fed directly to the preprocess function we defined above, which converts it to a tensor. Then we unsqueeze that tensor to give it a batch dimension. Please see the PyTorch documentation for more on what unsqueeze does.
img = Image.open(file_name) # img.show() IF you want to see bobby.jpg
img_tensor = preprocess(img)
batch_tensor = torch.unsqueeze(img_tensor, 0)
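For a quick feel of what unsqueeze does here, the sketch below uses a tensor of zeros as a stand-in for the preprocessed image (an assumption made so the example runs without bobby.jpg). The torchvision models expect input shaped as (batch, channels, height, width), so a single image needs an extra leading dimension of size 1.

```python
import torch

# Stand-in for a preprocessed image: 3 channels, 224 x 224 pixels.
img_tensor = torch.zeros(3, 224, 224)

# unsqueeze(x, 0) inserts a new dimension of size 1 at position 0,
# turning one image into a batch that holds one image.
batch_tensor = torch.unsqueeze(img_tensor, 0)

print(img_tensor.shape)    # torch.Size([3, 224, 224])
print(batch_tensor.shape)  # torch.Size([1, 3, 224, 224])
```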
Let's remember that our resnet object is an instance of a class. Before we use it to make inferences (we give it an image and resnet infers what dog breed is in the image), we must put it in eval mode so that layers such as batch normalization behave the way they should at inference time. Now we can feed resnet the tensor data of the image and assign the inference to out.
resnet.eval()
out = resnet(batch_tensor) # print(out) to see resnet out output
The data in out is large, and we need to post process it quite a bit before it will talk human to us, so let's get started on that. One thing we need to obtain, ONLY ONCE, is the imagenet classes text file that relates class index numbers to class names, the dog breeds among them.
# No need to reload imagenet classes text file each time
if not os.path.exists("imagenet_classes.txt"):
    print("\nObtaining the imagenet classes text file.\n")
    keys = "https://raw.githubusercontent.com/pytorch/"
    keys += "hub/master/imagenet_classes.txt"
    # urlretrieve avoids depending on an external wget binary
    urllib.request.urlretrieve(keys, "imagenet_classes.txt")
else:
    print("\nThe imagenet classes text file exists.\n")
Once we have the imagenet classes text file, we can obtain the labels.
with open('imagenet_classes.txt') as f:
    labels = [line.strip() for line in f.readlines()]
Next, we find the index value for the prediction that had the maximum confidence. We provide that confidence as a percentage and also get the corresponding dog breed label.
_, index = torch.max(out, 1)
percentage = torch.nn.functional.softmax(out, dim=1)[0] * 100
top_label = labels[index[0]]
percent_conf = f'{round(percentage[index[0]].item(), 2)}%'
print(f'Bobby is a {top_label}. I am {percent_conf} sure of this. I am ResNet.\n')
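The softmax step is what turns the raw scores in out into percentages. Here is a minimal plain-Python sketch of that arithmetic, using made-up scores for three classes rather than real ResNet output; `softmax_percent` is a hypothetical helper name.

```python
import math


def softmax_percent(scores):
    """Convert raw scores to percentages that sum to 100."""
    # Subtracting the max keeps exp() from overflowing; the result is unchanged.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [100 * e / total for e in exps]


# Hypothetical raw scores for three classes (not real ResNet output).
scores = [2.0, 1.0, 0.1]
percentages = softmax_percent(scores)
best = max(range(len(scores)), key=lambda i: scores[i])

print(best)                        # 0
print(round(sum(percentages), 6))  # 100.0
```

Softmax keeps the ordering of the scores, so the index of the largest raw score is also the index of the largest percentage. That is why the code above can take the max of out directly and only use softmax for the confidence value.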
But hey, we might want to see the top n predictions too. No problem. Let's sort the out object.
_, indices = torch.sort(out, descending=True)
Now we can use a slick list comprehension to provide labels and their confidence levels.
print("Here is more of my analyses.")
[print(f'\t{(labels[idx], round(percentage[idx].item(), 2))}%')
    for idx in indices[0][:5]]
print()
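As an alternative to sorting everything and slicing, torch.topk returns just the k largest values together with their indices. The sketch below uses made-up labels and percentages (not real ResNet output) so that it runs on its own.

```python
import torch

# Hypothetical percentages for four made-up classes (not real ResNet output).
labels = ["class_a", "class_b", "class_c", "class_d"]
percentage = torch.tensor([5.0, 70.0, 10.0, 15.0])

# topk returns the k largest values and their indices, largest first.
values, indices = torch.topk(percentage, k=2)

top = [(labels[i], v) for i, v in zip(indices.tolist(), values.tolist())]
print(top)  # [('class_b', 70.0), ('class_d', 15.0)]
```

Either approach gives the same top n. Sorting is handy if you want the full ranking, while topk says more directly what is being asked for.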
All of this was done slightly differently in the AlexNet.md work, which was the original step in this PyTorch journey.
We've loaded the image of Bobby the Golden Retriever. Let's see if ResNet gives a higher confidence prediction that Bobby is a Golden Retriever.
Bobby | ResNet Results |
---|---|
![]() | Golden Retriever, 96.73% Certainty. My Other Guesses: Labrador Retriever, 2.55%; Cocker Spaniel, 0.21%; Redbone, 0.17%; Tennis Ball, 0.1% |
Wow! This is even stronger than AlexNet was, but if you've read AlexNet.md, you'll remember that I tested AlexNet on my dogs too. And two of my dogs are challenging. Let's see how well ResNet does compared to AlexNet.
But first, let's go over the code that we'll use to have ResNet predict the breeds of Thom's dogs. The code below is simply a more compact version of the previous code, but this block of code in the for loop will not repeat operations that do not need repeating.
print("\nPredictions for Thom's Dogs\n")
for file_name in ("Ambrose.jpg", "Ivy.jpg", "Caleb.jpg",
                  "Sadie_1.jpg", "Sadie_2.jpg"):
    img = Image.open(file_name)  # img.show() IF you want to see the image
    img_tensor = preprocess(img)
    batch_tensor = torch.unsqueeze(img_tensor, 0)
    out = resnet(batch_tensor)  # print(out) to see resnet out output
    _, index = torch.max(out, 1)
    percentage = torch.nn.functional.softmax(out, dim=1)[0] * 100
    top_label = labels[index[0]]
    percent_conf = f'{round(percentage[index[0]].item(), 2)}%'
    dog = file_name.split(".")[0]
    print(f'{dog} is a {top_label}. I am {percent_conf} sure of this.')
    _, indices = torch.sort(out, descending=True)
    print("Here are more of my analyses.")
    [print(f'\t{(labels[idx], round(percentage[idx].item(), 2))}%')
        for idx in indices[0][:5]]
    print()
The above code looks at images of my 4 dogs, with one of them photographed from two angles.
Dog Name | Breed |
---|---|
Ambrose | Mostly Border Collie (1/8th Australian Shepherd) |
Ivy | "Cream Colored" Golden Retriever (Thought she might be a challenge) |
Caleb | Old English Mastiff (NOT AKC - He is shaped more like a Great Dane and is brindle coated) |
Sadie | Havanese (Expected issues here too due to being a rare breed) |
The last for loop in ResNet.py is just like the first for loop, but we are using the last for loop for additional classifications.
The following explanations of my dogs exactly match those in AlexNet.md. Only the results have changed in accordance with ResNet's predictions.
Ambrose is the smartest dog BY FAR that I've ever had. I could tell stories. AND I promise I don't beat him. He is being camera shy with his ears down.
Ambrose | ResNet Results |
---|---|
![]() | Border Collie, 61.17% Certainty. My Other Guesses: Border Collie, 61.17%; Collie, 18.52%; Japanese Spaniel, 6.62%; Bernese Mountain Dog, 4.08%; Blenheim Spaniel, 3.76% |
Correct and MUCH STRONGER than good ole AlexNet! Even the other predictions are better this time.
Ivy is also smart, BUT she's far more defiant than Ambrose, and she is not as concerned to please. Actually, for a one-year-old, she is very obedient and wants to please. It's just that Ambrose is off the charts good in those areas.
Ivy | ResNet Results |
---|---|
![]() | Golden Retriever, 96.0% Certainty. My Other Guesses: Kuvasz, 1.11%; Doormat, 0.92%; Labrador Retriever, 0.7%; Cocker Spaniel, 0.2% |
Correct and MUCH STRONGER than AlexNet again! The other ones are so low, they are not even worth investigating. However, the doormat prediction is interesting. Perhaps the training images include Golden Retrievers sleeping on doormats?
Caleb is a noble creature, and if you are wanting to hug him, get in line. He is HUGE! However, since he is not AKC, and his parents were shaped quite a bit like Great Danes, he looks a bit like a Great Dane. I was actually expecting AlexNet to predict this. Also, for those of you that knew and loved Caleb too, he is now gone from this world. He is waiting for me in a better place. I will be glad when I see him again.
Caleb | ResNet Results |
---|---|
![]() | Great Dane, 71.9% Certainty. My Other Guesses: German Short-Haired Pointer, 7.04%; Bull Mastiff, 6.47%; Irish Wolfhound, 2.49%; Bouvier des Flandres, 2.12% |
Oooo ResNet? Very good! He actually does look more like a Great Dane in our family's opinion too! The other predictions are really good too. Thank God there wasn't a top 5 confidence for Hyena this time!
What does a Great Dane look like? What does a more standard Old English Mastiff look like? ResNet, our family can completely sympathize with your guess. And THANKS for not guessing HYENA in your top 5! Thanks ResNet, and thanks AlexNet for allowing ResNet to stand on your shoulders!
Great Dane | Old English Mastiff |
---|---|
![]() | ![]() |
Sadie is the oldest dog, still bounces around like a puppy, and is almost deaf, but she is still super sweet and kind. I expected her to be as challenging to ResNet as she was to AlexNet.
Sadie Angle One | Sadie Angle Two |
---|---|
![]() | ![]() |
ResNet results: Briard, 49.65%; Tibetan Terrier, 47.12%; Lhasa, 0.89%; Giant Schnauzer, 0.35%; Soft-Coated Wheaten Terrier, 0.24% | ResNet results: Silky Terrier, 33.56%; Lhasa, 19.0%; Briard, 16.35%; Affenpinscher, 5.45%; Cairn, 4.62% |
When you look at images for some of these other guesses, and you consider that ResNet has no sense of the dogs' actual size from these images, these are actually very good. A black Briard looks A LOT like Sadie! And these guesses are an improvement over AlexNet in my opinion.
Again, look at the second loop and the related images to see some other analyses with ResNet.
ResNet - YOU ROCK! Thanks for not thinking any of my dogs might be a Hyena this time.
I hope this helped you to get a good second step in your start with PyTorch when using predefined models. Why would we want to review PyTorch? For many reasons. I'll share more about why along this journey. Until then, I hope you can continue to follow this Git Repo on DagsHub while I grow these PyTorch exercises.
Until next time.