Mayo_Stroke_Blood_Clot_Origin

The goal of this project is to classify the blood clot origins in ischemic stroke. Using whole slide digital pathology images, you'll build a model that differentiates between the two major acute ischemic stroke (AIS) etiology subtypes: cardiac and large artery atherosclerosis.

The model is trained on Azure, reading data directly from this repo via the DagsHub streaming client.

This repo contains the data and code for the Kaggle competition https://www.kaggle.com/competitions/mayo-clinic-strip-ai.

Tools Used

  • DagsHub Streaming Client (see the sketch below)
  • Azure ML SDK
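
A minimal sketch of how a training script can read repo files through the streaming client (it assumes the repo has already been cloned and that the script runs from the repo root; the image filename is illustrative):

    # Sketch: stream repo data on demand instead of downloading the whole dataset.
    import pandas as pd
    from PIL import Image
    from dagshub.streaming import install_hooks

    install_hooks()  # patch file access so repo files are fetched on demand

    # After install_hooks(), ordinary reads pull the files from DagsHub storage.
    labels = pd.read_csv("data/raw/train.csv")         # image id -> etiology label
    img = Image.open("data/raw/train/some_image.png")  # illustrative filename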

About the Data

The original competition data is in TIFF format and totals 356 GB. For this analysis, we downscale the images and store them in PNG format. The code for this downscaling can be found in the data/raw/train/ folder of this repo, and the downscaled images can be found here. To map each image to its class, the competition provides a CSV file with metadata about the data, which can be found at data/raw/train.csv.
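
The sketch below shows the same downscaling idea with pyvips, which streams tiles so the full-size TIFFs never need to fit in memory; the library choice, target width, and file names are assumptions, not necessarily what the repo's script uses:

    # Sketch: shrink one whole-slide TIFF to a small PNG.
    import pyvips

    def downscale(tiff_path: str, png_path: str, width: int = 1024) -> None:
        thumb = pyvips.Image.thumbnail(tiff_path, width)  # keeps aspect ratio
        thumb.write_to_file(png_path)

    downscale("data/raw/train/some_image.tif", "some_image.png")  # illustrative paths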

In addition, this repo contains pretrained EfficientNet models, which can be found in the data/raw/Pretrained_Efficient_Models/ folder.
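
These weights can be loaded offline on the Azure compute, for example with the efficientnet_pytorch package; the package choice and the checkpoint filename below are assumptions, so check the folder for the actual files:

    # Sketch: build an EfficientNet without downloading weights, then load a local checkpoint.
    import torch
    from efficientnet_pytorch import EfficientNet

    model = EfficientNet.from_name("efficientnet-b0")  # architecture only, no download
    state_dict = torch.load(
        "data/raw/Pretrained_Efficient_Models/efficientnet-b0.pth",  # hypothetical filename
        map_location="cpu",
    )
    model.load_state_dict(state_dict)
    model.eval()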

Setting Up the Environment

  1. Create a virtual environment with Python 3.9.13.
conda create -n azure_ml python=3.9.13
  2. Install the required libraries in the environment:
conda activate azure_ml  # activate the conda environment
pip install "azureml-sdk[notebooks,tensorboard,interpret]"

  3. Clone the repo.

  4. The code can be found in notebooks/01-aiswarya-ramachandran-connecting-to-azure-ml-sdk.ipynb. This notebook contains the complete code and internally creates the scripts needed to train the model on Azure (one way to do this is sketched below).
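
For reference, a common way for a notebook to write out such training scripts is Jupyter's %%writefile cell magic; the folder and file names below are placeholders, not necessarily the ones the notebook uses:

    %%writefile training_scripts/train.py
    # Placeholder training entry point; the real script created by the notebook
    # clones the repo, sets up the DagsHub streaming client, and trains the model.
    print("training...")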

Steps Involved

  1. Install the DagsHub client.
  2. Create a Workspace on Azure.
  3. Create a GPU compute target on which to run the training.
  4. Create the training scripts and put them all in one folder so they can be pushed to Azure. The scripts read the data through the DagsHub streaming client, since the data itself is not pushed to Azure; to make this work, they must include a step that clones the git repo on the Azure compute.
  5. Create an Environment listing the conda and pip dependencies that need to be installed on Azure.
  6. Create an Experiment and a run configuration that bundles everything needed to run the training on Azure (a sketch of these steps follows this list).
  7. Monitor the metrics.
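
A condensed sketch of steps 2 to 7 using the azureml-sdk (v1) Python API; the workspace lookup, cluster name, VM size, package list, script folder, and experiment name are placeholders rather than the exact values used in the notebook:

    # Sketch: provision compute, define an environment, and submit a training run.
    from azureml.core import Workspace, Experiment, Environment, ScriptRunConfig
    from azureml.core.compute import AmlCompute, ComputeTarget
    from azureml.core.conda_dependencies import CondaDependencies

    # Step 2: connect to a workspace (config.json downloaded from the Azure portal).
    ws = Workspace.from_config()

    # Step 3: provision a GPU compute target.
    gpu_config = AmlCompute.provisioning_configuration(vm_size="Standard_NC6", max_nodes=1)
    gpu_cluster = ComputeTarget.create(ws, "gpu-cluster", gpu_config)
    gpu_cluster.wait_for_completion(show_output=True)

    # Step 5: list the conda/pip dependencies to install on Azure.
    env = Environment("strip-ai-env")
    env.python.conda_dependencies = CondaDependencies.create(
        pip_packages=["torch", "torchvision", "efficientnet_pytorch", "dagshub", "pandas"]
    )

    # Step 6: bundle scripts, compute, and environment into a run config and submit.
    # train.py is expected to clone the repo and call the streaming client (step 4)
    # before reading any data.
    src = ScriptRunConfig(
        source_directory="training_scripts",
        script="train.py",
        compute_target=gpu_cluster,
        environment=env,
    )
    run = Experiment(workspace=ws, name="mayo-strip-ai").submit(src)

    # Step 7: monitor metrics while the run executes.
    run.wait_for_completion(show_output=True)
    print(run.get_metrics())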

An article describing the entire process can be found at https://hackernoon.com/image-classification-on-azure-with-dagshub-direct-data-access
