
PyTorch machine learning models on Android




Posted by Paul Ruiz – Senior Developer Relations Engineer

Earlier this year we launched Google AI Edge, a suite of tools with easy access to ready-to-use ML tasks, frameworks that enable you to build ML pipelines, and support for running popular LLMs and custom models – all on-device. For AI on Android Spotlight Week, the Google team is highlighting various ways that Android developers can use machine learning to help improve their applications.

In this post, we'll dive into Google AI Edge Torch, which enables you to convert PyTorch models to run locally on Android and other platforms, using the Google AI Edge LiteRT (formerly TensorFlow Lite) and MediaPipe Tasks libraries. For insights on other powerful tools, be sure to explore the rest of the AI on Android Spotlight Week content.

To make getting started with Google AI Edge easier, we've provided samples available on GitHub as an executable codelab. They demonstrate how to convert the MobileViT model for image classification (compatible with MediaPipe Tasks) and the DIS model for segmentation (compatible with LiteRT).


DIS model output

This blog guides you through how to use the MobileViT model with MediaPipe Tasks. Keep in mind that the LiteRT runtime provides similar capabilities, enabling you to build custom pipelines and features.

Convert MobileViT model for image classification compatible with MediaPipe Tasks

Once you've installed the required dependencies and utilities for your app, the first step is to retrieve the PyTorch model you wish to convert, along with any other MobileViT components you might need (such as an image processor for testing).

from transformers import MobileViTImageProcessor, MobileViTForImageClassification

hf_model_path = "apple/mobilevit-small"
processor = MobileViTImageProcessor.from_pretrained(hf_model_path)
pt_model = MobileViTForImageClassification.from_pretrained(hf_model_path)

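Optionally, you can sanity-check the original PyTorch model before converting it by classifying a local test image with the Hugging Face processor and model. A minimal sketch; the image path below is a placeholder:

import torch
from PIL import Image

# 'test_image.jpg' is a placeholder – point this at any local image.
image = Image.open('test_image.jpg')
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    logits = pt_model(**inputs).logits

predicted_idx = logits.argmax(-1).item()
print(pt_model.config.id2label[predicted_idx])
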
Since the end result of this tutorial should work with MediaPipe Tasks, take an extra step to match the expected input and output shapes for image classification to what is used by the MediaPipe image classification Task.

import torch
from torch import nn

class HF2MP_ImageClassificationModelWrapper(nn.Module):

  def __init__(self, hf_image_classification_model, hf_processor):
    super().__init__()
    self.model = hf_image_classification_model
    if hf_processor.do_rescale:
      self.rescale_factor = hf_processor.rescale_factor
    else:
      self.rescale_factor = 1.0

  def forward(self, image: torch.Tensor):
    # BHWC -> BCHW.
    image = image.permute(0, 3, 1, 2)
    # RGB -> BGR.
    image = image.flip(dims=(1,))
    # Scale [0, 255] -> [0, 1].
    image = image * self.rescale_factor
    logits = self.model(pixel_values=image).logits  # [B, 1000] float32.
    # Softmax is required for the MediaPipe classification Task.
    logits = torch.nn.functional.softmax(logits, dim=-1)

    return logits

hf_model_path = "apple/mobilevit-small"
hf_mobile_vit_processor = MobileViTImageProcessor.from_pretrained(hf_model_path)
hf_mobile_vit_model = MobileViTForImageClassification.from_pretrained(hf_model_path)
wrapped_pt_model = HF2MP_ImageClassificationModelWrapper(
    hf_mobile_vit_model, hf_mobile_vit_processor).eval()

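To verify that the wrapper behaves like a MediaPipe-style classifier, you can run it on a random BHWC tensor and check that the output has the expected shape and that the scores sum to one. A minimal sketch:

import torch

# Random input in BHWC layout with values in [0, 255], matching what MediaPipe supplies.
dummy_input = torch.rand(1, 256, 256, 3) * 255.0

with torch.no_grad():
    probs = wrapped_pt_model(dummy_input)

print(probs.shape)         # Expected: torch.Size([1, 1000])
print(float(probs.sum()))  # Expected: ~1.0, since the wrapper applies softmax.
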
Whether you plan to use the converted MobileViT model with MediaPipe Tasks or LiteRT, the next step is to convert the model to the .tflite format.

First, match the input shape. In this example, the input shape is 1, 256, 256, 3 for a 256×256 pixel three-channel RGB image.

Then, call AI Edge Torch's convert function to complete the conversion process.

import ai_edge_torch

sample_args = (torch.rand((1, 256, 256, 3)),)
edge_model = ai_edge_torch.convert(wrapped_pt_model, sample_args)

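As an optional check, you can compare the converted model's output against the wrapped PyTorch model on the same sample input. The edge model returned by ai_edge_torch is directly callable; the tolerances below are illustrative, not prescriptive:

import numpy as np

edge_output = edge_model(*sample_args)
with torch.no_grad():
    pt_output = wrapped_pt_model(*sample_args)

if np.allclose(edge_output, pt_output.numpy(), atol=1e-5, rtol=1e-5):
    print("PyTorch and converted model outputs match.")
else:
    print("Outputs diverge – inspect the conversion before deploying.")
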
After converting the model, you can further refine it by incorporating metadata for the image classification labels. MediaPipe Tasks will utilize this metadata to display or return pertinent information after classification.

from mediapipe.tasks.python.metadata.metadata_writers import image_classifier
from mediapipe.tasks.python.metadata.metadata_writers import metadata_writer
from mediapipe.tasks.python.vision.image_classifier import ImageClassifier
from pathlib import Path

flatbuffer_file = Path('hf_mobile_vit_mp_image_classification_raw.tflite')
edge_model.export(flatbuffer_file)
tflite_model_buffer = flatbuffer_file.read_bytes()

# Extract the image classification labels from the HF model for later integration into the TFLite model.
labels = list(hf_mobile_vit_model.config.id2label.values())

writer = image_classifier.MetadataWriter.create(
    tflite_model_buffer,
    input_norm_mean=[0.0],  # Normalization is not needed for this model.
    input_norm_std=[1.0],
    labels=metadata_writer.Labels().add(labels),
)
tflite_model_buffer, _ = writer.populate()

With all of that completed, it's time to integrate your model into an Android app. If you're following the official Colab notebook, this involves saving the model locally. For an example of image classification with MediaPipe Tasks, explore the GitHub repository. You can find more information in the official Google AI Edge documentation.


Newly converted ViT model with MediaPipe Tasks
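
If you're saving the model yourself rather than using the Colab, the final Python-side step is simply writing the metadata-populated buffer to disk; you can also smoke-test the resulting .tflite file with the MediaPipe Tasks Python API before bundling it into your Android app. A minimal sketch – the output file name and test image path are placeholders:

from pathlib import Path

import mediapipe as mp
from mediapipe.tasks import python as mp_tasks
from mediapipe.tasks.python import vision

# Write the flatbuffer (now containing label metadata) to disk. Placeholder file name.
model_path = Path('hf_mobile_vit_mp_image_classification.tflite')
model_path.write_bytes(tflite_model_buffer)

# Smoke-test with the MediaPipe Tasks image classifier. 'test_image.jpg' is a placeholder.
options = vision.ImageClassifierOptions(
    base_options=mp_tasks.BaseOptions(model_asset_path=str(model_path)),
    max_results=3,
)
with vision.ImageClassifier.create_from_options(options) as classifier:
    mp_image = mp.Image.create_from_file('test_image.jpg')
    result = classifier.classify(mp_image)
    for category in result.classifications[0].categories:
        print(category.category_name, category.score)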

After understanding how to convert a simple image classification model, you can use the same techniques to adapt various PyTorch models for Google AI Edge LiteRT or MediaPipe Tasks tooling on Android.

For further model optimization, consider techniques like quantizing during conversion. Check out the GitHub example to learn more about how to convert a PyTorch image segmentation model to LiteRT and quantize it.
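As a rough, version-dependent sketch of what quantization during conversion can look like: AI Edge Torch supports PyTorch's PT2E quantization flow, where a quantizer annotates and calibrates the exported graph before conversion. The module paths and function names below follow the pattern in the AI Edge Torch documentation, but they change between releases of ai_edge_torch and PyTorch, so treat this as an assumption to check against the current docs rather than a drop-in recipe:

import torch
import ai_edge_torch
from ai_edge_torch.quantize import pt2e_quantizer, quant_config
from torch.ao.quantization import quantize_pt2e

# Configure a PT2E quantizer with a symmetric, static quantization scheme.
quantizer = pt2e_quantizer.PT2EQuantizer().set_global(
    pt2e_quantizer.get_symmetric_quantization_config()
)

# Export the wrapped model, insert observers, calibrate, and fold in fake-quant ops.
# (The export entry point differs across PyTorch versions.)
exported = torch.export.export_for_training(wrapped_pt_model, sample_args).module()
prepared = quantize_pt2e.prepare_pt2e(exported, quantizer)
prepared(*sample_args)  # Run representative data through the model to calibrate observers.
quantized = quantize_pt2e.convert_pt2e(prepared, fold_quantize=False)

# Convert to a quantized .tflite flatbuffer, passing the quantizer via quant_config.
quantized_edge_model = ai_edge_torch.convert(
    quantized,
    sample_args,
    quant_config=quant_config.QuantConfig(pt2e_quantizer=quantizer),
)
quantized_edge_model.export('hf_mobile_vit_quantized.tflite')  # Placeholder file name.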

What’s Next

To keep up to date on Google AI Edge developments, look for announcements on the Google for Developers YouTube channel and blog.

We look forward to hearing about how you're using these features in your projects. Use the #AndroidAI hashtag to share your feedback or what you've built on social media, and check out other content in AI on Android Spotlight Week!
