15:51

Recently I ran into an issue with Hugging Face’s Optimum library: I wanted to use the Swin2SR model, which is an image upscaling (super-resolution) model, but it didn’t work. So I opened an issue in the Optimum repository: https://github.com/huggingface/optimum/issues/2030

Issue

Feature request

The Optimum CLI has an option to export the “image-to-image” task (https://huggingface.co/docs/optimum/en/exporters/onnx/overview).
However, Optimum doesn’t support the “image-to-image” task type for ONNX Runtime. There needs to be an OrtModelForImage2Image for super-resolution, denoising, or other similar task types.

Motivation

It really is a mess to figure out how ONNX Runtime works, and I’m used to the pipeline features of Transformers and Optimum. Learning the manual I/O bindings, the output shape mismatches, and everything else needed for GPU inference was painful. I want to be able to run this through a pipeline in the same way.

When I tried:

from optimum.pipelines import pipeline
pipe = pipeline("image-to-image")

it raised a ValueError:

ValueError: Task image-to-image is not supported for the ONNX Runtime pipeline. Supported tasks are ['feature-extraction', 'fill-mask', 'image-classification', 'image-segmentation', 'question-answering', 'text-classification', 'text-generation', 'token-classification', 'zero-shot-classification', 'summarization', 'translation', 'text2text-generation', 'automatic-speech-recognition', 'image-to-text', 'audio-classification']

However, exporting to ONNX works:

optimum-cli export onnx -m caidas/swin2SR-realworld-sr-x4-64-bsrgan-psnr swin2sr-4x-onnx

So I needed to integrate an image-to-image pipeline into the library.
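To make the gap concrete, here is a minimal sketch of a pre-flight check against the task list from the error message above. The list is only a snapshot copied from that message, and the helper name `is_ort_pipeline_task` is hypothetical, not part of Optimum.

```python
# Snapshot of the supported tasks, copied from the ValueError message above.
# This is an assumption about one installed version, not a live query of Optimum.
ORT_PIPELINE_TASKS = [
    "feature-extraction", "fill-mask", "image-classification",
    "image-segmentation", "question-answering", "text-classification",
    "text-generation", "token-classification", "zero-shot-classification",
    "summarization", "translation", "text2text-generation",
    "automatic-speech-recognition", "image-to-text", "audio-classification",
]


def is_ort_pipeline_task(task: str) -> bool:
    """Hypothetical helper: True if `task` is in the snapshot list above."""
    return task in ORT_PIPELINE_TASKS


print(is_ort_pipeline_task("image-to-image"))        # False
print(is_ort_pipeline_task("image-classification"))  # True
```

So the export path knows about “image-to-image”, while the pipeline's task list in that snapshot does not.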

My Contribution

I created a PR that adds an image-to-image pipeline to the Optimum library.

Key Changes:

Overview

This pull request introduces support for image-to-image tasks using ORTModelForImageToImage within the ORT pipeline of the Optimum library. This enhancement allows transformer-based models, such as Swin2SR, to perform image-to-image tasks, extending the library’s capabilities beyond just diffusion models.

Key Changes

New Pipeline Support

  • Implemented the image-to-image task in the optimum pipeline.
  • Integrated ORTModelForImageToImage for this task.

Example Usage

from PIL import Image
from transformers import AutoImageProcessor

from optimum.onnxruntime import ORTModelForImageToImage
from optimum.pipelines import pipeline

# Load the exported ONNX model; IO binding reduces host/device copies on GPU.
model = ORTModelForImageToImage.from_pretrained("swin2sr-2x-onnx", use_io_binding=True)
# Reuse the image processor of the original checkpoint for pre-processing.
processor = AutoImageProcessor.from_pretrained("caidas/swin2SR-classical-sr-x2-64")
upscaler = pipeline("image-to-image", model=model, feature_extractor=processor, device="cuda")

image = Image.open("image.png")
output = upscaler(image)  # run super-resolution on the input image

Documentation

  • Updated the documentation to include the new image-to-image task.
  • Provided examples and usage instructions for the new functionality.

Testing

  • Added tests to ensure the new image-to-image task works as expected.
  • Verified that the tests cover various aspects of the new functionality.

Important File Changes

  1. optimum/pipelines/image_to_image.py:

    • Introduction of the new image-to-image task pipeline.
    • Integration with ORTModelForImageToImage.
  2. tests/test_image_to_image.py:

    • Added tests for the new image-to-image task.
    • Ensured comprehensive test coverage.
  3. docs/image_to_image.md:

    • Updated documentation to reflect the new task.
    • Included detailed usage examples and instructions.

Note

I could not retrieve the exact file changes, so the file list above may not match the PR exactly. Please refer to the pull request for the detailed file modifications and review the changes directly.


Links : TODO

Tags :

Date : 16th March, Sunday, 2025, (Wikilinks: 16th March, March 25, March, 2025. Sunday)

Category : Others