15:51 pm
Recently I faced an Issue in Optimum Library where I wanted to use the swin2SR model which is basically an Image Upscaler model, but didn’t work. So I created an Issue in HuggingFace’s Optimum Library. i.e (https://github.com/huggingface/optimum/issues/2030)
Issue
Feature request
Optimum cli has an option to export “image-to-image” task. (https://huggingface.co/docs/optimum/en/exporters/onnx/overview)
However optimum doesn’t support task type “image-to-image” for ONNX runtime. There needs to be an OrtModelForImage2Image for SuperResolution or Denoising or other similar tasktypes.
Motivation
It really is a mess to figure out how onnxruntime works. And I’m really used to the transformers and optimums pipeline features. But it really was a mess to learn the manual io bindings output_shape mismatch and all stuff for gpu inference. I want to pipeline it in a similar way.
When I tried
from optimum.pipelines import pipeline
pipe = pipeline("image-to-image")
It gives ValueError
ValueError: Task image-to-image is not supported for the ONNX Runtime pipeline. Supported tasks are ['feature-extraction', 'fill-mask', 'image-classification', 'image-segmentation', 'question-answering', 'text-classification', 'text-generation', 'token-classification', 'zero-shot-classification', 'summarization', 'translation', 'text2text-generation', 'automatic-speech-recognition', 'image-to-text', 'audio-classification']
However exporting to onnx works
optimum-cli export onnx -m caidas/swin2SR-realworld-sr-x4-64-bsrgan-psnr swin2sr-4x-onnx
So, I needed to integrate image-to-image pipeline to this library.
My Contribution
I created a PR that add image-to-image pipeline to optimum library.
Key Changes:
- New task image-to-image added
- Added ORTModelForImage2Image
- Updated docs and Added example usage
- Added Tests
Overview
This pull request introduces support for image-to-image tasks using the ORTModelForImage2Image
within the ORT pipeline of the optimum library. This enhancement allows transformer-based models, such as Swin, to perform image-to-image tasks, extending the library’s capabilities beyond just diffusion models.
Key Changes
New Pipeline Support
- Implemented the image-to-image task in the optimum pipeline.
- Integrated ORTModelForImage2Image for this task.
Example Usage
from transformers import AutoImageProcessor
from optimum.pipelines import pipeline
from optimum.onnxruntime import ORTModelForImageToImage
from PIL import Image
model = ORTModelForImageToImage.from_pretrained("swin2sr-2x-onnx", use_io_binding=True)
processor = AutoImageProcessor.from_pretrained("caidas/swin2SR-classical-sr-x2-64")
onnx = pipeline("image-to-image", model=model, feature_extractor=processor, device="cuda")
image_path = "image.png"
image = Image.open(image_path)
output = onnx(image)
Documentation
- Updated the documentation to include the new image-to-image task.
- Provided examples and usage instructions for the new functionality.
Testing
- Added tests to ensure the new image-to-image task works as expected.
- Verified that the tests cover various aspects of the new functionality.
Important File Changes
-
optimum/pipelines/image_to_image.py
:- Introduction of the new image-to-image task pipeline.
- Integration with
ORTModelForImage2Image
.
-
tests/test_image_to_image.py
:- Added tests for the new image-to-image task.
- Ensured comprehensive test coverage.
-
docs/image_to_image.md
:- Updated documentation to reflect the new task.
- Included detailed usage examples and instructions.
Note
It appears that I encountered issues retrieving the exact file changes. Please refer to the pull request here for detailed file modifications and review the changes directly.
Links : TODO
Tags :
Date : 16th March, Sunday, 2025, (Wikilinks: 16th March, March 25, March, 2025. Sunday)
Category : Others