How can I export an encoder-decoder PyTorch model into a single ONNX file?
I converted the PyTorch model Helsinki-NLP/opus-mt-fr-en (HuggingFace), which is an encoder-decoder model for machine translation, to ONNX using this script:
import os
from optimum.onnxruntime import ORTModelForSeq2SeqLM
from transformers import AutoTokenizer, AutoConfig
hf_model_id = "Helsinki-NLP/opus-mt-fr-en"
onnx_save_directory = "./onnx_model_fr_en"
os.makedirs(onnx_save_directory, exist_ok=True)
print(f"Starting conversion for model: {hf_model_id}")
print(f"ONNX model will be saved to: {onnx_save_directory}")
print("Loading tokenizer and config...")
tokenizer = AutoTokenizer.from_pretrained(hf_model_id)
config = AutoConfig.from_pretrained(hf_model_id)
model = ORTModelForSeq2SeqLM.from_pretrained(
    hf_model_id,
    export=True,
    from_transformers=True,
    # Pass the loaded config explicitly during export
    config=config
)
print("Saving ONNX model components, tokenizer and configuration...")
model.save_pretrained(onnx_save_directory)
tokenizer.save_pretrained(onnx_save_directory)
print("-" * 30)
print(f"Successfully converted '{hf_model_id}' to ONNX.")
print(f"Files saved in: {onnx_save_directory}")
if os.path.exists(onnx_save_directory):
    print("Generated files:", os.listdir(onnx_save_directory))
else:
    print("Warning: Save directory not found after saving.")
print("-" * 30)
print("Loading ONNX model and tokenizer for testing...")
onnx_tokenizer = AutoTokenizer.from_pretrained(onnx_save_directory)
onnx_model = ORTModelForSeq2SeqLM.from_pretrained(onnx_save_directory)
french_text = "je regarde la tele"
print(f"Input (French): {french_text}")
inputs = onnx_tokenizer(french_text, return_tensors="pt") # Use PyTorch tensors
print("Generating translation using the ONNX model...")
generated_ids = onnx_model.generate(**inputs)
english_translation = onnx_tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(f"Output (English): {english_translation}")
print("--- Test complete ---")
The output folder containing the ONNX files is:
franck@server:~/tests/onnx_model_fr_en$ ls -la
total 860968
drwxr-xr-x 2 franck users 4096 Apr 16 17:29 .
drwxr-xr-x 5 franck users 4096 Apr 17 23:54 ..
-rw-r--r-- 1 franck users 1360 Apr 17 04:38 config.json
-rw-r--r-- 1 franck users 346250804 Apr 17 04:38 decoder_model.onnx
-rw-r--r-- 1 franck users 333594274 Apr 17 04:38 decoder_with_past_model.onnx
-rw-r--r-- 1 franck users 198711098 Apr 17 04:38 encoder_model.onnx
-rw-r--r-- 1 franck users 288 Apr 17 04:38 generation_config.json
-rw-r--r-- 1 franck users 802397 Apr 17 04:38 source.spm
-rw-r--r-- 1 franck users 74 Apr 17 04:38 special_tokens_map.json
-rw-r--r-- 1 franck users 778395 Apr 17 04:38 target.spm
-rw-r--r-- 1 franck users 847 Apr 17 04:38 tokenizer_config.json
-rw-r--r-- 1 franck users 1458196 Apr 17 04:38 vocab.json
How can I export an opus-mt-fr-en PyTorch model into a single ONNX file?
Having several ONNX files is an issue because:
- The PyTorch model shares a single embedding layer between the encoder and the decoder, and the export script above duplicates that layer in both encoder_model.onnx and decoder_model.onnx, which is an issue as the embedding layer is large (it represents ~40% of the PyTorch model size).
- Having both a decoder_model.onnx and a decoder_with_past_model.onnx duplicates many parameters.
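One way to check the duplication described above would be to hash every weight tensor in each file and look for byte-identical tensors that appear in more than one file. The following is only a sketch under assumptions: `find_shared_tensors`, `wasted_bytes` and `load_initializers` are hypothetical helper names, `load_initializers` assumes the `onnx` package is installed, and the demo at the bottom uses toy byte strings rather than the real weights.

```python
import hashlib
from collections import defaultdict


def find_shared_tensors(files):
    """files: {file_name: {tensor_name: raw_bytes}}.

    Returns {digest: (size_in_bytes, sorted file names)} for every tensor
    whose exact bytes appear in more than one file.
    """
    seen = defaultdict(set)
    sizes = {}
    for file_name, tensors in files.items():
        for raw in tensors.values():
            digest = hashlib.sha256(raw).hexdigest()
            seen[digest].add(file_name)
            sizes[digest] = len(raw)
    return {
        digest: (sizes[digest], sorted(names))
        for digest, names in seen.items()
        if len(names) > 1
    }


def wasted_bytes(shared):
    """Bytes that would not need to be stored again if the files shared weights."""
    return sum(size * (len(names) - 1) for size, names in shared.values())


def load_initializers(path):
    """Read the weights of one ONNX file (assumes the `onnx` package is installed)."""
    import onnx
    from onnx import numpy_helper

    model = onnx.load(path)
    return {
        init.name: numpy_helper.to_array(init).tobytes()
        for init in model.graph.initializer
    }


# Toy data standing in for the real weights (hypothetical tensor names and sizes).
toy_files = {
    "encoder_model.onnx": {"embed": b"\x01" * 400, "enc_layer": b"\x02" * 100},
    "decoder_model.onnx": {"shared_embed": b"\x01" * 400, "dec_layer": b"\x03" * 150},
    "decoder_with_past_model.onnx": {"shared_embed": b"\x01" * 400, "dec_layer": b"\x03" * 150},
}
shared = find_shared_tensors(toy_files)
print(f"{len(shared)} tensors are duplicated, wasting {wasted_bytes(shared)} bytes")
```

On the real export, the toy dictionary could be replaced with `{name: load_initializers(os.path.join(onnx_save_directory, name)) for name in (...)}` over the three .onnx files. Tensors are compared by content rather than by name because the exporter may name the same weight differently in each graph.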
The total size of the three ONNX files is:
- decoder_model.onnx: 346,250,804 bytes
- decoder_with_past_model.onnx: 333,594,274 bytes
- encoder_model.onnx: 198,711,098 bytes
Total size = 346,250,804 + 333,594,274 + 198,711,098 = 878,556,176 bytes. That's approximately 837.86 MiB (about 878.56 MB), which is almost 3 times larger than the original PyTorch model (300 MB).
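The arithmetic can be reproduced with a few lines of Python. This is a minimal sketch: the byte counts are copied from the `ls -la` listing, and the 300 MB figure for the PyTorch model is the approximate size quoted in this post, not a measured value.

```python
# Sizes in bytes, copied from the `ls -la` listing.
onnx_sizes = {
    "decoder_model.onnx": 346_250_804,
    "decoder_with_past_model.onnx": 333_594_274,
    "encoder_model.onnx": 198_711_098,
}

total_bytes = sum(onnx_sizes.values())
total_mib = total_bytes / 2**20   # binary megabytes (MiB)
total_mb = total_bytes / 10**6    # decimal megabytes (MB)

# Approximate size of the original PyTorch model, as quoted in the post.
pytorch_mb = 300
ratio = total_mb / pytorch_mb

print(f"Total: {total_bytes:,} bytes = {total_mib:.2f} MiB = {total_mb:.2f} MB")
print(f"ONNX export is about {ratio:.2f}x the PyTorch model size")
for name, size in onnx_sizes.items():
    print(f"  {name}: {size / total_bytes:.1%} of the total")
```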
Crossposts:
- https://stackoverflow.com/q/79580250/395857
- https://old.reddit.com/r/mlops/comments/1k2c6tl/how_can_i_export_an_encoderdecoder_pytorch_model/?ref=share&ref_source=link
- https://old.reddit.com/r/pytorch/comments/1k2c78e/how_can_i_export_an_encoderdecoder_pytorch_model/?ref=share&ref_source=link
- https://old.reddit.com/r/learnmachinelearning/comments/1k2c6nw/how_can_i_export_an_encoderdecoder_pytorch_model/?ref=share&ref_source=link
- https://old.reddit.com/r/huggingface/comments/1k2c8p2/how_can_i_export_an_encoderdecoder_huggingface/?ref=share&ref_source=link
- https://old.reddit.com/r/learnpython/comments/1k2c89a/how_can_i_export_an_encoderdecoder_pytorch_model/?ref=share&ref_source=link
- https://old.reddit.com/r/pythonhelp/comments/1k2c7wz/how_can_i_export_an_encoderdecoder_pytorch_model/?ref=share&ref_source=link
- https://old.reddit.com/r/MachineLearning/comments/1k2c6ii/how_can_i_export_an_encoderdecoder_pytorch_model/?ref=share&ref_source=link
- https://old.reddit.com/r/LanguageTechnology/comments/1k2c71l/how_can_i_export_an_encoderdecoder_pytorch_model/?ref=share&ref_source=link
- https://old.reddit.com/r/LocalLLaMA/comments/1k2c68a/how_can_i_export_an_encoderdecoder_pytorch_model/?ref=share&ref_source=link
- https://github.com/huggingface/transformers/issues/16006#issuecomment-2815889377
- https://redd.it/1k2gjz2