Post History
#7: Post edited
I converted the PyTorch model [`Helsinki-NLP/opus-mt-fr-en`](https://huggingface.co/Helsinki-NLP/opus-mt-fr-en) (Hugging Face), an encoder-decoder model for machine translation, to ONNX using this script:
```python
import os

from optimum.onnxruntime import ORTModelForSeq2SeqLM
from transformers import AutoTokenizer, AutoConfig

hf_model_id = "Helsinki-NLP/opus-mt-fr-en"
onnx_save_directory = "./onnx_model_fr_en"
os.makedirs(onnx_save_directory, exist_ok=True)

print(f"Starting conversion for model: {hf_model_id}")
print(f"ONNX model will be saved to: {onnx_save_directory}")

print("Loading tokenizer and config...")
tokenizer = AutoTokenizer.from_pretrained(hf_model_id)
config = AutoConfig.from_pretrained(hf_model_id)

model = ORTModelForSeq2SeqLM.from_pretrained(
    hf_model_id,
    export=True,
    from_transformers=True,  # deprecated alias of export=True in newer optimum releases
    # Pass the loaded config explicitly during export
    config=config,
)

print("Saving ONNX model components, tokenizer and configuration...")
model.save_pretrained(onnx_save_directory)
tokenizer.save_pretrained(onnx_save_directory)

print("-" * 30)
print(f"Successfully converted '{hf_model_id}' to ONNX.")
print(f"Files saved in: {onnx_save_directory}")
if os.path.exists(onnx_save_directory):
    print("Generated files:", os.listdir(onnx_save_directory))
else:
    print("Warning: Save directory not found after saving.")
print("-" * 30)

print("Loading ONNX model and tokenizer for testing...")
onnx_tokenizer = AutoTokenizer.from_pretrained(onnx_save_directory)
onnx_model = ORTModelForSeq2SeqLM.from_pretrained(onnx_save_directory)

french_text = "je regarde la tele"
print(f"Input (French): {french_text}")
inputs = onnx_tokenizer(french_text, return_tensors="pt")  # use PyTorch tensors

print("Generating translation using the ONNX model...")
generated_ids = onnx_model.generate(**inputs)
english_translation = onnx_tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(f"Output (English): {english_translation}")
print("--- Test complete ---")
```
The output folder containing the ONNX files is:

```
franck@server:~/tests/onnx_model_fr_en$ ls -la
total 860968
drwxr-xr-x 2 franck users       4096 Apr 16 17:29 .
drwxr-xr-x 5 franck users       4096 Apr 17 23:54 ..
-rw-r--r-- 1 franck users       1360 Apr 17 04:38 config.json
-rw-r--r-- 1 franck users  346250804 Apr 17 04:38 decoder_model.onnx
-rw-r--r-- 1 franck users  333594274 Apr 17 04:38 decoder_with_past_model.onnx
-rw-r--r-- 1 franck users  198711098 Apr 17 04:38 encoder_model.onnx
-rw-r--r-- 1 franck users        288 Apr 17 04:38 generation_config.json
-rw-r--r-- 1 franck users     802397 Apr 17 04:38 source.spm
-rw-r--r-- 1 franck users         74 Apr 17 04:38 special_tokens_map.json
-rw-r--r-- 1 franck users     778395 Apr 17 04:38 target.spm
-rw-r--r-- 1 franck users        847 Apr 17 04:38 tokenizer_config.json
-rw-r--r-- 1 franck users    1458196 Apr 17 04:38 vocab.json
```
How can I export an opus-mt-fr-en PyTorch model into a single ONNX file?

Having several ONNX files is an issue because:

1. The PyTorch model shares its embedding layer between the encoder and the decoder, but the export script above duplicates that layer into both `encoder_model.onnx` and `decoder_model.onnx`. That duplication is costly because the embedding layer is large, roughly 40% of the PyTorch model's size (see the sketch after this list).
2. Having both a `decoder_model.onnx` and a `decoder_with_past_model.onnx` duplicates many parameters, since both graphs contain the same decoder weights.
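
To make the duplication concrete, here is a minimal sketch (assuming only the `onnx` package and the output directory produced by the script above) that totals the file sizes and prints the largest initializers in each graph; the shared embedding matrix shows up once per file:

```python
# Minimal verification sketch: assumes the `onnx` package and the
# ./onnx_model_fr_en directory produced by the export script above.
import os

import onnx
from onnx import numpy_helper

onnx_dir = "./onnx_model_fr_en"
parts = ["encoder_model.onnx", "decoder_model.onnx", "decoder_with_past_model.onnx"]

total_bytes = 0
for filename in parts:
    path = os.path.join(onnx_dir, filename)
    total_bytes += os.path.getsize(path)
    graph = onnx.load(path).graph
    # Rank the stored weights (initializers) by size; the shared embedding
    # matrix should appear near the top for every file.
    largest = sorted(
        ((numpy_helper.to_array(init).nbytes, init.name) for init in graph.initializer),
        reverse=True,
    )[:3]
    print(filename, "largest initializers:", largest)

print(f"Total: {total_bytes:,} bytes")  # 878,556,176 bytes for the listing above
```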
The total size of the three ONNX files is:

* `decoder_model.onnx`: **346,250,804 bytes**
* `decoder_with_past_model.onnx`: **333,594,274 bytes**
* `encoder_model.onnx`: **198,711,098 bytes**

**Total size** = 346,250,804 + 333,594,274 + 198,711,098 = **878,556,176 bytes**. That's approximately **838 MiB**, which is almost 3 times the size of the original PyTorch model (~300 MB).
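
As a partial mitigation, recent optimum releases can fold the two decoder graphs into a single `decoder_model_merged.onnx`, which removes the duplication in point 2 (though not the shared embedding in point 1). A minimal sketch, assuming your installed optimum version ships the `merge_decoders` graph utility:

```python
# Hedged sketch: fold decoder_model.onnx and decoder_with_past_model.onnx
# into one graph. Assumes optimum.onnx.merge_decoders exists in the
# installed optimum version; check before relying on it.
from optimum.onnx import merge_decoders

merge_decoders(
    "./onnx_model_fr_en/decoder_model.onnx",
    "./onnx_model_fr_en/decoder_with_past_model.onnx",
    save_path="./onnx_model_fr_en/decoder_model_merged.onnx",
)
```

This still leaves `encoder_model.onnx` as a separate file, which is why I'm asking how to get everything into one ONNX file.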
---

Crossposts:

- https://stackoverflow.com/q/79580250/395857
- https://old.reddit.com/r/mlops/comments/1k2c6tl/how_can_i_export_an_encoderdecoder_pytorch_model/?ref=share&ref_source=link
- https://old.reddit.com/r/pytorch/comments/1k2c78e/how_can_i_export_an_encoderdecoder_pytorch_model/?ref=share&ref_source=link
- https://old.reddit.com/r/learnmachinelearning/comments/1k2c6nw/how_can_i_export_an_encoderdecoder_pytorch_model/?ref=share&ref_source=link
- https://old.reddit.com/r/huggingface/comments/1k2c8p2/how_can_i_export_an_encoderdecoder_huggingface/?ref=share&ref_source=link
- https://old.reddit.com/r/learnpython/comments/1k2c89a/how_can_i_export_an_encoderdecoder_pytorch_model/?ref=share&ref_source=link
- https://old.reddit.com/r/pythonhelp/comments/1k2c7wz/how_can_i_export_an_encoderdecoder_pytorch_model/?ref=share&ref_source=link
- https://old.reddit.com/r/MachineLearning/comments/1k2c6ii/how_can_i_export_an_encoderdecoder_pytorch_model/?ref=share&ref_source=link
- https://old.reddit.com/r/LanguageTechnology/comments/1k2c71l/how_can_i_export_an_encoderdecoder_pytorch_model/?ref=share&ref_source=link
- https://old.reddit.com/r/LocalLLaMA/comments/1k2c68a/how_can_i_export_an_encoderdecoder_pytorch_model/?ref=share&ref_source=link
- https://github.com/huggingface/transformers/issues/16006#issuecomment-2815889377
- https://redd.it/1k2gjz2