docs: nextra v4

2026-04-26 00:44:56 +02:00 · 2025-04-04 02:19:36 -05:00
parent 79122668c4
commit 773d836a2f
35 changed files with 1306 additions and 1925 deletions
--- a/docs/content/troubleshooting/performance.mdx
+++ b/docs/content/troubleshooting/performance.mdx
@@ -0,0 +1,18 @@
+---
+title: 'Troubleshooting: Performance'
+---
+
+# Troubleshooting: Performance
+
+## Execution providers don't seem to register
+`ort` is designed to fail gracefully when an execution provider is not available or fails to register. To debug errors raised by EPs, [set up logging for `ort`](/troubleshooting/logging).
+
+You can also detect EP regsitration failures programmatically. See [Execution providers: Fallback behavior](/perf/execution-providers#fallback-behavior) for more info.
+
+## Inference is slower than expected
+There are a few things you could try to improve performance:
+- **Run `onnxsim` on the model.** Direct graph exports from some frameworks can leave a lot of junk nodes in the graph, which could hinder performance. [`onnxsim`](https://github.com/daquexian/onnx-simplifier) is a neat tool that can be used to simplify the ONNX graph and potentially improve performance.
+- **Try different [execution providers](/perf/execution-providers)** for your hardware.
+- **Use the [transformer optimization tool](https://github.com/microsoft/onnxruntime/tree/main/onnxruntime/python/tools/transformers).** This is another neat tool that converts certain transformer-based models to far more optimized graphs.
+- **Use [I/O binding](/perf/io-binding).** This can reduce latency caused by copying the session inputs/outputs to/from devices.
+- **[Quantize your model.](https://onnxruntime.ai/docs/performance/model-optimizations/quantization.html)** You could try quantizing your model to 8-bit precision. This comes with a small accuracy loss, but can sometimes provide a large performance boost. If the accuracy loss is too high, you could also use [float16/mixed precision](https://onnxruntime.ai/docs/performance/model-optimizations/float16.html).