mirror of
https://github.com/pykeio/ort
synced 2026-04-25 16:34:55 +02:00
frankly, working on documentation has made me tired of typing out `execution_providers` and `ExecutionProvider` all the time
196 lines
8.8 KiB
Plaintext
196 lines
8.8 KiB
Plaintext
---
|
|
title: Migrating from v1.x to v2
|
|
description: Here's what you need to know to update your application to use the latest & greatest version of ort.
|
|
---
|
|
|
|
# Migrating from v1.x to v2
|
|
|
|
## `Environment` no more
|
|
The `Environment` struct has been removed. Only one `Environment` was allowed per process, so it didn't really make sense to have an environment as a struct.
|
|
|
|
To configure an `Environment`, you instead use the `ort::init` function, which returns the same `EnvironmentBuilder` as v1.x. Use `commit()` to then commit the environment.
|
|
|
|
```rust
|
|
ort::init()
|
|
.with_execution_providers([ep::CUDA::default().build()])
|
|
.commit();
|
|
```
|
|
|
|
`commit()` must be called before any sessions are created to take effect. Otherwise, a default environment will be created. The global environment can be updated afterward by calling `commit()` on another `EnvironmentBuilder`, however you'll need to recreate sessions after comitting the new environment in order for them to use it.
|
|
|
|
## Value specialization
|
|
The `Value` struct has been refactored into multiple strongly-typed structs: `Tensor<T>`, `Map<K, V>`, and `Sequence<T>`, and their type-erased variants: `DynTensor`, `DynMap`, and `DynSequence`.
|
|
|
|
Values returned by session inference are now `DynValue`s, which behave exactly the same as `Value` in previous versions.
|
|
|
|
Tensors created from Rust, like via the new `Tensor::new` function, can be directly and infallibly extracted into its underlying data via `extract_array` (no `try_`):
|
|
```rust
|
|
let allocator = Allocator::new(&session, MemoryInfo::new(AllocationDevice::CUDA_PINNED, 0, AllocatorType::Device, MemoryType::CPUInput)?)?;
|
|
let tensor = Tensor::<f32>::new(&allocator, [1, 128, 128, 3])?;
|
|
|
|
let array = tensor.extract_array();
|
|
// no need to specify type or handle errors - Tensor<f32> can only extract into an f32 ArrayView
|
|
```
|
|
|
|
You can still extract tensors, maps, or sequence values normally from a `DynValue` using `try_extract_*`:
|
|
```rust
|
|
let generated_tokens: ArrayViewD<f32> = outputs["output1"].try_extract_array()?;
|
|
```
|
|
|
|
`DynValue` can be `downcast()`ed to the more specialized types, like `DynMap` or `Tensor<T>`:
|
|
```rust
|
|
let tensor: Tensor<f32> = value.downcast()?;
|
|
let map: DynMap = value.downcast()?;
|
|
```
|
|
|
|
Similarly, a strongly-typed value like `Tensor<T>` can be upcast back into a `DynValue` or `DynTensor`.
|
|
```rust
|
|
let dyn_tensor: DynTensor = tensor.upcast();
|
|
let dyn_value: DynValue = tensor.into_dyn();
|
|
```
|
|
|
|
### Tensor extraction directly returns an `ArrayView`
|
|
The new `extract_array` and `try_extract_array` functions return an `ndarray::ArrayView` directly, instead of putting it behind the old `ort::value::Tensor<T>` type (not to be confused with the new specialized value type). This means you don't have to `.view()` on the result:
|
|
```diff
|
|
-let generated_tokens: Tensor<f32> = outputs["output1"].try_extract()?;
|
|
-let generated_tokens = generated_tokens.view();
|
|
+let generated_tokens: ArrayViewD<f32> = outputs["output1"].try_extract_array()?;
|
|
```
|
|
|
|
### Full support for sequence & map values
|
|
You can now construct and extract `Sequence`/`Map` values.
|
|
|
|
### Value views
|
|
You can now obtain a view of any `Value` via the new `view()` and `view_mut()` functions, which operate similar to `ndarray`'s own view system. These views can also now be passed into session inputs.
|
|
|
|
### Mutable tensor extraction
|
|
You can extract a mutable `ArrayViewMut` or `&mut [T]` from a mutable reference to a tensor.
|
|
```rust
|
|
let (raw_shape, raw_data) = tensor.extract_tensor_mut();
|
|
```
|
|
|
|
### Device-allocated tensors
|
|
You can now create a tensor on device memory with `Tensor::new` & an allocator:
|
|
|
|
```rust
|
|
let allocator = Allocator::new(&session, MemoryInfo::new(AllocationDevice::CUDA_PINNED, 0, AllocatorType::Device, MemoryType::CPUInput)?)?;
|
|
let tensor = Tensor::<f32>::new(&allocator, [1, 128, 128, 3])?;
|
|
```
|
|
|
|
The data will be allocated by the device specified by the allocator. You can then use the new mutable tensor extraction to modify the tensor's data.
|
|
|
|
## Session creation
|
|
`SessionBuilder::new(&environment)` has been soft-replaced with `Session::builder()`:
|
|
|
|
```diff
|
|
-// v1.x
|
|
-let session = SessionBuilder::new(&environment)?.with_model_from_file("model.onnx")?;
|
|
+// v2
|
|
+let session = Session::builder()?.commit_from_file("model.onnx")?;
|
|
```
|
|
|
|
### `SessionBuilder::with_model_*` -> `SessionBuilder::commit_*`
|
|
The final `SessionBuilder` methods have been renamed for clarity.
|
|
|
|
- `SessionBuilder::with_model_from_file` -> `SessionBuilder::commit_from_file`
|
|
- `SessionBuilder::with_model_from_memory` -> `SessionBuilder::commit_from_memory`
|
|
- `SessionBuilder::with_model_from_memory_directly` -> `SessionBuilder::commit_from_memory_directly`
|
|
- `SessionBuilder::with_model_downloaded` -> `SessionBuilder::commit_from_url`
|
|
|
|
## Session inputs
|
|
|
|
### Tensor creation
|
|
You can now create input tensors from `Array`s and `ArrayView`s. See the [tensor value documentation](/fundamentals/value#creating-values) for more information.
|
|
|
|
### `ort::inputs!` macro
|
|
|
|
v2.0 makes the transition to the new input/output system easier by providing an `inputs!` macro. This new macro allows you to specify inputs either by position as they appear in the graph (like previous versions), or by name.
|
|
|
|
The `ort::inputs!` macro will painlessly convert compatible data types (see above) into the new inputs system.
|
|
|
|
```diff
|
|
-// v1.x
|
|
-let chunk_embeddings = text_encoder.run(&[CowArray::from(text_input_chunk.into_dyn())])?;
|
|
+// v2
|
|
+let chunk_embeddings = text_encoder.run(ort::inputs![text_input_chunk])?;
|
|
```
|
|
|
|
As mentioned, you can now also specify inputs by name using a map-like syntax. This is especially useful for graphs with optional inputs.
|
|
```rust
|
|
let noise_pred = unet.run(ort::inputs![
|
|
"latents" => &latents,
|
|
"timestep" => Tensor::from_array(([1], vec![t]))?,
|
|
"encoder_hidden_states" => text_embeddings.view()
|
|
])?;
|
|
```
|
|
|
|
### Tensor creation no longer requires the session's allocator
|
|
In previous versions, `Value::from_array` took an allocator parameter. The allocator was only used because the string data contained in string tensors had to be cloned into ONNX Runtime-managed memory. However, 99% of users only ever use primitive tensors, so the extra parameter served little purpose. The new `Tensor::from_array` function now takes only an array, and the logic for converting string arrays has been moved to a new function, `DynTensor::from_string_array`.
|
|
|
|
```diff
|
|
-// v1.x
|
|
-let val = Value::from_array(session.allocator(), &array)?;
|
|
+// v2
|
|
+let val = Tensor::from_array(array)?;
|
|
```
|
|
|
|
### Separate string tensor creation
|
|
As previously mentioned, the logic for creating string tensors has been moved from `Value::from_array` to `DynTensor::from_string_array`.
|
|
|
|
To use string tensors with `ort::inputs!`, you must create a `Tensor` using `Tensor::from_string_array`.
|
|
|
|
```rust
|
|
let array = ndarray::Array::from_shape_vec((1,), vec![document]).unwrap();
|
|
let outputs = session.run(ort::inputs![
|
|
"input" => Tensor::from_string_array(array.view())?
|
|
])?;
|
|
```
|
|
|
|
## Session outputs
|
|
|
|
### New: Retrieve outputs by name
|
|
Just like how inputs can now be specified by name, you can now retrieve session outputs by name.
|
|
|
|
```rust
|
|
let l = outputs["latents"].try_extract_array::<f32>()?;
|
|
```
|
|
|
|
## Execution providers
|
|
Execution provider structs with public fields have been replaced with builder pattern structs. See the [API reference](https://docs.rs/ort/2.0.0-rc.10/ort/ep/index.html#reexports) and the [execution providers reference](/perf/execution-providers) for more information.
|
|
|
|
```diff
|
|
-// v1.x
|
|
-builder = builder.with_execution_providers(ExecutionProvider::DirectML(DirectMLExecutionProvider {
|
|
- device_id: 1
|
|
-}))?;
|
|
+// v2
|
|
+builder = builder.with_execution_providers([
|
|
+ ep::DirectML::default()
|
|
+ .with_device_id(1)
|
|
+ .build()
|
|
+])?;
|
|
```
|
|
|
|
## Updated dependencies & features
|
|
|
|
### `ndarray` 0.17
|
|
The `ndarray` dependency has been upgraded to 0.17. In order to convert tensors from `ndarray`, your application must update to `ndarray` 0.17 as well.
|
|
|
|
### `ndarray` is now optional
|
|
The dependency on `ndarray` is now optional. If you previously used `ort` with `default-features = false`, you'll need to add the `ndarray` feature to keep using `ndarray` integration.
|
|
|
|
## Model Zoo structs have been removed
|
|
ONNX pushed a new Model Zoo structure that adds hundreds of different models. This is impractical to maintain, so the built-in structs have been removed.
|
|
|
|
You can still use `Session::commit_from_url`, it just now takes a URL string instead of a struct.
|
|
|
|
## Changes to logging
|
|
Environment-level logging configuration (i.e. `EnvironmentBuilder::with_log_level`) has been removed because it could cause unnecessary confusion with our `tracing` integration.
|
|
|
|
## Renamed types
|
|
The following types have been renamed with no other changes.
|
|
- `NdArrayExtensions` -> `ArrayExtensions`
|
|
- `OrtResult`, `OrtError` -> `ort::Result`, `ort::Error`
|
|
- `TensorDataToType` -> `ExtractTensorData`
|
|
- `TensorElementDataType`, `IntoTensorElementDataType` -> `TensorElementType`, `IntoTensorElementType`
|