Export Functions
Graph IR schema reference:
docs/src/ruby/mlxonnx_v1.schema.json
For a step-by-step workflow guide, see Onnx/WebGPU Support.
Onnx/WebGPU Support
Use MLX::ONNX.* as the user-facing Graph IR/ONNX API and
MLX::ONNX::WebGPUHarness for browser harness packaging/smoke checks.
Implementation is split across:
- MLX::ONNX (native-backed public facade)
- MLX::ONNX::Native (native Graph IR/ONNX runtime implementation)
- MLX::ONNX::WebGPUHarness (browser harness packaging + smoke runner)
MLX Ruby supports an end-to-end browser export path:
1. Trace and export the Graph IR hash via MLX::ONNX.export_graph_ir (or a JSON debug payload via MLX::ONNX.export_graph_ir_json).
2. Convert Graph IR to ONNX binary via MLX::ONNX.graph_ir_to_onnx (or an ONNX JSON debug payload via MLX::ONNX.graph_ir_to_onnx_json).
3. Check ONNX export readiness from traced models via MLX::ONNX.export_onnx_compatibility_report and inspect unsupported_ops.
4. Export ONNX directly from a trace via MLX::ONNX.export_onnx (or a JSON debug payload via MLX::ONNX.export_onnx_json).
5. Package browser harness assets via MLX::ONNX::WebGPUHarness.export_onnx_webgpu_harness.
6. Run browser smoke verification via MLX::ONNX::WebGPUHarness.smoke_test_onnx_webgpu_harness.
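The readiness check in step 3 can be turned into a simple gate before attempting an export. The sketch below is plain Ruby and does not call MLX. It assumes the report behaves like a Hash with an unsupported_ops entry holding a list of op names. That shape, the helper names, and the sample op names are assumptions for illustration, not the documented return value of MLX::ONNX.export_onnx_compatibility_report.

```ruby
# Hypothetical helpers: the real report structure may differ.
# Accepts either string or symbol keys and returns a sorted, de-duplicated list.
def unsupported_ops_in(report)
  ops = report["unsupported_ops"] || report[:unsupported_ops] || []
  ops.map(&:to_s).uniq.sort
end

# True when the (assumed) report lists no unsupported ops.
def ready_for_onnx_export?(report)
  unsupported_ops_in(report).empty?
end

# Illustrative report; the op names are made up for the example.
report = { "unsupported_ops" => ["CustomKernel", "FFT", "FFT"] }

unless ready_for_onnx_export?(report)
  puts "export blocked by: #{unsupported_ops_in(report).join(', ')}"
end
```

Gating on an empty list keeps the failure message close to the trace that produced it, rather than surfacing later as an export error.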
Harness artifact output from MLX::ONNX::WebGPUHarness.export_onnx_webgpu_harness:
- model.onnx
- harness.manifest.json
- inputs.example.json
- index.html
- harness.js
- optional external data file (for example model.data)
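After packaging, it can be useful to confirm that the output directory actually contains the files listed above. The following is a minimal sketch in plain Ruby. The constant and helper names are hypothetical, and the optional external data file is deliberately left out of the required set.

```ruby
require "tmpdir"
require "fileutils"

# File names taken from the artifact list above; the optional
# external data file (for example model.data) is not required here.
REQUIRED_HARNESS_FILES = %w[
  model.onnx
  harness.manifest.json
  inputs.example.json
  index.html
  harness.js
].freeze

# Hypothetical helper: returns the required files missing from +dir+,
# in the same order as REQUIRED_HARNESS_FILES.
def missing_harness_files(dir)
  REQUIRED_HARNESS_FILES.reject { |name| File.file?(File.join(dir, name)) }
end

# Demo against a temporary directory holding only two of the files.
Dir.mktmpdir do |dir|
  FileUtils.touch(File.join(dir, "model.onnx"))
  FileUtils.touch(File.join(dir, "index.html"))
  puts "missing: #{missing_harness_files(dir).join(', ')}"
end
```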
The default harness provider order is ["webgpu", "wasm"]. Smoke telemetry
uses onnx_webgpu_telemetry_v1 and includes provider selection/fallback and
sample_outputs for parity assertions.
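The provider order and fallback behaviour can be pictured as "take the first requested provider that the runtime offers". The sketch below is plain Ruby and only models that idea. The helper names and the returned hash keys are hypothetical and are not the onnx_webgpu_telemetry_v1 schema. The allowed set mirrors the documented restriction that the harness accepts only webgpu and wasm.

```ruby
DEFAULT_PROVIDERS = %w[webgpu wasm].freeze
ALLOWED_PROVIDERS = %w[webgpu wasm].freeze

# Hypothetical validation: reject anything outside webgpu/wasm.
def validate_providers(providers)
  names = providers.map(&:to_s)
  unknown = names - ALLOWED_PROVIDERS
  unless unknown.empty?
    raise ArgumentError, "unsupported execution providers: #{unknown.join(', ')}"
  end
  names
end

# Picks the first requested provider present in +available+.
# "fallback" is true when a provider was chosen but it was not the first choice.
def select_provider(available, order: DEFAULT_PROVIDERS)
  requested = validate_providers(order)
  chosen = requested.find { |name| available.include?(name) }
  {
    "requested" => requested,
    "selected" => chosen,
    "fallback" => !chosen.nil? && chosen != requested.first
  }
end

# A browser without WebGPU would fall back to wasm under this model.
p select_provider(["wasm"])
```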
Runtime/tooling requirements:
- MLX::ONNX.export_onnx and MLX::ONNX.graph_ir_to_onnx require path-like targets (not IO-like).
- Real-runtime smoke tests require Node.js + Playwright + onnxruntime-web.
- bundle exec rake deps:web installs/checks the dependencies used by real WebGPU smoke tests.
- MLX::ONNX::WebGPUHarness.export_onnx_webgpu_harness only accepts webgpu and wasm execution providers.
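To make the path-like versus IO-like distinction concrete, a caller could pre-check targets before handing them to the export functions. This is a sketch under assumptions: how MLX actually classifies a target is not stated here, so the rule below (String or anything responding to to_path counts as path-like, IO and StringIO objects do not) and all helper names are hypothetical.

```ruby
require "pathname"
require "stringio"

# Assumed rule: open streams are IO-like.
def io_like?(target)
  target.is_a?(IO) || target.is_a?(StringIO)
end

# Assumed rule: strings and objects with #to_path (such as Pathname) are path-like.
def path_like?(target)
  !io_like?(target) && (target.is_a?(String) || target.respond_to?(:to_path))
end

# Normalizes a path-like target to a String, or raises for anything else.
def ensure_path_target!(target)
  raise ArgumentError, "expected a path-like target, got #{target.class}" unless path_like?(target)
  target.respond_to?(:to_path) ? target.to_path : target
end

puts ensure_path_target!(Pathname.new("out/model.onnx"))
```

Note that an open File responds to to_path but is still treated as IO-like here, which is why the IO check runs first.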
Web demo generation is wired through bundle exec rake web:assets and emits:
- GPT-2 assets under web/assets/gpt2
- nanoGPT assets under web/assets/nanogpt (exported from Hugging Face checkpoint weights)
- Stable Diffusion assets under web/assets/stable_diffusion (text encoder, UNet, VAE decoder ONNX files)
Examples coverage/parity status:
Current coverage/parity gates validate full examples export and ORT runtime parity across the benchmark model set.
Current MLX::ONNX.graph_ir_to_onnx_json / MLX::ONNX.export_onnx_json scope:
- Elementwise ops: Add, Subtract, Multiply, Divide, Maximum, Minimum, Power.
- Unary/activation ops: Exp, Log, Sin, Cos, Erf, Sqrt, Abs, Floor, Negative, Relu, Sigmoid, Tanh.
- Square (lowered as Mul with identical inputs).
- Softmax (when exported as a direct Softmax node by MLX tracing).
- Type/compare/select ops: AsType (to Cast), Greater, Less, Equal (with equal_nan=false), and Select (to Where).
- Full (current traced form) lowered as identity on broadcasted fill tensors.
- Matmul and AddMM (to Gemm).
- Convolution (including traced conv1d/conv2d/conv3d and conv_general with flip == false) lowered via layout transposes around ONNX Conv with mapped strides/pads/dilations/group attributes.
- conv_transpose1d/conv_transpose2d/conv_transpose3d traces (exported as Convolution with flip == true) lowered to ONNX ConvTranspose with derived pads/output_padding attributes.
- Shape ops: Transpose (perm attribute), Reshape, Flatten, Unflatten, Squeeze, ExpandDims (to Unsqueeze), and Broadcast (to Expand) using generated int64 initializer inputs for shape/axes.
- Indexing ops: Gather, GatherAxis (to GatherElements), Slice, Split, and AsStrided (current traced pattern to Gather).
- Concatenate (to Concat) when exported with explicit axis form arguments == [axis].
- Pad (constant mode).
- Scan for CumSum lowering.
- ScatterAxis (from put_along_axis) to ONNX ScatterElements for update mode.
- Reductions via MLX Reduce code mapping:
  - 0/1 (all/any) are lowered via cast decomposition Cast(BOOL) -> Cast(INT64) -> ReduceMin/ReduceMax -> Cast(BOOL).
  - 2 -> ReduceSum, 3 -> ReduceProd, 4 -> ReduceMin, 5 -> ReduceMax.
- LogSumExp (to ReduceLogSumExp) and ArgReduce (to ArgMin/ArgMax + cast).
- Arange lowered as ONNX initializer-backed constants.
- graph_ir_to_onnx and export_onnx support optional ONNX external-data emission for initializers via external_data: true on path-like targets, with external_data_size_threshold and external_data_file controls.
- Constants/initializers are lowered for bool/integer/float dtypes.
- complex64 initializers are lowered via explicit JSON marker encoding in stubs and converted to ONNX COMPLEX64 tensors during export.
- For JSON graph payloads, complex64 constant leaves may be provided as marker objects {"__mlx_complex__": [real, imag]} or Ruby-style complex literal strings (for example "1.0+2.0i").
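The two accepted complex64 leaf encodings for JSON payloads can be illustrated with a small decoder. This is a plain-Ruby sketch of how a consumer might read either form into a Ruby Complex. The helper name is hypothetical and this is not the library's own parsing code.

```ruby
require "json"

# Hypothetical decoder for the two documented leaf forms:
#   {"__mlx_complex__": [real, imag]}  (marker object)
#   "1.0+2.0i"                         (Ruby-style complex literal string)
def decode_complex_leaf(leaf)
  case leaf
  when Hash
    pair = leaf["__mlx_complex__"]
    unless pair.is_a?(Array) && pair.size == 2
      raise ArgumentError, "marker must hold [real, imag]"
    end
    Complex(Float(pair[0]), Float(pair[1]))
  when String
    Complex(leaf) # Kernel#Complex parses strings such as "1.0+2.0i"
  else
    raise ArgumentError, "unsupported complex leaf: #{leaf.inspect}"
  end
end

marker  = JSON.parse('{"__mlx_complex__": [1.0, 2.0]}')
literal = "1.0+2.0i"

# Both encodings describe the same value.
puts decode_complex_leaf(marker) == decode_complex_leaf(literal)
```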
Known constraints/caveats:
- Convolution with flip == false and non-unit input_dilation is unsupported.
- Flatten requires known static input shape metadata.
- Some lowerings (for example Gather, GatherAxis, Pad, LogSumExp) require known static shapes from Graph IR metadata.
- Scan lowering currently supports CumSum-compatible reduce_type only.
- Harness input tensor building does not currently support complex64 input tensors.