Trainer Data Ergonomics#
Trainer data wiring supports reusable collation, transforms, binding, and profile composition.
trainer.fit(
train_data,
collate: :xy,
bind: :auto
)
Collation#
collate/validation_collateaccept callables, symbols, and mappingsbuilt-in schemas:
:x,:xyregistry:
register_collate(name, spec, extends:)multi-base schema composition via
extends: [:base_a, :base_b]
trainer.register_collate(:xy_map, { x: :features, y: :label })
trainer.fit(train_data, collate: :xy_map, epochs: 3)
Batch schema and auto-collate#
batch_schema(spec)split-specific forms:
batch_schema(train:, validation:)auto mode:
collate: :auto/validation_collate: :auto
trainer.batch_schema(
train: { x: :features, y: :label },
validation: { x: :input, y: :target }
)
trainer.fit(
train_data,
validation_data: validation_data,
collate: :auto,
validation_collate: :auto
)
Binding#
bindandvalidation_bindsupports
:autoargument-name inference (including common aliases)supports explicit key path mappings (for example
{x: [:payload, :x]})explicit
collatetakes precedence over bind
trainer.fit(
train_data,
bind: :auto,
validation_data: validation_data,
validation_bind: { x: [:payload, :x], y: [:payload, :y] }
)
Transforms and dynamic limits#
train_transform/validation_transformwith context-aware signatureslimit/validation_limitaccept integers or callables per epoch
trainer.fit(
train_data,
train_transform: ->(batch, epoch:) { augment(batch, epoch: epoch) },
limit: ->(epoch:) { epoch < 3 ? 100 : 500 },
validation_data: validation_data,
validation_limit: 100
)
Dataflow profiles#
register_dataflow(name, train:, validation:, extends:)use_dataflow(name, **overrides)
trainer.register_dataflow(
:xy_flow,
train: { collate: { x: 0, y: 1 }, limit: 256 },
validation: { collate: { x: :input, y: :target }, reduce: :mean }
)
report = trainer.fit_report(data, **trainer.use_dataflow(:xy_flow))
See implementation:
lib/mlx/dsl/trainer.rb