Wrapping Up Offline RL as part of AutoMLPipeline Workflow

07/26/2023, 2:45 PM — 2:55 PM UTC
Online talks and posters

Abstract:

Unlike Online RL, where agents need to interact with a real environment, Offline RL works much like a typical machine learning workflow. Given a dataset, Offline RL extracts the state, action, reward, and terminal columns to optimize the policy Q. By wrapping Offline RL into the AutoMLPipeline workflow, it becomes trivial to search for the preprocessing elements and combinations that most improve the learned policy, using symbolic workflow manipulation.
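
For illustration, here is a minimal Julia sketch of the symbolic workflow composition described above, using preprocessing elements and the @pipeline macro from AutoMLPipeline.jl; the OfflineRLLearner stage is a hypothetical stand-in for the offline-RL policy learner and is not part of the package:

    using AutoMLPipeline

    # Preprocessing elements provided by AutoMLPipeline
    numf = NumFeatureSelector()          # select numeric columns
    catf = CatFeatureSelector()          # select categorical columns
    ohe  = OneHotEncoder()               # one-hot encode categorical features
    pca  = SKPreprocessor("PCA")         # dimensionality reduction
    norm = SKPreprocessor("Normalizer")  # feature scaling

    # Hypothetical learner that consumes the preprocessed state, action,
    # reward, and terminal columns and optimizes the policy Q
    rl = OfflineRLLearner()

    # Symbolic composition: |> chains elements, + concatenates branches
    pipe = @pipeline (catf |> ohe) + (numf |> norm |> pca) |> rl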

Description:

As part of the AutoMLPipeline workflow, it becomes trivial to search for the preprocessing elements and combinations that yield the best policy Q via cross-validation, where the dataset is repeatedly split into training and testing sets to estimate the average accumulated discounted reward (return) of a given policy. This talk will demonstrate how to set up the Offline RL pipeline to preprocess the dataset and learn the optimal policy Q, and how to incorporate a parallel search strategy to find the optimal workflow.
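
Below is a sketch of the cross-validation and parallel search step, again in Julia. AutoMLPipeline's crossvalidate and the Distributed standard library are real; OfflineRLLearner, the dataset (X, Y), and the "mean_return" metric are assumptions standing in for what the talk supplies:

    using Distributed
    addprocs(4)                              # one worker per candidate preprocessor
    @everywhere using AutoMLPipeline

    # Candidate preprocessing elements to search over (scikit-learn names)
    candidates = ["Normalizer", "MinMaxScaler", "PCA", "FastICA"]

    # X: DataFrame holding the state/action/reward/terminal columns, Y: targets,
    # both assumed to be prepared beforehand
    results = @distributed (vcat) for name in candidates
        prep = SKPreprocessor(name)
        rl   = OfflineRLLearner()            # hypothetical offline-RL policy learner
        pl   = @pipeline prep |> rl
        # crossvalidate repeatedly splits the data into training and testing sets;
        # for Offline RL the fold score would be the average discounted return
        [(name, crossvalidate(pl, X, Y, "mean_return", 5))]
    end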
