Table of Contents

Data Definition

Bonsai.Sgen addresses the problem of defining and implementing custom data types in the Bonsai programming language. Let's explore this with a simple example.

Introduction

Suppose we want to create a new record-like object type that represents a Person:

Field Name Type Description
age int The age of a person
first_name string The first name of the person
last_name string The last name of the person
dob datetime Date of birth

Since there is currently no special syntax to declare object types directly in Bonsai, we need to leverage indirect approaches to define our new record type. We start by exploring the previously available options below, along with their limitations, and finally introduce a third, more powerful, alternative.

Data Object Initializers

One powerful feature of ExpressionTransform operators is support for writing Data Object Initializers:

Person as DynamicClass

ExpressionTransform:

new(
  Item1 as Age,
  Item2 as FirstName,
  Item3 as LastName,
  Item4 as DOB
)

This approach works, but is somewhat brittle, as the Person record exists only as an anonymous type in the current workflow compiler context. This limitation prevents the creation of named references to the type, required for instance to create Subject Sources. It also requires scripting to be used anywhere we need to create new objects.

Custom Scripting Extension

A more powerful alternative is to leverage C# directly to define our type class, by using custom Scripting Extensions:

public class Person
{
    public int Age;
    public string FirstName;
    public string LastName;
    public DateTime DOB;
}

This approach is much more flexible, as it allows composing our record using arbitrary C# types. It also supports nesting and even defining custom operators and functions on the new type. However, even for simple types we will need to write additional code to allow this type to be directly created and manipulated inside a Bonsai workflow:

using Bonsai;
using System;
using System.Reactive.Linq;

public class CreatePerson : Source<Person>
{
    public int Age { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public DateTime DOB { get; set; }

    public override IObservable<Person> Generate()
    {
        return Observable.Return(new Person
        {
            Age = Age,
            FirstName = FirstName,
            LastName = LastName,
            DOB = DOB
        });
    }
}

Essentially we are augmenting the class with a source operator that creates a new instance of the record type with the parameters specified in the class properties. Because the record class is now a regular Bonsai operator, it will show up in the editor toolbox and can be placed and configured in the workflow as usual.

While this might be enough to work around the need for the single odd type in our project, it doesn't scale well to model other common requirements for domain-specific record types, such as support for type hierarchies, serialization, or polymorphism, all of which would require additional boilerplate code on top of our simple type.

As projects increase in complexity, writing such boilerplate code can quickly become cumbersome and error prone.

JSON Schema

Bonsai.Sgen provides a new, and much more flexible, solution to this problem by leveraging JSON Schema directly as a data definition language in Bonsai.

How to Use

First, define the JSON Schema for our Person data type:

person.json

{
  "title": "Person",
  "type": "object",
  "properties": {
    "Age": { "type": "integer" },
    "FirstName": { "type": "string" },
    "LastName": { "type": "string" },
    "DOB": { "type": "string", "format": "date-time" }
  }
}

Generate custom Bonsai extension code using Bonsai.Sgen:

dotnet bonsai.sgen --schema person.json --output Extensions/PersonSgen.Generated.cs

Use the generated operators directly in your Bonsai workflow:

Person as BonsaiSgen

Advantages

Although initially this form may seem less direct and more complicated than even the C# type definition, there are a number of advantages immediately falling out from using JSON Schema as our data definition language:

  1. Custom Bonsai operator code can be automatically generated from the JSON Schema.
  2. Data objects backed by JSON Schemas can be used to read and write JSON files with validation guarantees.
  3. Both JSON files and JSON Schemas are interoperable with any other language.

With Bonsai.Sgen you can focus on the specification of the data structure itself, rather than on the details of boilerplate code. Furthermore, you don't even need to write the schema by hand directly in JSON, since you can use any language supporting JSON Schemas. For example, you can easily write a full data model in Python and use those classes directly to generate a JSON Schema for Bonsai.

Saving and Loading

Bonsai.Sgen automatically generates serialization and deserialization operators:

(de)serialization

This means you immediately gain the ability to use data objects as configuration files you load into your workflow, or as data records that you save with your experiment. Because all data will be backed by a schema, all these records can be immediately accessed by Python or any other language, saving you even more time setting up data processing pipelines.