
Class ParseBuilder

Namespace
Bonsai.Expressions
Assembly
Bonsai.Core.dll

Represents an expression builder that applies a pattern matching operation to the elements of an observable sequence.

The Parse operator is a Transform on sequences of string values. Each of the strings in the sequence will be matched against the specified Pattern using the .NET regular expression engine (Regex). If a Separator is specified, every input string will first be split using the delimiter, and each substring will then be matched against the regular expression.

The output type is automatically inferred from the structure of the parse pattern. If a separator is used, the output will be an array containing the results of matching each delimited substring against the parse pattern. If any of the values in the input sequence fails to match, an error will be raised and the sequence will be terminated.
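The matching behavior described above can be approximated by hand with the .NET types the operator relies on. The following is a minimal sketch only, not the operator's actual implementation: the class `ParseSketch`, the method `ParseFloatInt`, and the hand-written regular expression are hypothetical stand-ins for what the pattern %f;%i describes.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper, written only for illustration.
public static class ParseSketch
{
    // Hand-written regular expression standing in for the pattern "%f;%i":
    // two captured fragments separated by a semicolon.
    static readonly Regex FloatIntPattern = new Regex(@"^([^;]+);([^;]+)$");

    public static Tuple<float, int> ParseFloatInt(string input)
    {
        Match match = FloatIntPattern.Match(input);
        if (!match.Success)
        {
            // Mirrors the documented behavior: a failed match is an error.
            throw new FormatException("Input does not match the pattern: " + input);
        }

        // Each captured fragment is converted by the Parse method of its type.
        float first = float.Parse(match.Groups[1].Value, CultureInfo.InvariantCulture);
        int second = int.Parse(match.Groups[2].Value, CultureInfo.InvariantCulture);
        return Tuple.Create(first, second);
    }
}
```

Calling `ParseSketch.ParseFloatInt("5.1;5")` returns a `Tuple<float, int>`, while an input without a semicolon throws, which in an observable sequence would correspond to terminating the sequence with an error.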

Note

Multi-character strings can be used to specify both the pattern and the separator, which can help when parsing text formats with more complex tokens. For more flexible or conditional parsing, it is also possible to chain multiple Parse operators in sequence by placing the placeholder %s at the end of the pattern. This will match and capture any remaining text for downstream processing.
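The idea of chaining can be illustrated outside of Bonsai with two plain parsing stages. This is a sketch under stated assumptions: the input format `tag:values`, the class `ChainedParseSketch`, and both method names are hypothetical, and the regular expression is a hand-written approximation of a first-stage pattern whose final placeholder captures the remaining text.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical two-stage parser, written only for illustration.
public static class ChainedParseSketch
{
    // Stage 1: capture a leading tag and everything after the first colon.
    static readonly Regex Header = new Regex(@"^(\w+):(.*)$");

    public static Tuple<string, string> SplitHeader(string input)
    {
        Match match = Header.Match(input);
        if (!match.Success)
        {
            throw new FormatException("Input does not match the header: " + input);
        }
        return Tuple.Create(match.Groups[1].Value, match.Groups[2].Value);
    }

    // Stage 2: parse the captured remainder as comma-delimited doubles.
    public static double[] ParseValues(string remainder)
    {
        string[] parts = remainder.Split(',');
        double[] values = new double[parts.Length];
        for (int i = 0; i < parts.Length; i++)
        {
            values[i] = double.Parse(parts[i], CultureInfo.InvariantCulture);
        }
        return values;
    }
}
```

The second stage can be chosen based on the tag returned by the first, which is the kind of conditional parsing that chaining makes possible.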

Warning

For convenience, both the Pattern and Separator properties accept character escapes to represent specific white space or Unicode characters. See the list of supported character escapes in .NET for more information.
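To show what a character escape means in practice, the sketch below uses `Regex.Unescape` from .NET to turn the two-character text `\t` into a tab before splitting. Whether the operator uses this exact call internally is an assumption; the class `EscapeSequenceSketch` and its methods are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, written only for illustration.
public static class EscapeSequenceSketch
{
    // Converts escape sequences typed as plain text (a backslash followed
    // by 't', for instance) into the characters they represent.
    public static string Unescape(string text)
    {
        return Regex.Unescape(text);
    }

    // Splits the input on a separator given in escaped form.
    public static string[] SplitOn(string input, string escapedSeparator)
    {
        string separator = Unescape(escapedSeparator);
        return input.Split(new[] { separator }, StringSplitOptions.None);
    }
}
```

With this helper, `SplitOn("tag1\ttag2", @"\t")` splits on the tab character even though the separator was written as a backslash followed by the letter t.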

Examples

The following examples illustrate using different combinations of the Pattern and Separator properties to match different kinds of formatted text data.

| Pattern | Separator | Type | Description | Example |
|---|---|---|---|---|
| %f | (none) | float | Match a floating-point number. | 5.0 |
| %f;%i | (none) | Tuple<float, int> | Match a floating-point number and an integer separated by a semicolon. | 5.1;5 |
| %f | , | float[] | Match each comma-delimited substring with a floating-point number. | 3.2, 5.6, 8.9 |
| \(%f,%f\) | ; | Tuple<float, float>[] | Match each semicolon-delimited substring with a pair of floating-point numbers surrounded by parentheses. | (1, 2); (3.14, 4.5) |
| %s,msg:%b | \t | Tuple<string, bool>[] | Match each tab-delimited substring with a pattern containing a string and boolean. | tag1,msg:true tag2,msg:false |

public class ParseBuilder : SelectBuilder, IExpressionBuilder
Inheritance
ParseBuilder

Properties

Pattern

Gets or sets the parse pattern to match, including data type format specifiers.

[TypeConverter(typeof(ParseBuilder.PatternConverter))]
public string Pattern { get; set; }

Property Value

string

Remarks

The parse pattern may contain zero or more placeholder characters. Each placeholder is always preceded by the character %, and must specify one of the allowed data type format specifiers (see table below). For each placeholder in the pattern, the Parse method of the corresponding data type will be called to convert the matched string to an equivalent instance of that type.

Note

Some placeholder conversions will account for white space characters surrounding the input, e.g. the parse pattern %i,%i will work the same for 1,2 or 1, 2.

Warning

All parse conversions are done using the invariant culture. Specifying culture-specific conversions is not currently supported. There is also no support for implicit numeric conversions, e.g. attempting to parse 5.0 using %i will throw an error.
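The note and warning above follow from how the standard .NET `Parse` methods behave. The sketch below assumes nothing about the operator beyond what is stated here, and the class `InvariantParseSketch` is hypothetical: it shows that integer parsing with the invariant culture tolerates surrounding white space but rejects a value such as 5.0.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, written only for illustration.
public static class InvariantParseSketch
{
    // Parses an integer using the invariant culture; throws FormatException
    // if the text is not a valid integer.
    public static int ParseInt(string text)
    {
        return int.Parse(text, CultureInfo.InvariantCulture);
    }

    // Non-throwing variant with the same number style and culture.
    public static bool TryParseInt(string text, out int value)
    {
        return int.TryParse(text, NumberStyles.Integer, CultureInfo.InvariantCulture, out value);
    }
}
```

Here `ParseInt(" 2")` succeeds because leading and trailing white space is allowed for integers, while `TryParseInt("5.0", out value)` returns false because no implicit conversion from a decimal representation is attempted.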

If the parse pattern is null or empty, the operator will simply return the raw input value. If a non-empty parse pattern is provided, but no placeholder characters are specified, the result will be of type Unit. Otherwise, the output type will be a tuple of the types corresponding to each of the placeholder characters, in order of their appearance in the parse pattern.

| Pattern | Description |
|---|---|
| %B | Match an unsigned 8-bit integer (byte). |
| %h | Match a signed 16-bit integer (short). |
| %H | Match an unsigned 16-bit integer (ushort). |
| %i | Match a signed 32-bit integer (int). |
| %I | Match an unsigned 32-bit integer (uint). |
| %l | Match a signed 64-bit integer (long). |
| %L | Match an unsigned 64-bit integer (ulong). |
| %f | Match a single-precision floating-point number (float). |
| %d | Match a double-precision floating-point number (double). |
| %b | Match a Boolean (true or false) value (bool). |
| %c | Match a single character as a UTF-16 code unit (char). |
| %s | Match a text fragment using UTF-16 encoding (string). |
| %t | Match a timestamp measured relative to UTC time (DateTimeOffset). |
| %T | Match a time interval (TimeSpan). |

Warning

The parse pattern is a regular expression string and certain characters, such as parentheses, are reserved as special tokens. To match these special characters literally, prefix them with a backslash (e.g. \( for a left parenthesis).
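When the literal parts of a pattern are assembled in code, `Regex.Escape` from .NET can add the backslashes. This is a minimal sketch, assuming the pattern is built as a string before being assigned to the Pattern property; the class `PatternEscapeSketch` and the method `Wrap` are hypothetical.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, written only for illustration.
public static class PatternEscapeSketch
{
    // Escapes the literal text on either side of a pattern body that
    // already contains placeholders and must be left untouched.
    public static string Wrap(string open, string body, string close)
    {
        return Regex.Escape(open) + body + Regex.Escape(close);
    }
}
```

For example, `Wrap("(", "%f,%f", ")")` produces the pattern `\(%f,%f\)`. Note that `Regex.Escape` also escapes white space and the # character, so it is worth inspecting the result when the literal text contains them.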

Separator

Gets or sets the optional separator used to delimit elements in variable length patterns.

public string Separator { get; set; }

Property Value

string

Remarks

If both Separator and Pattern are specified, the separator will be used first to split the input strings. Each delimited substring will then be matched against the regular expression specified in the parse pattern. The result will be an array of the output type inferred from the structure of the parse pattern.
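The split-then-match order can be sketched with plain .NET calls. The example below is an approximation of the pattern %f combined with the separator , and is not the operator's actual implementation; the class `SeparatorSketch` and the method `ParseFloats` are hypothetical.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, written only for illustration.
public static class SeparatorSketch
{
    // Splits the input on the separator first, then converts each
    // delimited substring, producing one array element per substring.
    public static float[] ParseFloats(string input, string separator)
    {
        string[] parts = input.Split(new[] { separator }, StringSplitOptions.None);
        float[] result = new float[parts.Length];
        for (int i = 0; i < parts.Length; i++)
        {
            // float.Parse tolerates white space around each substring.
            result[i] = float.Parse(parts[i], CultureInfo.InvariantCulture);
        }
        return result;
    }
}
```

Calling `ParseFloats("3.2, 5.6, 8.9", ",")` returns an array of three values, and the array length varies with the number of delimited substrings in each input string.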

Methods

BuildSelector(Expression)

Returns the expression that applies a pattern matching operation on the specified input parameter to the selector result.

protected override Expression BuildSelector(Expression expression)

Parameters

expression Expression

The input parameter to the selector.

Returns

Expression

The Expression that applies a pattern matching operation on the input parameter to the selector result.