Class ParseBuilder
- Namespace: Bonsai.Expressions
- Assembly: Bonsai.Core.dll
Represents an expression builder that applies a pattern matching operation on elements of an observable sequence.
The Parse operator is a Transform on sequences of string values. Each string in the sequence is matched against the specified Pattern using the .NET regular expression engine (Regex). If a Separator is specified, every input string is first split using the delimiter, and each substring is then matched against the regular expression.
The output type is automatically inferred from the structure of the parse pattern. If a separator is used, the output will be an array containing the results of matching each delimited substring against the parse pattern. If any of the values in the input sequence fails to match, an error will be raised and the sequence will be terminated.
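The split-then-match behavior described above can be approximated in plain .NET. This is a minimal sketch, not Bonsai's implementation: the `ParseFloats` helper and the regular expression standing in for the %f placeholder are assumptions made for illustration.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;

static class ParseSketch
{
    // Hypothetical helper: splits the input on the separator, then requires each
    // substring to match a floating-point pattern, roughly like Pattern = "%f"
    // combined with Separator = ",".
    public static float[] ParseFloats(string input, string separator)
    {
        // Stand-in regex for the %f placeholder (an assumption, not Bonsai's own).
        var regex = new Regex(@"^\s*([+-]?\d+(?:\.\d+)?)\s*$");
        return input.Split(new[] { separator }, StringSplitOptions.None)
            .Select(token =>
            {
                var match = regex.Match(token);
                if (!match.Success)
                {
                    // A failed match is an error, mirroring the terminated sequence.
                    throw new FormatException($"'{token}' does not match the pattern.");
                }
                return float.Parse(match.Groups[1].Value, CultureInfo.InvariantCulture);
            })
            .ToArray();
    }

    public static void Main()
    {
        var values = ParseFloats("3.2, 5.6, 8.9", ",");
        Console.WriteLine(values.Length); // prints 3
    }
}
```

An input such as `3.2, abc` would throw in this sketch, which corresponds to the error raised when a value in the input sequence fails to match.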
Note
Multi-character strings can be used to specify both the pattern and the separator. This can sometimes help to parse text formats with more complex tokens. For more flexible or conditional parsing, it is also possible to chain multiple Parse operators in a sequence, by matching against the placeholder %s at the end of the pattern. This will match and capture any remaining text for downstream processing.
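The chaining idea can be sketched as two parsing stages, where the first stage captures the remaining text and the second stage decides how to interpret it. The helper names, the `:` delimiter, and the regular expression are all hypothetical; in a workflow each stage would be a separate Parse operator.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

static class ChainSketch
{
    // Hypothetical first stage, loosely comparable to a pattern such as "%i:%s":
    // it parses a leading integer and captures the remaining text unchanged.
    public static Tuple<int, string> ParseHeader(string input)
    {
        var match = Regex.Match(input, @"^\s*([+-]?\d+)\s*:(.*)$");
        if (!match.Success) throw new FormatException("Header did not match.");
        var id = int.Parse(match.Groups[1].Value, CultureInfo.InvariantCulture);
        return Tuple.Create(id, match.Groups[2].Value);
    }

    // Hypothetical second stage, chosen from the result of the first one.
    public static object ParsePayload(int id, string rest)
    {
        if (id == 1) return float.Parse(rest, CultureInfo.InvariantCulture);
        return bool.Parse(rest);
    }

    public static void Main()
    {
        var header = ParseHeader("1:2.5");
        var payload = ParsePayload(header.Item1, header.Item2);
        Console.WriteLine(payload is float); // prints True
    }
}
```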
Warning
For convenience, both the Pattern and Separator properties accept character escapes to represent specific white space or Unicode characters. See the list of supported character escapes in .NET for more information.
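As an illustration of how a character escape typed into a property can become the character it represents, the sketch below uses `Regex.Unescape` from .NET. Whether Bonsai uses this exact call internally is an assumption.

```csharp
using System;
using System.Text.RegularExpressions;

static class EscapeSketch
{
    public static void Main()
    {
        // The separator as typed into the property: a backslash followed by 't'.
        var typed = @"\t";
        // Regex.Unescape converts the escape into the character it represents.
        // Whether Bonsai uses this exact call is an assumption.
        var separator = Regex.Unescape(typed);

        var parts = "tag1,msg:true\ttag2,msg:false"
            .Split(new[] { separator }, StringSplitOptions.None);
        Console.WriteLine(separator == "\t"); // prints True
        Console.WriteLine(parts.Length);      // prints 2
    }
}
```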
Examples
The following examples illustrate using different combinations of the Pattern and Separator properties to match different kinds of formatted text data.
Pattern | Separator | Type | Description | Example |
---|---|---|---|---|
%f | | float | Match a floating-point number. | 5.0 |
%f;%i | | Tuple<float, int> | Match a floating-point number and an integer separated by a semicolon. | 5.1;5 |
%f | , | float[] | Match each comma-delimited substring with a floating-point number. | 3.2, 5.6, 8.9 |
\(%f,%f\) | ; | Tuple<float, float>[] | Match each semicolon-delimited substring with a pair of floating-point numbers surrounded by parentheses. | (1, 2); (3.14, 4.5) |
%s,msg:%b | \t | Tuple<string, bool>[] | Match each tab-delimited substring with a pattern containing a string and boolean. | tag1,msg:true tag2,msg:false |
public class ParseBuilder : SelectBuilder, IExpressionBuilder
Properties
Pattern
Gets or sets the parse pattern to match, including data type format specifiers.
[TypeConverter(typeof(ParseBuilder.PatternConverter))]
public string Pattern { get; set; }
Property Value
- string
Remarks
The parse pattern may contain zero or more placeholder characters. Each placeholder is always preceded by the character %, and must specify one of the allowed data type format specifiers (see table below). For each placeholder in the pattern, the Parse method of the corresponding data type will be called to convert the matched string to an equivalent instance of that type.
Note
Some placeholder conversions will account for white space characters surrounding the input, e.g. the parse pattern %i,%i will work the same for 1,2 or 1, 2.
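One plausible source of this tolerance is the behavior of the .NET Parse methods themselves, which accept surrounding white space for integer types. The sketch below shows only that .NET behavior; how the Parse operator handles white space internally is not confirmed here.

```csharp
using System;
using System.Globalization;

static class WhitespaceSketch
{
    public static void Main()
    {
        // Splitting "1, 2" on the comma leaves a leading space on the second token.
        var tokens = "1, 2".Split(',');
        // int.Parse uses NumberStyles.Integer by default, which allows leading
        // and trailing white space around the digits.
        var first = int.Parse(tokens[0], CultureInfo.InvariantCulture);
        var second = int.Parse(tokens[1], CultureInfo.InvariantCulture);
        Console.WriteLine(first + second); // prints 3
    }
}
```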
Warning
All parse conversions are done using the invariant culture. Specifying culture-specific conversions is not currently supported. There is also no support for implicit numeric conversions, e.g. attempting to parse 5.0 using %i will throw an error.
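The same strictness can be observed in the .NET Parse methods directly. The following sketch uses a hypothetical `IsInt` helper to show that an integer conversion rejects text containing a decimal point, and that the invariant culture always reads `.` as the decimal separator. It illustrates the .NET behavior rather than the operator's own error path.

```csharp
using System;
using System.Globalization;

static class InvariantSketch
{
    // Returns true when the text parses as a 32-bit integer under the invariant culture.
    public static bool IsInt(string text)
    {
        try { int.Parse(text, CultureInfo.InvariantCulture); return true; }
        catch (FormatException) { return false; }
    }

    public static void Main()
    {
        // The invariant culture always uses '.' as the decimal separator.
        var value = float.Parse("5.1", CultureInfo.InvariantCulture);
        Console.WriteLine(value == 5.1f);   // prints True
        Console.WriteLine(IsInt("5"));      // prints True
        Console.WriteLine(IsInt("5.0"));    // prints False
    }
}
```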
If the parse pattern is null or empty, the operator will simply return the raw input value. If a non-empty parse pattern is provided, but no placeholder characters are specified, the result will be of type Unit. Otherwise, the output type will be a tuple of the types corresponding to each of the placeholder characters, in order of their appearance in the parse pattern.
Pattern | Description |
---|---|
%B | Match an unsigned 8-bit integer (byte). |
%h | Match a signed 16-bit integer (short). |
%H | Match an unsigned 16-bit integer (ushort). |
%i | Match a signed 32-bit integer (int). |
%I | Match an unsigned 32-bit integer (uint). |
%l | Match a signed 64-bit integer (long). |
%L | Match an unsigned 64-bit integer (ulong). |
%f | Match a single-precision floating-point number (float). |
%d | Match a double-precision floating-point number (double). |
%b | Match a Boolean (true or false) value (bool). |
%c | Match a single character as a UTF-16 code unit (char). |
%s | Match a text fragment using UTF-16 encoding (string). |
%t | Match a timestamp measured relative to UTC time (DateTimeOffset). |
%T | Match a time interval (TimeSpan). |
Warning
The parse pattern is a regular expression string and certain characters, such as parentheses, are reserved as special tokens. To match these characters literally, prefix them with a backslash (e.g. \( for a left parenthesis).
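The effect of escaping can be checked with the .NET Regex class on its own. This sketch uses fixed literal patterns rather than parse placeholders, so it only demonstrates how the regular expression engine treats parentheses.

```csharp
using System;
using System.Text.RegularExpressions;

static class ParenthesesSketch
{
    public static void Main()
    {
        var input = "(1, 2)";
        // Unescaped parentheses create a capture group, so this pattern
        // expects the literal text "1, 2" with no surrounding parentheses.
        Console.WriteLine(Regex.IsMatch(input, @"^(1, 2)$"));   // prints False
        // Escaped parentheses match the literal characters.
        Console.WriteLine(Regex.IsMatch(input, @"^\(1, 2\)$")); // prints True
        // Regex.Escape produces the escaped form of a literal string.
        Console.WriteLine(Regex.Escape("("));                   // prints \(
    }
}
```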
Separator
Gets or sets the optional separator used to delimit elements in variable length patterns.
public string Separator { get; set; }
Property Value
- string
Remarks
If both Separator and Pattern are specified, the separator will be used first to split the input strings. Each delimited substring will then be matched against the regular expression specified in the parse pattern. The result will be an array of the output type inferred from the structure of the parse pattern.
Methods
BuildSelector(Expression)
Returns the expression that applies a pattern matching operation on the specified input parameter to the selector result.
protected override Expression BuildSelector(Expression expression)
Parameters
- expression Expression
The input parameter to the selector.
Returns
- Expression
The Expression that applies a pattern matching operation on the input parameter to the selector result.
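To give a sense of what a selector-building method of this shape does, the sketch below builds an expression tree that converts a string parameter to a float and compiles it into a delegate. `BuildFloatSelector` is a hypothetical stand-in written against System.Linq.Expressions only; it does not reproduce the pattern matching logic of ParseBuilder.

```csharp
using System;
using System.Globalization;
using System.Linq.Expressions;

static class SelectorSketch
{
    // Hypothetical stand-in for a selector builder: given the input parameter,
    // returns an expression that converts it to a float.
    public static Expression BuildFloatSelector(Expression expression)
    {
        var parse = typeof(float).GetMethod(
            "Parse", new[] { typeof(string), typeof(IFormatProvider) });
        var culture = Expression.Constant(
            CultureInfo.InvariantCulture, typeof(IFormatProvider));
        return Expression.Call(parse, expression, culture);
    }

    public static void Main()
    {
        var input = Expression.Parameter(typeof(string), "input");
        var body = BuildFloatSelector(input);
        var selector = Expression.Lambda<Func<string, float>>(body, input).Compile();
        Console.WriteLine(selector("2.5") == 2.5f); // prints True
    }
}
```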