Totally Objects - Boolean Expression Parser (Version 5.5 [1.0])
Introduction
This Totally Objects product for IBM Smalltalk parses Strings containing Boolean expressions (using the operators AND, OR and NOT) and evaluates them against some data, using a block that you supply to test whether each element of the expression holds.
Installation
This Totally Objects product has been packaged as a configuration map. To import it into your library select 'Browse Configuration Maps' from the 'Tools' menu of the 'System Transcript'. In the 'Configuration Maps Browser' select 'Import...' from the 'Names' menu and select the file tobbep5-5_1-0-*.dat. You should then select the configuration map contained within this file.
To load the configuration map into your image select it in the 'Configuration Maps Browser' and select 'Load With Required Maps' from the 'Editions' menu.
Operation
You first need to create an instance of the class TobBepParser.
This object can be configured to allow certain sequences of characters to mean AND, OR and NOT, to specify valid characters that can be treated as brackets, and quote marks. Methods for configuring are:
- #notStrings and #notStrings: - to get and set a collection of valid Strings that can be treated to mean NOT. The default value is #('NOT' '~'). Strings should not contain any white space (i.e. Lf, Cr, Tab or Space) or any characters used for bracketPairs or quoteMarks. The Strings should also be different to those specified as andStrings or orStrings. These strings are not case sensitive.
- #andStrings and #andStrings: - to get and set a collection of valid Strings that can be treated to mean AND. The default value is #('AND' '&'). Strings should not contain any white space (i.e. Lf, Cr, Tab or Space) or any characters used for bracketPairs or quoteMarks. The Strings should also be different to those specified as orStrings or notStrings. These strings are not case sensitive.
- #orStrings and #orStrings: - to get and set a collection of valid Strings that can be treated to mean OR. The default value is #('OR' '|'). Strings should not contain any white space (i.e. Lf, Cr, Tab or Space) or any characters used for bracketPairs or quoteMarks. The Strings should also be different to those specified as notStrings or andStrings. These strings are not case sensitive.
- #bracketPairs and #bracketPairs: - to get and set a collection of Strings containing two Characters each: an open bracket and its close bracket. The default value is #('()' '{}' '[]'). These Strings should not contain any Characters used for quoteMarks or conflict with those specified as notStrings, andStrings or orStrings.
- #quoteMarks and #quoteMarks: - to get and set a collection of Characters that can hold valid quote marks. The default value is #($' $"). These Characters should not conflict with those specified as bracketPairs, notStrings, andStrings or orStrings.
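As a minimal sketch of such a configuration, the following adds an extra spelling for AND and for OR and restricts the brackets to round ones. It assumes the setters accept literal Arrays in the same form as the default values listed above; the extra spellings '+' and ',' are purely illustrative.
    | parser |
    parser := TobBepParser new.
    "Accept + as well as AND and &, and a comma as well as OR and |."
    parser andStrings: #('AND' '&' '+').
    parser orStrings: #('OR' '|' ',').
    "Treat only round brackets as brackets."
    parser bracketPairs: #('()').
Any new operator spelling has to respect the restrictions listed above (no white space, no bracket or quote characters, and no clash with the other operator collections).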
Next you need to provide a String expression to parse. To do this you use the method #parse:.
Expressions must be formatted in a similar manner to the following example:
curry or sausage and egg and not (potatoes or chips) or "fish and chips"
The order of precedence of the operations (highest first) is NOT, AND and OR. Notice that any elements in the expression containing white space or any of the reserved words can be put inside quote marks to keep them together as a unit. So the above expression is equivalent to:
"curry" or ("sausage" and "egg" and not ("potatoes" or "chips")) or "fish and chips"
The answer to the #parse: method is an instance of TobBepEvaluator. If the String contains a syntax error, the exception TobBepExceptions::ExTobBepSyntaxError is signalled instead. This exception has two arguments: the first is a description of the error and the second is the expression.
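A syntax error can be trapped with the usual IBM Smalltalk when:do: construct. The following is only a sketch: it assumes that an unbalanced bracket is reported as a syntax error, and it relies on the standard Signal messages #argument (the first argument of the signal) and #exitWith: rather than on anything specific to this product.
    | parser evaluator |
    parser := TobBepParser new.
    evaluator := [parser parse: 'curry or (sausage and egg']
        when: TobBepExceptions::ExTobBepSyntaxError
        do: [:signal |
            "The first argument of the signal is the error description."
            Transcript cr; show: signal argument.
            "Leave the protected block, answering nil in place of an evaluator."
            signal exitWith: nil].
    evaluator isNil
        ifTrue: [Transcript cr; show: 'Please correct the expression.']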
The final stage is to evaluate the logic in the expression by providing a context to which it applies. You do this by sending the TobBepEvaluator the message #evaluate:logicBlock:. The first argument is some data and the second is a two-argument (or three-argument) block. This block is evaluated repeatedly using the data as the first argument and the element from the expression as the second (and optionally the String representation of the element before conversion - see 'Advanced Operation' below) and must answer a Boolean.
Let us look at an example:
A user wishes to find the documents in a large collection that contain certain text. The code might look something like this:
    findDocumentsIn: docs thatContainTextAccordingToRule: expressionString
        | parser evaluator |
        parser := TobBepParser new.
        evaluator := parser parse: expressionString.
        ^ docs select: [:doc |
            evaluator
                evaluate: (doc text)
                logicBlock: [:docText :element |
                    (docText indexOfSubCollection: element startingAt: 1) ~= 0
                ]
        ]
Reduction
There are times when a condition is known to hold for a whole group of items that we wish to test. Testing each of these items against the whole expression may then be unnecessary. For example, if our documents are already grouped by subject and we wish to find documents containing a word that happens to appear in one of the subject titles, we do not need to look for the word in each of the documents on that subject. Similarly, if we wish to find documents not containing a particular word, we can save ourselves the effort of searching through documents that are sure to contain that word. To do this we use the method #reduce:logicBlock: to produce a new TobBepEvaluator that contains simplified logic, and the method #canEvaluateWithoutFurtherData to see if the evaluation of a TobBepEvaluator will answer true or false whatever data is supplied (the result is then obtained using the method #value). To explain this, the above example is extended; this time the first argument to the method is a Dictionary. Each key is a subject String and each value is the collection of documents on that subject.
    findDocumentsIn: docsDict thatContainTextAccordingToRule: expressionString
        | parser evaluator logicBlock answer |
        parser := TobBepParser new.
        evaluator := parser parse: expressionString.
        logicBlock := [:docText :element |
            (docText indexOfSubCollection: element startingAt: 1) ~= 0
        ].
        answer := Set new.
        docsDict keysAndValuesDo: [:subject :docs |
            | newEvaluator |
            newEvaluator := evaluator reduce: subject logicBlock: logicBlock.
            newEvaluator canEvaluateWithoutFurtherData
                ifTrue: [
                    "The subject alone decides the result: include or exclude the whole group."
                    newEvaluator value ifTrue: [answer addAll: docs]
                ] ifFalse: [
                    "Otherwise test each document against the reduced expression."
                    answer addAll: (docs select: [:doc |
                        newEvaluator
                            evaluate: (doc text)
                            logicBlock: logicBlock
                    ])
                ]
        ].
        ^ answer
So for example if the argument docsDict is:
'dogs' -> #( Doc('dogs can run') Doc('dogs chase cats') )
'cats' -> #( Doc('cats climb trees') Doc('cats drink milk') )
If the expressionString is 'dogs OR horses', then all the elements of the 'dogs' collection will be included without testing any of them, and all the items of the 'cats' collection will be tested with the expression 'dogs OR horses'.
If the expressionString is 'dogs AND cats', then all the elements of the 'dogs' collection will be tested with the expression 'cats', and all the items of the 'cats' collection will be tested with the expression 'dogs'.
If the expressionString is 'dogs AND NOT cats', then all the elements of the 'dogs' collection will be tested with the expression 'NOT cats', and all the items of the 'cats' collection will be excluded without testing them.
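The first of these cases can be tried in isolation. The sketch below reduces 'dogs OR horses' for the single subject 'dogs'; the equality test in the logic block is an illustrative stand-in for the substring search used above. Judging from the cases just described, an element for which the block answers true is treated as known to hold, while an element for which it answers false is left in the reduced expression for later testing; this reading is an inference from the examples, not a documented rule.
    | evaluator reduced |
    evaluator := TobBepParser new parse: 'dogs OR horses'.
    reduced := evaluator
        reduce: 'dogs'
        logicBlock: [:subject :element | subject = element].
    "If the subject settles the whole expression, #value gives the result directly."
    reduced canEvaluateWithoutFurtherData
        ifTrue: [Transcript cr; show: reduced value printString]
        ifFalse: [Transcript cr; show: 'further data is needed']
According to the description above, the reduced evaluator here should need no further data and should answer true to #value.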
Validating and Converting Elements
To test that elements contain valid strings the following methods can be sent to the parser:
- #elementValidator and #elementValidator: - these get and set the validation process of elements after each is parsed. The value is either 'nil' (the default value), if all element Strings are acceptable, or a one-argument block. This block will be evaluated using the element (a String) as the argument and should answer a Boolean. If an element fails the test (i.e. the block does not answer 'true') then the exception TobBepExceptions::ExTobBepInvalidElementError is signalled with three String arguments (a description, the expression and the element).
If an element passes the validation test then it can be converted from a String to another object. This is achieved by sending the following methods to the parser:
- #elementConverter and #elementConverter: - these get and set the conversion process of elements after each is parsed. The value is either 'nil' (the default value), if the elements are left as Strings, or a one-argument block. This block will be evaluated using the element (a String) as the argument and should answer any Object representing the conversion. If an error happens during the block (i.e. 'when: ExError') then the exception TobBepExceptions::ExTobBepConvertionError is signalled with three String arguments (a description, the expression and the element).
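As a small illustration of the two blocks working together, the sketch below rejects very short elements and converts the remaining ones to lower case so that the search ignores case. The three-character minimum and the sample text are arbitrary choices made for this example.
    | parser evaluator |
    parser := TobBepParser new.
    "Reject elements shorter than three characters."
    parser elementValidator: [:s | s size >= 3].
    "Convert each accepted element to lower case."
    parser elementConverter: [:s | s asLowercase].
    evaluator := parser parse: 'Curry AND Rice'.
    evaluator
        evaluate: 'chicken curry with rice'
        logicBlock: [:text :element |
            (text indexOfSubCollection: element startingAt: 1) ~= 0]
With these settings the logic block receives the converted elements ('curry' and 'rice') rather than the Strings as typed, so the evaluation should answer true for this text.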
Sometimes elements may contain comparison expressions (such as "color = blue" or "time > 13:00"). The class TobBepComparison has been provided to make the management of such expressions easier. Sending the message #asTobBepComparison to an element's String will convert it to a TobBepComparison (or answer 'nil' if it is not a suitable format). The following methods can then be used to interrogate it further:
- #left - gets the String to the left of the comparison operation
- #operation - gets the String representing the operation (one of '<' '>' '<=' '>=' '=' '<>')
- #operationAsBinaryMethod - gets a Symbol corresponding to the operation (i.e. #< #> #<= #>= #= #~=)
- #right - gets the String to the right of the comparison operation
Also, TobBepComparisons can be made directly using the class method #left:operation:right:. This can be useful if the 'left' or 'right' values need to be converted.
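The following sketch shows both ways of obtaining a TobBepComparison and how #operationAsBinaryMethod can be combined with the standard #perform:with: message. The element strings and the values compared against them are made up for illustration, and the sketch assumes that 'left' and 'right' are answered without surrounding white space.
    | parsed converted |
    "Parsed from an element String; left, operation and right are all Strings."
    parsed := 'color = blue' asTobBepComparison.
    parsed isNil ifFalse: [
        Transcript cr; show:
            ('blue' perform: parsed operationAsBinaryMethod with: parsed right) printString].
    "Built directly, with the right-hand value already converted to a number."
    converted := TobBepComparison left: 'size' operation: '>=' right: 10.
    Transcript cr; show:
        (12 perform: converted operationAsBinaryMethod with: converted right) printString
Here #operationAsBinaryMethod should answer #= and #>= respectively, so each #perform:with: is an ordinary comparison between the test value and 'right'.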
The following example checks the validity of a Date against an expression.
    date: aDate conformsToExpression: anExpressionString
        | parser evaluator |
        parser := TobBepParser new
            elementValidator: [:s |
                s asTobBepComparison notNil
            ];
            elementConverter: [:s |
                | comparison |
                comparison := s asTobBepComparison.
                (comparison left = '' or: [comparison left asLowercase = 'date'])
                    ifTrue: [
                        | date |
                        "Replace the right-hand String with a Date object."
                        comparison right asLowercase = 'today' ifTrue: [date := Date today].
                        comparison right asLowercase = 'tomorrow' ifTrue: [date := Date today addDays: 1].
                        comparison right asLowercase = 'yesterday' ifTrue: [date := Date today subtractDays: 1].
                        date isNil ifTrue: [date := AbtDateConverter new displayToObject: comparison right].
                        TobBepComparison left: 'date' operation: comparison operation right: date
                    ] ifFalse: [
                        "Leave any other comparison (such as one on the day name) unconverted."
                        comparison
                    ]
            ];
            yourself.
        evaluator := parser parse: anExpressionString.
        ^ evaluator
            evaluate: aDate
            logicBlock: [:testDate :comparison |
                | ans |
                ans := false.
                comparison left = 'date'
                    ifTrue: [
                        ans := testDate
                            perform: comparison operationAsBinaryMethod
                            with: comparison right
                    ].
                comparison left asLowercase = 'day'
                    ifTrue: [
                        ans := testDate dayName asString asLowercase
                            perform: comparison operationAsBinaryMethod
                            with: comparison right asLowercase
                    ].
                ans
            ]
If you would like to ask any questions about this product, its operation or its commercial terms, please contact sales@totallyobjects.com.
Copyright (c) 2002 DirectDual Limited (trading as TotallyObjects).
Totally Objects - DirectDual Limited,
34 Compton Avenue,
Gidea Park, Essex,
RM2 6ES, England.
Tel: +44 1708 733295
Fax: +44 1708 783438
sales@totallyobjects.com
DirectDual Limited is an IBM Object Connection member.