Getting started with HerbConstraints

When enumerating programs using a grammar, we will encounter many redundant programs. For example, x, -(-x) and 1 * x are syntactically different programs, but they have the same semantics. Grammar constraints aim to speed up synthesis by eliminating such redundant programs and thereby reducing the size of the program space.

Setup

For this tutorial, we need to import the following modules of the Herb.jl framework:

HerbCore for the necessary data structures, like Holes and RuleNodes
HerbGrammar to define the grammar
HerbConstraints to define the constraints
HerbSearch to execute a constrained enumeration

We will also redefine the simple arithmetic grammar from the previous tutorial.

using Herb

grammar = @csgrammar begin
    Int = 1
    Int = x
    Int = - Int
    Int = Int + Int
    Int = Int * Int
end

1: Int = 1
2: Int = x
3: Int = -Int
4: Int = Int + Int
5: Int = Int * Int

Working with constraints

To show the effects of constraints, we will first enumerate all programs without constraints (up to a maximum size of 3 AST nodes).

(To make sure the grammar doesn't have any constraints, we can clear the constraints using clearconstraints!. This is not needed at this point, but could come in handy if your REPL holds a reference to a constrained version of the grammar)

begin
    clearconstraints!(grammar)
    iter_1 = BFSIterator(grammar, :Int, max_size=3)
    
    for program ∈ iter_1
        println(rulenode2expr(program, grammar))
    end
    
end

Upon inspection, we can already see some redundant programs, like 1 * 1 and -(-1). To eliminate these redundant programs, we will set up some constraints that prevent these patterns from appearing. Then we will create another iterator to enumerate all programs that satisfy the defined grammar constraints.

To make the forbidden pattern constraint general, we will use a special type of rulenode: VarNode(:A). This node matches with any subtree and can be used to forbid multiple forbidden patterns using a single constraint. For example, Forbidden(RuleNode(minus, [RuleNode(minus, [VarNode(:A)])]))) forbids:

-(-1)
-(-X)
-(-(1 + 1))
1 + -(-(1 + 1))
etc

begin
    one = 1
    x = 2
    minus = 3
    plus = 4
    times = 5
    
    addconstraint!(grammar, Forbidden(RuleNode(times, [RuleNode(one), VarNode(:A)])))        # forbid 1*A
    addconstraint!(grammar, Forbidden(RuleNode(minus, [RuleNode(minus, [VarNode(:A)])])))    # forbid -(-A)
    
    iter_2 = BFSIterator(grammar, :Int, max_size=3)
    
    for program ∈ iter_2
        println(rulenode2expr(program, grammar))
    end
end

Forbidden Constraint

The Forbidden constraint forbids any subtree in the program that matches a given template tree. Such a template tree can consist of 3 node types:

RuleNode(1). Matches exactly the given rule.
DomainRuleNode(BitVector((0, 0, 0, 1, 1)), children). Matches any rule in its bitvector domain. In this case, rule 4 and 5.
VarNode(:A). Matches any subtree. If another VarNode of the same name is used, the subtrees have to be the same.

begin
    #this constraint forbids A+A and A*A
    constraint_1 = Forbidden(DomainRuleNode(BitVector((0, 0, 0, 1, 1)), [VarNode(:A), VarNode(:A)]))
    
    # Without this constraint, we encounter 154 programs
    clearconstraints!(grammar)
    iter_3 = BFSIterator(grammar, :Int, max_size=5)
    println(length(iter_3))
    
    # With this constraint, we encounter 106 programs
    clearconstraints!(grammar)
    addconstraint!(grammar, constraint_1)
    iter_4 = BFSIterator(grammar, :Int, max_size=5)
    println(length(iter_4))
    
end

Contains Constraint

The Contains constraint enforces that a given rule appears in the program tree at least once.

In the arithmetic grammar, this constraint can be used to ensure the input symbol x is used in the program. Otherwise, the program is just a constant.

begin
    clearconstraints!(grammar)
    addconstraint!(grammar, Contains(2)) #rule 2 should be used in the program
    iter_5 = BFSIterator(grammar, :Int, max_size=3)
    
    for program ∈ iter_5
        println(rulenode2expr(program, grammar))
    end
end

Contains Subtree Constraint

Similarly to the Contains constraint, the ContainsSubtree can be used to enforce a given template tree is used in the program at least once.

begin
    clearconstraints!(grammar)
    addconstraint!(grammar, ContainsSubtree(RuleNode(times, [RuleNode(x), RuleNode(x)]))) #x*x should be in the program tree
    iter_6 = BFSIterator(grammar, :Int, max_size=4)
    
    for program ∈ iter_6
        println(rulenode2expr(program, grammar))
    end
end

Ordered Constraint

The Ordered constraint enforces an <= ordering on a provided list of variables. With this constraint, we can break symmetries based on commutativity. For example, 1+x and x+1 are semantically equivalent. By imposing an Ordered constraint, we can eliminate one of the symmetric variants.

To define an Ordered constraint, we need to provide it with a template tree including at least two differently named VarNodes. And additionally, an ordering of the variables in the tree.

In the upcoming example we will set up a template tree representing a+b and a*b. Then, we will impose an ordering a<=b on all the subtrees that match the template.

The result is that our iterator skips the redundant programs x+1 and x*1, as they are already represented by 1+x and 1*x.

begin
    clearconstraints!(grammar)
    
    template_tree = DomainRuleNode(BitVector((0, 0, 0, 1, 1)), [VarNode(:a), VarNode(:b)])
    order = [:a, :b]
    
    addconstraint!(grammar, Ordered(template_tree, order))
    iter_7 = BFSIterator(grammar, :Int, max_size=3)
    
    for program ∈ iter_7
        println(rulenode2expr(program, grammar))
    end
    
end

Forbidden Sequence Constraint

The ForbiddenSequence constraints forbids a given sequence of rule nodes in a vertical path of the tree.

An optional second argument, ignore_if, can be used to overrule the constraint in case any of the rules on the ignore_if list are present.

Below we will define the constraint ForbiddenSequence([plus, one], ignore_if=[times]). It forbids an 1 after an + unless an * disrupts the sequence.

This constraint will forbid the following programs:

x + 1
x + -1
x + -(-1)
x + (x + 1)
x * (x + 1)

But it will allow the following program (as * disrupts the sequence):

x + (x * 1)

begin
    constraint_2 = ForbiddenSequence([plus, one], ignore_if=[times])
    addconstraint!(grammar, constraint_2)
    iter_8 = BFSIterator(grammar, :Int, max_size=3)
    
    for program ∈ iter_8
        println(rulenode2expr(program, grammar))
    end
    
end

Custom Constraint

To implement a new constraint, we need to define two structs: an AbstractGrammarConstraint and an AbstractLocalConstraint.

A grammar constraint is a high-level constraint on the grammar itself and does not refer to a location in the tree. For example, the Forbidden constraint is responsible for forbidding a template tree everywhere in the tree. To divide the work of constraint propagation, the grammar constraint will post several local constraints that are responsible for propagating the constraint at each particular location.

A local constraint is a rooted version of a grammar constraint. Each local constraint holds a path field that points to a location in the tree where this constraint applies.

Suppose we want to implement a simple custom constraint that forbids a given rule twice in a row.

Each time a new AST node is added to a tree, the on_new_node function is called to notify that an unseen node has been added to the tree at path path. Our grammar constraint has the opportunity to react to this event. In this example, we will post a new local constraint at the new location using the post! function.

(Don't worry about the HerbConstraints. prefixes. Normally, constraints are defined within the HerbConstraints repository, so there is no need to specify the namespace)

begin
    """
    Forbids the consecutive application of the specified rule.
    For example, CustomConstraint(4) forbids the tree 4(1, 4(1, 1)) as it applies rule 4 twice in a row.
    """
    struct ForbidConsecutive <: AbstractGrammarConstraint
        rule::Int
    end
    
    """
    Post a local constraint on each new node that appears in the tree
    """
    function HerbConstraints.on_new_node(solver::Solver, constraint::ForbidConsecutive, path::Vector{Int})
        HerbConstraints.post!(solver, LocalForbidConsecutive(path, constraint.rule))
    end
end

Next, we will define our local constraint. This constraint is responsible for propagating the constraint at a given path. The propagate! method can use several solver functions to manipulate the tree. The following tree manipulations can be used to remove rules from the domain of a hole at a given path:

remove!(solver::Solver, path::Vector{Int}, rule_index::Int)
remove!(solver::Solver, path::Vector{Int}, rules::Vector{Int})
remove_all_but!(solver::Solver, path::Vector{Int}, new_domain::BitVector)
remove_above!(solver::Solver, path::Vector{Int}, rule_index::Int)
remove_below!(solver::Solver, path::Vector{Int}, rule_index::Int)
make_equal!(solver::Solver, node1::AbstractRuleNode, node2::AbstractRuleNode) (a high level manipulation that requires node1 and node2 to be in the tree)

In addition to tree manipulations, the following solver functions can be used to communicate new information to the solver:

set_infeasible!(solver). If a propagator detects an inconsistency, the solver should be notified and cancel any other scheduled propagators.
deactivate!(solver, constraint). If a constraint is satisfied, it should deactivate itself to prevent re-propagation.
post!(solver, constraint) A constraint is allowed to post new local constraints. This might be helpful if a constraint can be reduced to a smaller constraint.

The solver manages all constraints and the program tree we propagate on. Applying tree manipulations might cause a chain reaction of other propagators, so the shape of the tree might update as we propagate. The get the latest information about the tree, we should use the following getter functions:

get_tree(solver) returns the root node of the current (partial) program tree
isfeasible(solver) returns the a flag indicating if the solver is not violating any (other) constraints.
get_path(solver, node) returns the path at which the node is located.
get_node_at_location(solver, path) returns the node that is currently at the given path (be aware that this instance might be replaced by manipulations).
get_hole_at_location(solver, path) same as get node at location, but asserts the node is a hole (domain size >= 2).

To get information about a node, we can use the following getter functions:

isfilled(node). Returns true if the node is a RuleNode or has domain size 1.
get_rule(node). Get the rule of a filled node.
get_children(node). Get the children of a node.
node.domain[rule]. Given the node is a hole, return true if rule is in the domain.

Finally, another useful function for propagators is pattern_match(node1, node2). This function compares two trees and returns a PatternMatchResult that indicates if the nodes match, and potentially indicate which holes need to be filled to complete the match.

begin
    """
    Forbids the consecutive application of the specified rule at path `path`.
    """
    struct LocalForbidConsecutive <: AbstractLocalConstraint
        path::Vector{Int}
        rule::Int
    end
    
    """
    Propagates the constraints by preventing a consecutive application of the specified rule.
    """
    function HerbConstraints.propagate!(solver::Solver, constraint::LocalForbidConsecutive)
        node = get_node_at_location(solver, constraint.path)
        if isfilled(node)
            if get_rule(node) == constraint.rule
                #the specified rule is used, make sure the rule will not be used by any of the children
                for (i, child) ∈ enumerate(get_children(node))
                    if isfilled(child)
                        if get_rule(child) == constraint.rule
                            #the specified rule was used twice in a row, which is violating the constraint
                            set_infeasible!(solver)
                            return
                        end
                    elseif child.domain[constraint.rule]
                        child_path = push!(copy(constraint.path), i)
                        remove!(solver, child_path, constraint.rule) # remove the rule from the domain of the child
                    end
                end
            end
        elseif node.domain[constraint.rule]
            #our node is a hole with the specified rule in its domain
            #we will now check if any of the children already uses the specified rule
            softfail = false
            for (i, child) ∈ enumerate(get_children(node))
                if isfilled(child)
                    if get_rule(child) == constraint.rule
                        #the child holds the specified rule, so the parent cannot have this rule
                        remove!(solver, constraint.path, constraint.rule)
                    end
                elseif child.domain[constraint.rule]
                    #the child is a hole and contains the specified node. since there are 2 holes involved, we will softfail.
                    softfail = true
                end
            end
            if softfail
                #we cannot deactivate the constraint, because it needs to be repropagated
                return
            end
        end
    
        #the constraint is satisfied and can be deactivated
        HerbConstraints.deactivate!(solver, constraint)
    end
end

Posting a local constraint will trigger the initial propagation. To re-propagate, the constraint needs to be rescheduled for propagation.

Whenever the tree is manipulated, we will make a shouldschedule check to see if our constraint needs to be rescheduled for propagation based on the manipulation.

In our case, we want to repropagate if either:

a tree manipulation occurred at the constraint.path
a tree manipulation occurred at the child of the constraint.path


"""
Gets called whenever an tree manipulation occurs at the given `path`.
Returns true iff the `constraint` should be rescheduled for propagation.
"""
function HerbConstraints.shouldschedule(solver::Solver, constraint::LocalForbidConsecutive, path::Vector{Int})::Bool
    return (path == constraint.path) || (path == constraint.path[1:end-1])
end

With all the components implemented, we can do a constrained enumeration using our new ForbidConsecutive constraint.

begin
    clearconstraints!(grammar)
    
    addconstraint!(grammar, ForbidConsecutive(minus))
    addconstraint!(grammar, ForbidConsecutive(plus))
    addconstraint!(grammar, ForbidConsecutive(times))
    
    iter = BFSIterator(grammar, :Int, max_size=6)
    
    for program ∈ iter
        println(rulenode2expr(program, grammar))
    end
end