Rewrites#
This section of the documentation references all the rewrites that can be applied during the compilation of an Aesara graph.
Tensor rewrites#
These rewrites are implemented in the module tensor.rewriting.basic.
Tensor rewrites addressing the `Op`s in `basic.py`.
- aesara.tensor.rewriting.basic.broadcast_like(value, template, fgraph, dtype=None)[source]#
Return a `Variable` with the same shape and dtype as the template, filled by broadcasting `value` through it.
`value` will be cast as necessary.
- aesara.tensor.rewriting.basic.encompasses_broadcastable(b1, b2)[source]#
- Parameters:
b1 – The broadcastable attribute of a tensor type.
b2 – The broadcastable attribute of a tensor type.
- Returns:
True if the broadcastable patterns b1 and b2 are such that b2 is broadcasted to b1’s shape and not the opposite.
- Return type:
bool
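One plausible reading of this check, sketched as a plain-Python stand-in (illustrative only, not the library's implementation):

```python
def encompasses_broadcastable(b1, b2):
    # True entries mark length-1 (broadcastable) dimensions. b2 can be
    # broadcast to b1's shape when it has no more dimensions than b1 and
    # is broadcastable in every (trailing-aligned) dimension where b1 is.
    if len(b1) < len(b2):
        return False
    b1 = b1[len(b1) - len(b2):]
    return not any(v1 and not v2 for v1, v2 in zip(b1, b2))
```

For example, `(True, False)` (shape `(1, n)`) cannot absorb a non-broadcastable first dimension, so `encompasses_broadcastable((True, False), (False, False))` is `False`.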
Indexing#
- aesara.tensor.rewriting.subtensor.get_advsubtensor_axis(indices)[source]#
Determine the axis at which an array index is applied.
This only works for take-like indices: e.g. `x[:, :, idx, ...]`. For the above example, `get_advsubtensor_axis` would return `2`. If it encounters anything other than a set of `indices` containing full slices and an array/tensor index, it will return `None`.
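A simplified sketch of the logic in plain Python (hypothetical re-implementation for illustration; the real function also recognizes symbolic equivalents of `slice(None)`):

```python
def get_advsubtensor_axis(indices):
    # Scan the index tuple; record the position of the single
    # array/tensor index among otherwise-full slices.
    axis = None
    for i, idx in enumerate(indices):
        if idx == slice(None):
            continue
        if isinstance(idx, slice) or axis is not None:
            # A non-full slice, or a second non-slice index: give up.
            return None
        axis = i
    return axis
```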
- aesara.tensor.rewriting.subtensor.is_full_slice(x)[source]#
Determine if `x` is a `slice(None)` or a symbolic equivalent.
- aesara.tensor.rewriting.subtensor.merge_two_slices(fgraph, slice1, len1, slice2, len2)[source]#
This function merges two slices into a single slice. The code works on the assumption that:
- `slice1` is actually a slice and not an index, while `slice2` can be just an index.
- the two slices have been applied consecutively on the same tensor.
The output slice is not in canonical form, but actually just a slice that can be applied to a tensor to produce the same output as applying the two consecutive slices.
`len1` is the length of the tensor before applying the first slice, while `len2` is the length after applying the first slice.
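The invariant the merged slice must satisfy can be checked with ordinary Python sequences. Below is a minimal composition for simple non-negative-step slices (`merge_positive_slices` is an illustrative helper, not the library's code, which also handles negative steps, indices, and symbolic lengths):

```python
def merge_positive_slices(s1, s2, len1):
    # Compose two simple non-negative-step slices applied consecutively,
    # so that x[s1][s2] == x[merged]; len1 is len(x).
    start1, stop1, step1 = s1.indices(len1)
    len2 = max(0, (stop1 - start1 + step1 - 1) // step1)  # == len(x[s1])
    start2, stop2, step2 = s2.indices(len2)
    return slice(start1 + start2 * step1,
                 start1 + stop2 * step1,
                 step1 * step2)

x = list(range(20))
s1, s2 = slice(2, 18, 2), slice(1, 6, 2)
assert x[s1][s2] == x[merge_positive_slices(s1, s2, len(x))]
```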
- aesara.tensor.rewriting.subtensor.transform_take(a, indices, axis)[source]#
Transform `arr[:, :, :, indices, ...]`-like operations into single-dimensional, vector index operations.
This effectively converts certain `AdvancedSubtensor` `Op`s into a combination of `AdvancedSubtensor1`, `Dimshuffle`, and `Reshape` `Op`s, which can be more efficient.
- Parameters:
a (TensorVariable) – The source array.
indices (TensorVariable, ndarray, list, tuple) – The indices of the values to extract.
axis (int) – The axis over which to select values. By default, the flattened input array is used.
Shape#
- class aesara.tensor.rewriting.shape.ShapeFeature[source]#
A `Feature` that tracks shape information in a graph.
This `Feature` aids in the replacement of all `Shape`s and `Subtensor`s of `Shape`s with `Shape_i` and `MakeVector` `Op`s.
This `Feature` and its associated rewrites have several goals:
- to "lift" `Shape`s as close to the inputs as possible,
- to infer the shape of every node in the graph in terms of the input shapes, and
- to remove fill `Op`s (e.g. `Second`) from the graph.
Lifting shapes as close to the inputs as possible is important for canonicalization because it is very bad form to have to compute something just to know how big it will be. Firstly, it is a waste of time to compute such outputs. But it is important to get rid of these outputs as early as possible in the compilation process because the extra computations make it appear as if many internal graph nodes have multiple clients. Many rewrites refuse to work on nodes with multiple clients.
Lifting is done by using an `<Op>.infer_shape` function if one is present, or else by using a conservative default. An `Op` that supports shape-lifting should define an `infer_shape(self, fgraph, node, input_shapes)` method. The argument `input_shapes` is a tuple of tuples: one interior tuple for each input to the node, with as many elements as that input has dimensions. The element in position `i` of tuple `j` represents the `i`-th shape component of the `j`-th input. The function should return a tuple of tuples, one output tuple for each element of `node.outputs`. Again, the `i`-th element of the `j`-th output tuple represents `outputs[j].shape[i]`. If an output is not a `TensorType`, then `None` should be returned instead of a tuple for that output.
For example, the `infer_shape` for a matrix-matrix product would accept `input_shapes=((x0, x1), (y0, y1))` and return `((x0, y1),)`.
Inferring the shape of internal nodes in the graph is important for doing size-driven rewrites. If we know how big various intermediate results will be, we can estimate the cost of many `Op`s accurately, and generate C code that is specialized (e.g. unrolled) for particular sizes.
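Under this protocol, the `infer_shape` for the matrix-matrix product example mentioned above could look like the following sketch (the `MatMul` class here is illustrative, not Aesara's actual `Op`):

```python
class MatMul:
    def infer_shape(self, fgraph, node, input_shapes):
        # input_shapes holds one tuple of shape components per input:
        # ((x0, x1), (y0, y1)) for a (x0, x1) @ (y0, y1) product.
        (x0, x1), (y0, y1) = input_shapes
        # One tuple per output; the single output has shape (x0, y1).
        return ((x0, y1),)
```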
In cases where you cannot figure out the shape, raise a ShapeError.
Notes
Right now there is only the `ConvOp` that could really take advantage of this shape inference, but it is worth it even just for the `ConvOp`. All that's necessary to do shape inference is 1) to mark shared inputs as having a particular shape, either via a `.tag` or some similar hacking; and 2) to add an optional `In()` argument to promise that inputs will have a certain shape (or even to have certain shapes in certain dimensions).
We can't automatically infer the shape of shared variables, as their shape can change during execution by default.
To use this shape information in rewrites, use the `shape_of` dictionary.
For example:

```python
try:
    shape_of = fgraph.shape_feature.shape_of
except AttributeError:
    # This can happen when the mode doesn't include the ShapeFeature.
    return

shape_of_output_zero = shape_of[node.output[0]]
```
The `shape_of_output_zero` symbol will contain a tuple whose elements are either integers or symbolic integers.
TODO: check to see if the symbols are necessarily non-constant... or are integer literals sometimes Aesara constants? That would be confusing.
- clone()[source]#
Create a clone that can be attached to a new `FunctionGraph`.
This default implementation returns `self`, which carries the assumption that the `Feature` is essentially stateless. If a subclass has state of its own that is in any way relative to a given `FunctionGraph`, this method should be overridden with an implementation that actually creates a fresh copy.
- default_infer_shape(fgraph, node, i_shapes)[source]#
Return a list of shape tuples or `None` for the outputs of `node`.
This function is used for `Op`s that don't implement `infer_shape`. `Op`s that do implement `infer_shape` should use the `i_shapes` parameter, but this default implementation ignores it.
- get_shape(var, idx)[source]#
Rewrites can call this to get a `Shape_i`.
It is better to call this than to use `shape_of[var][idx]` directly, as this method will update `shape_of` if needed.
TODO: Up to now, we don't update it in all cases. Update in all cases.
- on_attach(fgraph)[source]#
Called by `FunctionGraph.attach_feature`, the method that attaches the feature to the `FunctionGraph`. Since this is called after the `FunctionGraph` is initially populated, this is where you should run checks on the initial contents of the `FunctionGraph`.
The `on_attach` method may raise the `AlreadyThere` exception to cancel the attach operation if it detects that another `Feature` instance implementing the same functionality is already attached to the `FunctionGraph`.
The feature has great freedom in what it can do with the `fgraph`: it may, for example, add methods to it dynamically.
- on_change_input(fgraph, node, i, r, new_r, reason)[source]#
Called whenever `node.inputs[i]` is changed from `r` to `new_r`. At the moment the callback is done, the change has already taken place.
If you raise an exception in this function, the state of the graph might be broken for all intents and purposes.
- on_detach(fgraph)[source]#
Called by `FunctionGraph.remove_feature`. This method should remove any dynamically-added functionality that it installed into the `fgraph`.
- on_import(fgraph, node, reason)[source]#
Called whenever a node is imported into `fgraph`, which is just before the node is actually connected to the graph.
Note: this is not called when the graph is created. If you want to detect the first nodes to be imported into the graph, you should do so by implementing `on_attach`.
- same_shape(x: Variable, y: Variable, dim_x: Optional[int] = None, dim_y: Optional[int] = None) bool[source]#
Return `True` if `x` and `y` have the same shape.
- Parameters:
x – The `Variable` whose shape is to be compared with `y`'s shape.
y – The `Variable` whose shape is to be compared with `x`'s shape.
dim_x – If non-`None`, compare only the dimension of `x` equal to `dim_x`.
dim_y – If non-`None`, compare only the dimension of `y` equal to `dim_y`.
- set_shape(r, s, override=False)[source]#
Assign the shape `s` to the previously un-shaped variable `r`.
- Parameters:
r (a variable) –
s (None or a tuple of symbolic integers) –
override (bool) – If `False`, `r` is assumed to be a new object in the fgraph; if `True`, `r` is already in the fgraph and we want to override its shape.
- class aesara.tensor.rewriting.shape.ShapeOptimizer[source]#
Rewriter that adds `ShapeFeature` as a feature.
- apply(fgraph)[source]#
Apply the rewriter to a `FunctionGraph`.
It may use all the methods defined by the `FunctionGraph`. If the `GraphRewriter` needs to use a certain tool, such as an `InstanceFinder`, it can do so in its `add_requirements` method.
- class aesara.tensor.rewriting.shape.UnShapeOptimizer[source]#
Rewriter that removes `ShapeFeature` as a feature.
- apply(fgraph)[source]#
Apply the rewriter to a `FunctionGraph`.
It may use all the methods defined by the `FunctionGraph`. If the `GraphRewriter` needs to use a certain tool, such as an `InstanceFinder`, it can do so in its `add_requirements` method.
Mathematical operations#
Rewrites for the `Op`s in `aesara.tensor.math`.
- class aesara.tensor.rewriting.math.AlgebraicCanonizer(main, inverse_fn, reciprocal_fn, calculate, use_reciprocal=True)[source]#
A `Rewriter` that rewrites algebraic expressions.
The variable is a `node_rewriter`. It is best used with a `WalkingGraphRewriter` in in-to-out order.
Usage: `AlgebraicCanonizer(main, inverse, reciprocal, calculate)`
- Parameters:
main – A suitable `Op` class that is commutative and associative and takes one to an arbitrary number of inputs, e.g. `add` or `mul`.
inverse – An `Op` class such that `inverse(main(x, y), y) == x` (e.g. `sub` or `true_divide`).
reciprocal – A function such that `main(x, reciprocal(y)) == inverse(x, y)` (e.g. `neg` or `reciprocal`).
calculate – Function that takes a list of `numpy.ndarray` instances for the numerator, another list for the denominator, and calculates `inverse(main(*num), main(*denum))`. It takes a keyword argument, `aslist`. If `True`, the value should be returned as a list of one element, unless the value is such that `value = main()`; in that case, the return value should be an empty list.
Examples
```python
>>> import aesara.tensor as at
>>> from aesara.tensor.rewriting.math import AlgebraicCanonizer
>>> add_canonizer = AlgebraicCanonizer(add, sub, neg,
...                                    lambda n, d: sum(n) - sum(d))
>>> mul_canonizer = AlgebraicCanonizer(mul, true_divide, inv,
...                                    lambda n, d: prod(n) / prod(d))
```
Examples of rewrites `mul_canonizer` can perform:
- x / x -> 1
- (x * y) / x -> y
- x / y / x -> 1 / y
- x / y / z -> x / (y * z)
- x / (y / z) -> (x * z) / y
- (a / b) * (b / c) * (c / d) -> a / d
- (2.0 * x) / (4.0 * y) -> (0.5 * x) / y
- 2 * x / 2 -> x
- x * y * z -> Elemwise(mul){x, y, z}, i.e. only one pass over the memory, not Elemwise(mul){x, Elemwise(mul){y, z}}
- get_num_denum(inp)[source]#
This extracts two lists, `num` and `denum`, such that the input is `self.inverse(self.main(*num), self.main(*denum))`. It returns the two lists in a `(num, denum)` pair.
For example, with main, inverse, and reciprocal = `*`, `/`, and `inv()`:
- x * y -> ([x, y], [])
- inv(x) -> ([], [x])
- inv(x) * inv(y) -> ([], [x, y])
- x * y / z -> ([x, y], [z])
- log(x) / y * (z + x) / y -> ([log(x), z + x], [y, y])
- (((a / b) * c) / d) -> ([a, c], [b, d])
- a / (b / c) -> ([a, c], [b])
- log(x) -> ([log(x)], [])
- x**y -> ([x**y], [])
- x * y * z -> ([x, y, z], [])
- merge_num_denum(num, denum)[source]#
Utility function which takes two lists, `num` and `denum`, and returns something which is equivalent to `inverse(main(*num), main(*denum))`, but depends on the length of `num` and the length of `denum` (in order to minimize the number of operations).
Let `n = len(num)` and `d = len(denum)`:
- n=0, d=0: the neutral element (given by `self.calculate([], [])`); for example, this would be 0 if main is addition and 1 if main is multiplication
- n=1, d=0: num[0]
- n=0, d=1: reciprocal(denum[0])
- n=1, d=1: inverse(num[0], denum[0])
- n=0, d>1: reciprocal(main(*denum))
- n>1, d=0: main(*num)
- n=1, d>1: inverse(num[0], main(*denum))
- n>1, d=1: inverse(main(*num), denum[0])
- n>1, d>1: inverse(main(*num), main(*denum))
Given the values of n and d to which they are associated, all of the above are equivalent to `inverse(main(*num), main(*denum))`.
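The case analysis collapses nicely when sketched with concrete numbers, taking main = multiplication, inverse = division, and reciprocal = `1/x` (a toy numeric stand-in, not the symbolic implementation):

```python
from functools import reduce
import operator

def merge_num_denum(num, denum):
    # Numeric stand-ins: main = product, inverse = division,
    # reciprocal = 1/x; the neutral element of main is 1.
    main = lambda xs: reduce(operator.mul, xs, 1)
    n, d = len(num), len(denum)
    if d == 0:
        return main(num)            # covers n=0 (neutral), n=1, n>1
    if n == 0:
        return 1 / main(denum)      # reciprocal
    return main(num) / main(denum)  # inverse
```

For instance, `merge_num_denum([], [])` is the neutral element 1, and `merge_num_denum([6.0], [2.0, 3.0])` is `6.0 / (2.0 * 3.0)`.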
- simplify(num, denum, out_type)[source]#
Shorthand for: `self.simplify_constants(*self.simplify_factors(num, denum))`
- simplify_constants(orig_num, orig_denum, out_type=None)[source]#
Find all constants and put them together into a single constant.
Finds all constants in orig_num and orig_denum (using get_constant) and puts them together into a single constant. The constant is inserted as the first element of the numerator. If the constant is the neutral element, it is removed from the numerator.
Examples
Let main be multiplication:
- [2, 3, x], [] -> [6, x], []
- [x, y, 2], [4, z] -> [0.5, x, y], [z]
- [x, 2, y], [z, 2] -> [x, y], [z]
- simplify_factors(num, denum)[source]#
For any Variable r which is both in num and denum, removes it from both lists. Modifies the lists inplace. Returns the modified lists. For example:
- [x], [x] -> [], []
- [x, y], [x] -> [y], []
- [a, b], [c, d] -> [a, b], [c, d]
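A plain-Python sketch of this cancellation, with strings standing in for `Variable`s (illustrative only, not the library's code):

```python
def simplify_factors(num, denum):
    # Remove each value that appears in both lists, one shared
    # occurrence at a time, mirroring the x / x -> 1 cancellation.
    for v in list(num):
        if v in denum:
            num.remove(v)
            denum.remove(v)
    return num, denum
```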
- tracks()[source]#
Return the list of `Op` classes to which this rewrite applies.
Returns `None` when the rewrite applies to all nodes.
- transform(fgraph, node)[source]#
Rewrite the sub-graph given by `node`.
Subclasses should implement this function so that it returns one of the following:
- `False` to indicate that this rewrite cannot be applied to `node`
- a list of `Variable`s to use in place of `node`'s current outputs
- a `dict` mapping old `Variable`s to `Variable`s, optionally with the key `"remove"` mapping to a list of `Variable`s to be removed
- Parameters:
fgraph – A `FunctionGraph` containing `node`.
node – An `Apply` node to be rewritten.
- aesara.tensor.rewriting.math.attempt_distribution(factor, num, denum, out_type)[source]#
Try to distribute each element of `num` and each element of `denum` into `factor`.
- Returns:
If there are changes, `new_num` and `new_denum` contain all the numerators and denominators that could not be distributed into the factor.
- Return type:
`(changes, new_factor, new_num, new_denum)`, where `changes` indicates whether any distribution took place.
- aesara.tensor.rewriting.math.check_for_x_over_absX(numerators, denominators)[source]#
Convert x/abs(x) into sign(x).
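The identity behind this rewrite, `x / abs(x) == sign(x)` for nonzero `x`, is easy to confirm numerically:

```python
import math

def sign(x):
    # sign of a nonzero float: -1.0 or 1.0
    return math.copysign(1.0, x)

for x in (-3.5, -1.0, 0.25, 7.0):
    assert x / abs(x) == sign(x)
```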
- aesara.tensor.rewriting.math.compute_mul(tree)[source]#
Compute the Variable that is the output of a multiplication tree.
This is the inverse of the operation performed by `parse_mul_tree`, i.e. `compute_mul(parse_mul_tree(tree)) == tree`.
- Parameters:
tree – A multiplication tree (as output by `parse_mul_tree`).
- Returns:
A Variable that computes the multiplication represented by the tree.
- Return type:
object
- aesara.tensor.rewriting.math.get_constant(v)[source]#
- Returns:
A numeric constant if `v` is a `Constant` (or a numeric constant itself); `None` if `v` is a plain `Variable`.
- Return type:
object
- aesara.tensor.rewriting.math.is_1pexp(t, only_process_constants=True)[source]#
- Returns:
If `t` is of the form `1 + exp(x)`, return `(False, x)`. Else return `None`.
- Return type:
object
- aesara.tensor.rewriting.math.is_exp(var)[source]#
Match a variable with either of the `exp(x)` or `-exp(x)` patterns.
- Parameters:
var – The Variable to analyze.
- Returns:
A pair `(b, x)` with `b` a boolean set to `True` if `var` is of the form `-exp(x)` and `False` if `var` is of the form `exp(x)`. If `var` cannot be cast into either form, then return `None`.
- Return type:
tuple
- aesara.tensor.rewriting.math.is_inverse_pair(node_op, prev_op, inv_pair)[source]#
Given two consecutive operations, check if they are the provided pair of inverse functions.
- aesara.tensor.rewriting.math.is_mul(var)[source]#
Match a variable with `x * y * z * ...`.
- Parameters:
var – The Variable to analyze.
- Returns:
A list `[x, y, z, ...]` if `var` is of the form `x * y * z * ...`, or `None` if `var` cannot be cast into this form.
- Return type:
object
- aesara.tensor.rewriting.math.is_neg(var)[source]#
Match a variable with the `-x` pattern.
- Parameters:
var – The Variable to analyze.
- Returns:
`x` if `var` is of the form `-x`, or `None` otherwise.
- Return type:
object
- aesara.tensor.rewriting.math.local_add_mul_fusion(fgraph, node)[source]#
Fuse consecutive `add` or `mul` `Op`s into one such node with more inputs.
It is better to fuse `add`/`mul` this way than in a `Composite` node, as it makes the inner graph of the `Composite` smaller. This allows more computation to be put in a `Composite` before hitting the maximum recursion limit when pickling a `Composite`.
- aesara.tensor.rewriting.math.parse_mul_tree(root)[source]#
Parse a tree of multiplications starting at the given root.
- Parameters:
root – The variable at the root of the tree.
- Returns:
A tree where each non-leaf node corresponds to a multiplication in the computation of `root`, represented by the list of its inputs. Each input is a pair `[n, x]` with `n` a boolean value indicating whether sub-tree `x` should be negated.
- Return type:
object
Examples
- x * y -> [False, [[False, x], [False, y]]]
- -(x * y) -> [True, [[False, x], [False, y]]]
- -x * y -> [False, [[True, x], [False, y]]]
- -x -> [True, x]
- (x * y) * -z -> [False, [[False, [[False, x], [False, y]]], [True, z]]]
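The tree format can be exercised with a small numeric evaluator (an illustrative helper, not part of the library) that plays the role of `compute_mul` on constant leaves:

```python
from functools import reduce
import operator

def eval_mul_tree(tree):
    # A tree is [negate, leaf-or-inputs]; a leaf is any non-list value.
    neg, sub = tree
    if isinstance(sub, list):
        value = reduce(operator.mul, (eval_mul_tree(t) for t in sub), 1)
    else:
        value = sub
    return -value if neg else value

# -(x * y) with x=2, y=3:
assert eval_mul_tree([True, [[False, 2], [False, 3]]]) == -6
# -x * y with x=2, y=3:
assert eval_mul_tree([False, [[True, 2], [False, 3]]]) == -6
```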
- aesara.tensor.rewriting.math.perform_sigm_times_exp(tree, exp_x=None, exp_minus_x=None, sigm_x=None, sigm_minus_x=None, parent=None, child_idx=None, full_tree=None)[source]#
Core processing of the `local_sigm_times_exp` rewrite.
This recursive function operates on a multiplication tree as output by `parse_mul_tree`. It walks through the tree and modifies it in-place by replacing matching pairs `(exp, sigmoid)` with the desired version.
- Parameters:
tree – The sub-tree to operate on.
exp_x – List of arguments `x` such that `exp(x)` exists somewhere in the whole multiplication tree. Each argument is a pair `(x, leaf)` with `x` the argument of the exponential, and `leaf` the corresponding leaf in the multiplication tree (of the form `[n, exp(x)]` – see `parse_mul_tree`). If `None`, this argument is initialized to an empty list.
exp_minus_x – Similar to `exp_x`, but for `exp(-x)`.
sigm_x – Similar to `exp_x`, but for `sigmoid(x)`.
sigm_minus_x – Similar to `exp_x`, but for `sigmoid(-x)`.
parent – Parent of `tree` (`None` if `tree` is the global root).
child_idx – Index of `tree` in its parent's inputs (`None` if `tree` is the global root).
full_tree – The global multiplication tree (should not be set except by recursive calls to this function). Used for debugging only.
- Returns:
`True` if a modification was performed somewhere in the whole multiplication tree, or `False` otherwise.
- Return type:
bool
- aesara.tensor.rewriting.math.replace_leaf(arg, leaves, new_leaves, op, neg)[source]#
Attempt to replace a leaf of a multiplication tree.
We search for a leaf in `leaves` whose argument is `arg`, and if we find one, we remove it from `leaves` and add to `new_leaves` a leaf with argument `arg` and variable `op(arg)`.
- Parameters:
arg – The argument of the leaf we are looking for.
leaves – List of leaves to look into. Each leaf should be a pair `(x, l)` with `x` the argument of the `Op` found in the leaf, and `l` the actual leaf as found in a multiplication tree output by `parse_mul_tree` (i.e. a pair `[boolean, variable]`).
new_leaves – If a replacement occurred, then the leaf is removed from `leaves` and added to the list `new_leaves` (after being modified by `op`).
op – A function that, when applied to `arg`, returns the `Variable` we want to replace the original leaf variable with.
neg (bool) – If `True`, then the boolean value associated with the leaf should be swapped. If `False`, then this value should remain unchanged.
- Returns:
True if a replacement occurred, or False otherwise.
- Return type:
bool
- aesara.tensor.rewriting.math.scalarconsts_rest(inputs, elemwise=True, only_process_constants=False)[source]#
Partition a list of variables into two kinds: scalar constants, and the rest.
- aesara.tensor.rewriting.math.simplify_mul(tree)[source]#
Simplify a multiplication tree.
- Parameters:
tree – A multiplication tree (as output by `parse_mul_tree`).
- Returns:
A multiplication tree computing the same output as `tree` but without useless multiplications by 1 or -1 (identified by leaves of the form `[False, None]` or `[True, None]` respectively). Useless multiplications (with fewer than two inputs) are also removed from the tree.
- Return type:
object
- class aesara.tensor.rewriting.elemwise.FusionOptimizer(node_rewriter)[source]#
Graph rewriter that simply runs node fusion operations.
TODO: This is basically an `EquilibriumGraphRewriter`; we should just use that.
- apply(fgraph)[source]#
Apply the rewriter to a `FunctionGraph`.
It may use all the methods defined by the `FunctionGraph`. If the `GraphRewriter` needs to use a certain tool, such as an `InstanceFinder`, it can do so in its `add_requirements` method.
- class aesara.tensor.rewriting.elemwise.InplaceElemwiseOptimizer(OP)[source]#
This is parameterized so that it works for `Elemwise` `Op`s.
- apply(fgraph)[source]#
Attempts to replace all `Elemwise`s by versions of them that operate inplace. It operates greedily: for each `Elemwise` that is encountered, for each output, it tries each input to see if it can operate inplace on that input. If so, it makes the change and goes to the next output or `Elemwise`.
Examples
- x + y + z -> x += y += z
- (x + y) * (x * y) -> (x += y) *= (x * y) or (x + y) *= (x *= y)
- aesara.tensor.rewriting.elemwise.is_dimshuffle_useless(new_order, input)[source]#
- Checks for two types of useless dimshuffles:
1. dimshuffle of all dimensions in order (the identity).
2. dimshuffle of a broadcastable dimension.
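A sketch consistent with the description above (hypothetical: the real function takes the input variable rather than its broadcastable pattern, and `"x"` entries in `new_order` insert new broadcastable axes):

```python
def is_dimshuffle_useless(new_order, broadcastable):
    # Useless when the output type equals the input type: the order must
    # preserve the number of dimensions, and every position must either
    # keep its original dimension or exchange one broadcastable
    # (length-1) dimension for another (or for a new "x" axis).
    if len(new_order) != len(broadcastable):
        return False
    bcast_dims = {i for i, b in enumerate(broadcastable) if b} | {"x"}
    return all(
        o == i or (i in bcast_dims and o in bcast_dims)
        for i, o in enumerate(new_order)
    )
```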
- aesara.tensor.rewriting.elemwise.local_elemwise_fusion(fgraph, node)[source]#
Fuse `Elemwise` `Op`s in a node.
As part of specialization, we fuse two consecutive `Elemwise` `Op`s of the same shape.
For mixed dtypes, we let the `Composite` `Op` do the cast; it lets the C compiler do the cast.
The number of dimensions is validated at call time by Aesara itself.
- aesara.tensor.rewriting.elemwise.local_elemwise_fusion_op(op_class, max_input_fct=<function <lambda>>, maker=None)[source]#
Create a recursive function that fuses `Elemwise` `Op`s.
The basic idea is that we loop through an `Elemwise` node's inputs, find other `Elemwise` nodes, determine the scalar input types for all of the `Elemwise` `Op`s, construct a new scalar `Op` using the scalar input types and each `Elemwise`'s scalar `Op`, and use the composite scalar `Op` in a new "fused" `Elemwise`.
It's parameterized in order to work for `Elemwise` `Op`s.
- Parameters:
op_class (type) – The `Elemwise` class (the one that we want to fuse).
max_input_fct (callable) – A function that returns the maximum number of inputs that this `Elemwise` can take. On the CPU we limit it to 32 input variables, since that is the maximum NumPy supports.
maker (callable) – A function with the signature `(node, *args)` that constructs an `op_class` instance (e.g. `op_class(*args)`).
Random variables#
- aesara.tensor.random.rewriting.basic.is_rv_used_in_graph(base_rv, node, fgraph)[source]#
Determine whether or not `base_rv` is used by a node other than `node` in `fgraph`.
If a node uses `Shape` or `Shape_i` on the `base_rv`, we ignore it, because those `Op`s don't rely on the actual sample values of `base_rv`.
TODO: We should apply all the shape rewrites before these rewrites, since that would properly remove the unnecessary dependencies on `base_rv` (when possible).