This page provides technical implementation details for potential contributors or the curious. You don’t need to read this to be able to use greta.

greta arrays, nodes and tensors

There are three layers to how greta defines a model: users manipulate greta arrays, these define nodes, and nodes then define Tensors.

greta arrays

greta arrays are the user-facing representation of the model. greta arrays are actually just lists, but with the classes greta_array and array.

x <- ones(3, 3)
[1] TRUE
[1] "greta_array" "array"      


The only list element of a greta array is node, an R6 object inheriting from the R6 class ‘node’, as well as one of the three node types: ‘data’, ‘operation’ or ‘variable’.

[1] "data_node" "node"      "R6"       

There is a fourth node type: ‘distribution’, but these are never directly associated with greta arrays.

These R6 node objects are where the magic happens. When created, each node points to its ‘child’ nodes - the nodes for the greta arrays that were used to create this one.

# data nodes have no children
[1] 0
# operation nodes have one or more children
z <- x * 3
[1] 2

Each node also has a list of its parents, the nodes that have been created from this one.

When model() is called, that inheritance information is used to construct the directed acyclic graph (DAG) that defines the model. The inheritance also preserves intermediate nodes, such as those creates in multi-part operations, but not assigned as greta arrays.

Nodes also have a value member, which is an array for data nodes or an ‘unknowns’ object for other node types. The unknowns class is a thin wrapper around arrays, which prints the question marks. Generic functions for working on arrays (e.g. dim, length, print) make use these node values to return something familiar to the user.

     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    1    1    1
[3,]    1    1    1
# R calls this a matrix because it's 2d, but it's an array
[1] "matrix"
     [,1] [,2] [,3]
[1,]  ?    ?    ?  
[2,]  ?    ?    ?  
[3,]  ?    ?    ?  
[1] "unknowns" "matrix"  


In addition to remembering their shape and size and where they are in the DAG, each node has methods to define a corresponding TensorFlow Tensor object in a specified environment. That doesn’t happen until the user runs model(), which creates a ‘dag_class’ object to store the relevant nodes, the environment for the tensors, and methods to talk to the TensorFlow graph.

The node tf() method takes the dag as an argument, and defines a tensor representing itself in the tensorflow environment, with a name determined by the dag object.

function (dag) {
             tf$constant(self$value(), dtype = tf_float()),
             envir = dag$tf_environment)
<environment: 0x12526f350>

Because R6 objects are pass-by-reference (rather than pass-by-value), the dag accumulates all of the defined tensors, rather than being re-written each time. Similarly, because nodes are R6 objects and know which are their children, they can make sure those children are defined as tensors before they are. The define_tf() member function ensures that that happens, enabling operation nodes to use the tensors for their children when defining their own tensor.

function (dag) {

      # switch out the op for non-sugared variety
      op <- self$switch_op(self$operation)

      # get the function
      fun <- eval(parse(text = op))

      # fetch the tensors for the environment
      arg_tf_names <- lapply(self$children, dag$tf_name)
      args <- lapply(arg_tf_names, get, envir = dag$tf_environment)

      # fetch additional (non-tensor) arguments, if any
      if (length(self$operation_args) > 0)
        args <- c(args, self$operation_args)

      # apply function on tensors
      node <-, args)

      # assign it in the environment
      assign(dag$tf_name(self), node, envir = dag$tf_environment)

<environment: 0x11d6bf838>

variables and free states

Hamiltonian Monte Carlo (HMC) requires all of the parameters to be transformed to a continuous scale for sampling. Variable nodes are therefore converted to tensors by first defining a ‘free’ (unconstrained) version of themselves as a tensor, and then applying a transformation function to convert them back to the scale of interest.

a <- variable(lower = 0)
[1] "variable_node" "node"          "R6"           
function (x, env) {

      upper <- self$upper
      lower <- self$lower

      if (self$constraint == 'none') {

        y <- x

      } else if (self$constraint == 'both') {

        y <- tf$nn$sigmoid(x) * fl(upper - lower) + fl(lower)

      } else if (self$constraint == 'low') {

        y <- fl(upper) - tf$exp(x)

      } else if (self$constraint == 'high') {

        y <- tf$exp(x) + fl(lower)

      } else {

        y <- x



<environment: 0x10b5aef78>


distribution nodes are node objects just like the others, but they are not directly associated with greta arrays. Instead, greta arrays may have a distribution node in their distribution slot to indicate that their values are assumed to follow that distribution. The distribution node will also be listed as a parent node, and likewise the ‘target node’ will be listed as a child of the distribution. Distribution nodes also have child nodes (data, variables or operations) representing their parameters.

b <- normal(0, 1)
[1] "variable_node" "node"          "R6"           
[1] "normal_distribution" "distribution_node"   "node"               
[4] "R6"                 
# a is the target of its own distribution
[1] "variable_node" "node"          "R6"           

When they define themselves as tensors, distribution nodes define the log density of their target node/tensor given the tensors representing their parameters.

function (dag) {

      # fetch inputs
      tf_target_node <- self$get_tf_target_node()
      tf_target <- get(dag$tf_name(tf_target_node),
                       envir = dag$tf_environment)
      tf_parameters <- self$tf_fetch_parameters(dag)

      # calculate log density
      ld <- self$tf_log_density_function(tf_target, tf_parameters)

      # check for truncation
      if (!is.null(self$truncation))
        ld <- ld - self$tf_log_density_offset(tf_parameters)

<environment: 0x127249e98>

If the distribution was truncated, the log density is normalised using the cumulative distribution function.

Joint density

Those log-densities for these distributions are summed on the TensorFlow graph to create a Tensor for the joint log-density of the model. TensorFlow’s automatic gradient capabilities are then used to define a Tensor for the gradient of the log-density with respect to each parameter in the model.

The dag R6 object contained within the model then exposes methods to send parameters to the TensorFlow graph and return the joint density and gradient.

model <- model(b)
function (parameters, flat = TRUE) {

      # convert parameters to a named list and change TF names to free versions
      parameters <- relist_tf(parameters, self$parameters_example)
      names(parameters) <- paste0(names(parameters), '_free')

      # create a feed dict in the TF environment
      self$tf_environment$parameters <- parameters
           parameter_dict <-, parameters))

<environment: 0x10867c878>
function(adjusted = TRUE) {

                   sess$run(joint_density_adj, feed_dict = parameter_dict)))

<environment: 0x10867c878>
function (adjusted = TRUE) {

                   sess$run(gradients_adj, feed_dict = parameter_dict)))

<environment: 0x10867c878>

These methods are used by MCMC samplers to explore the model parameters.