Tree

UCX provides several low-level function for working with arbitrary tree structures, as well as some high-level functions for trees that induce a certain order on the data they store.

The following convenience macros allow you to declare and use simple tree structures.

#define CX_TREE_NODE_BASE(Type)

#define cx_tree_node_base_layout

The CX_TREE_NODE_BASE macro declares four pointers of type Type*: parent, children, last_child, prev, and next, which must be placed as first member into your struct.

You can use it for example like this:

typedef struct my_tree_node_s {
    CX_TREE_NODE_BASE(struct my_tree_node_s);
    int value;
    char *desc;
} MyTreeNode;

The macro cx_tree_node_base_layout expands to the offsets of the above-mentioned pointers. It will become handy when calling the low-level tree functions which expect all offsets to be passed as arguments.

Before diving into the function definitions, there are four function pointer declarations you should know.

#include <cx/tree.h>

typedef void *(*cx_tree_node_create_func)(
        const void *data, void *context);

typedef int (*cx_tree_search_data_func)(
        const void *node, const void *data);

typedef int (*cx_tree_search_func)(
        const void *node, const void *new_node);

typedef void (*cx_tree_relink_func)(
        void *node, const void *old_parent, const void *new_parent);

A cx_tree_node_create_func takes a pointer to the data the new node shall contain, and a context (which might be an allocator, for example). It shall allocate and return the created node, or NULL when node creation is not possible.

A cx_tree_search_data_func shall check if the node contains the data, or if a child node might contain the data. It shall return zero, when node contains the data. When a child node might contain the data, the returned positive number shall indicate how close the match is. It is not relevant if the data can actually be found in the subtree - the positive number also indicates that the data could be inserted in that subtree. Only when the entire subtree starting with node cannot contain the data, a negative number is supposed to be returned.

A cx_tree_search_func behaves exactly the same, except that it does expect a pointer to the data but a pointer to a node structure. In combination with the cx_tree_node_create_func, when inserting nodes with the high-level functions, a new_node is created first, and then an appropriate insertion point is searched with the cx_tree_search_func.

A cx_tree_relink_func is called when an intermediate node was removed from the tree and it's children need to be detached from the removed old_parent and attached to a new_parent. The function is called for every child of the removed node and can be used, for example, to update the contents of the node when needed.

Create

#include <cx/tree.h>

CxTree *cxTreeCreate(const CxAllocator *allocator,
        cx_tree_node_create_func create_func,
        cx_tree_search_func search_func,
        cx_tree_search_data_func search_data_func,
        ptrdiff_t loc_parent,
        ptrdiff_t loc_children, ptrdiff_t loc_last_child,
        ptrdiff_t loc_prev, ptrdiff_t loc_next);

CxTree *cxTreeCreateSimple(const CxAllocator *allocator,
        cx_tree_node_create_func create_func,
        cx_tree_search_func search_func,
        cx_tree_search_data_func search_data_func);

CxTree *cxTreeCreateWrapped(const CxAllocator *allocator,
        void *root,
        ptrdiff_t loc_parent,
        ptrdiff_t loc_children, ptrdiff_t loc_last_child,
        ptrdiff_t loc_prev, ptrdiff_t loc_next);

The function cxTreeCreate() creates a CxTree structure where each node created by the create_func has the layout specified by the offset arguments. The function cxTreeCreateSimple() is exactly the same, except that it assumes the cx_tree_node_base_layout.

In both cases the tree will be created empty, that is with no nodes, not even a root node. On the other hand, cxTreeCreateWrapped() creates a CxTree structure with the specified layout and root node where root may already be the root of a larger tree.

If your wrapped tree structure does not have a last_child or a prev pointer, you may specify a negative location to indicate the missing pointer. All other pointers are mandatory and a non-negative location is expected.

Note, that a wrapped tree by default has no create or search functions assigned. Therefore, if you wish to use one of the functions below that needs those function pointers set, you will have to set them manually by assigning to the respective fields in the CxTree structure.

Example for wrapping a libxml2 tree

In this example we wrap the XML tree of the commonly used libxml2 library.

#include <libxml/tree.h>
#include <cx/tree.h>

CxTree *wrap_xml_tree(xmlNode *root) {
    return cxTreeCreateWrapped(
        cxDefaultAllocator, // or you can just write NULL
        root,
        offsetof(xmlNode, parent),
        offsetof(xmlNode, children),
        offsetof(xmlNode, last),
        offsetof(xmlNode, prev),
        offsetof(xmlNode, next)
    );
}

You do not need to specify any function pointers or destructors in the CxTree structure, if you just want to use the resulting CxTree for iterating over the nodes.

You can, for example, print the XML structure with the following code:

// continued from above

void print_xml_structure(CxTree *tree) {
    // specify visit_on_exit argument = true
    // so that we are visiting each node twice: on enter and on exit
    CxTreeIterator iter = cxTreeIterate(tree, true);
    
    cx_foreach(xmlNode*, node, iter) {
        if (node->type == XML_ELEMENT_NODE) {
            // the exiting field from the iterator indicates
            // whether we are entring or leaving the subtree
            // of the current element 
            printf("%s - %s\n",
                node->name,
                iter.exiting ? "start" : "end"
            );
        }
    }
    cxTreeIteratorDispose(&iter);
}

Add Nodes

#include <cx/tree.h>

void cx_tree_link(void *parent, void *node, ptrdiff_t loc_parent,
        ptrdiff_t loc_children, ptrdiff_t loc_last_child,
        ptrdiff_t loc_prev, ptrdiff_t loc_next);

extern unsigned int cx_tree_add_look_around_depth;

size_t cx_tree_add_iter(struct cx_iterator_base_s *iter, size_t n,
        cx_tree_search_func sfunc, cx_tree_node_create_func cfunc,
        void *cdata, void **failed, void *root,
        ptrdiff_t loc_parent,
        ptrdiff_t loc_children, ptrdiff_t loc_last_child,
        ptrdiff_t loc_prev, ptrdiff_t loc_next);
        
size_t cx_tree_add_array(const void *src, size_t n, size_t elem_size,
        cx_tree_search_func sfunc, cx_tree_node_create_func cfunc,
        void *cdata, void **failed, void *root,
        ptrdiff_t loc_parent,
        ptrdiff_t loc_children, ptrdiff_t loc_last_child,
        ptrdiff_t loc_prev, ptrdiff_t loc_next);
        
int cx_tree_add(const void *src,
        cx_tree_search_func sfunc, cx_tree_node_create_func cfunc,
        void *cdata, void **cnode, void *root,
        ptrdiff_t loc_parent,
        ptrdiff_t loc_children, ptrdiff_t loc_last_child,
        ptrdiff_t loc_prev, ptrdiff_t loc_next);

The low-level function cx_tree_link() adds a node to a new parent, unlinking it from its previous parent, if required.

The functions cx_tree_add_array() and cx_tree_add_iter() are equivalent, except that the former adds n items of size elem_size from the src array, and the latter adds up to n items from the iterator iter. For each element, the cfun is called with the element as first argument and the cdata context as second argument. The cfun function is supposed to return a node for which then the sfunc is used to find the location where to insert the node. If a node could not be added, because sfunc always returns a negative number, it is stored at the location where failed points to.

The function cx_tree_add() is similar to the other add-functions, except that it does only add one single item to the tree and always stores the created node at the location where cnode points to.

The following high-level functions can be used to add nodes to a CxTree.

#include <cx/tree.h>

int cxTreeAddChild(CxTree *tree, void *parent, const void *data);

void cxTreeAddChildNode(CxTree *tree, void *parent, void *child);

void cxTreeSetParent(CxTree *tree, void *parent, void *child);

int cxTreeInsert(CxTree *tree, const void *data);

size_t cxTreeInsertIter(CxTree *tree, CxIteratorBase *iter, size_t n);

size_t cxTreeInsertArray(CxTree *tree, const void *data,
        size_t elem_size, size_t n);

The functions cxTreeAddChild(), cxTreeAddChildNode(), and cxTreeSetParent() are similar to cx_tree_link(). The difference between cxTreeAddChild() and cxTreeAddChildNode() is, that cxTreeAddChild() uses the cx_tree_node_create_func of the CxTree to create the node first, before adding it. The function returns non-zero in case the creation of the node fails. The difference between cxTreeSetParent() and cxTreeAddChildNode() is, that cxTreeSetParent() does not increase the tree size, when child already had a parent.

The functions cxTreeInsert(), cxTreeInsertIter(), and cxTreeInsertArray() behave like cx_tree_add(), cx_tree_add_iter(), and cx_tree_add_array(). As creation context cdata for the cx_tree_node_create_func a pointer to the CxTree is passed. Instead of returning a failed node, like the low-level functions, the high-level functions automatically free the memory for the failed node.

Size and Depth

#include <cx/tree.h>

size_t cxTreeSubtreeSize(CxTree *tree, void *subtree_root);

size_t cxTreeSubtreeDepth(CxTree *tree, void *subtree_root);

size_t cxTreeDepth(CxTree *tree);

The function cxTreeSubtreeSize() counts all nodes belonging to the subtree starting with subtree_root.

The function cxTreeSubtreeDepth() reports the maximum depth of the subtree starting with subtree_root. If the subtree_root does not have any children, this results in a depth of one.

The function cxTreeDepth() is equivalent to cxTreeSubtreeDepth() where subtree_root is the root node of the entire tree.

Search

#include <cx/tree.h>

#define CX_TREE_SEARCH_INFINITE_DEPTH

int cx_tree_search_data(const void *root, size_t max_depth,
        const void *data, cx_tree_search_data_func sfunc,
        void **result, ptrdiff_t loc_children, ptrdiff_t loc_next);

int cx_tree_search(const void *root, size_t max_depth,
        const void *node, cx_tree_search_func sfunc,
        void **result, ptrdiff_t loc_children, ptrdiff_t loc_next);
        
void *cxTreeFind(CxTree *tree, const void *data);

void *cxTreeFindInSubtree(CxTree *tree, const void *data,
        void *subtree_root, size_t max_depth);

The functions cx_tree_search_data() and cx_tree_search() are equivalent, except that cx_tree_search_data() compares a currently investigated node against the data with the specified cx_tree_search_data_func, and cx_tree_search() compares against a node with the specified cx_tree_search_func.

Usually you the latter is rarely needed. It is used, for example, by cx_tree_add() to find the correct position where to add a freshly created node.

The search proceeds as follows, starting with the root node:

The current node is compared with data/node using the sfunc.
If sfunc returns zero, a matching node has been found and the pointer to that node is stored at the location given by result.
If sfunc returns a negative number, the entire subtree is skipped.
If sfunc returns a positive number, the subtree is searched, unless the number is larger than the number of an already searched subtree. The best match will be stored at the location given by result.
The return value will be the return value of the sfunc for node that was stored in the result, or a negative number when no match was found

The function cxTreeFind() uses the search_data_func of the CxTree to find the data in the tree, and returns a pointer to the node when the data was found (or NULL, otherwise).

The function cxTreeFindInSubtree() is equivalent, except that it restricts the search to nodes in the subtree starting with (and including) subtree_root, and skipping all nodes below the max_depth. Note, that the max_depth is specified in relation to the subtree_root and not in relation to the entire tree.

Iterator and Visitor

#include <cx/tree.h>

CxTreeIterator cx_tree_iterator(void *root, bool visit_on_exit,
        ptrdiff_t loc_children, ptrdiff_t loc_next);

CxTreeIterator cxTreeIterate(CxTree *tree, bool visit_on_exit);

CxTreeIterator cxTreeIterateSubtree(CxTree *tree,
        void *node, bool visit_on_exit);

#define cxTreeIteratorContinue(iter)

void cxTreeIteratorDispose(CxTreeIterator *iter);

CxTreeVisitor cx_tree_visitor(void *root,
        ptrdiff_t loc_children, ptrdiff_t loc_next);

CxTreeVisitor cxTreeVisit(CxTree *tree);

CxTreeVisitor cxTreeVisitSubtree(CxTree *tree, void *node)

#define cxTreeVisitorContinue(visitor)

void cxTreeVisitorDispose(CxTreeVisitor *visitor);

There are two different kind of iterators for trees. The CxTreeIterator is performing a depth-first iteration with the capability of visiting a node twice: first when the iterator enters the node coming from its parent, and secondly when the iterator tracks back from its last child. This behavior is controlled via the visit_on_exit argument. When set to true, the iterators exiting flag can be checked during iteration to see whether the iterator is currently entering or leaving the node. The above example for iterating through an XML tree illustrates this.

On the other hand, the CxTreeVisitor performs a breadth-first iteration and visits every node only once.

Since tree iteration needs to keep track of a stack (depth-first) or queue (breadth-frist), internal memory is allocated. This memory is automatically disposed when the iteration completes. If you break from the iteration early, you must call cxTreeIteratorDispose() or cxTreeVisitorDispose(), respectively, to deallocate the memory.

The macros cxTreeIteratorContinue() and cxTreeVisitorContinue() equivalently instruct the iterator/visitor to skip the subtree below the currently inspected node.

Remove

#include <cx/tree.h>

void cx_tree_unlink(void *node, ptrdiff_t loc_parent,
        ptrdiff_t loc_children, ptrdiff_t loc_last_child,
        ptrdiff_t loc_prev, ptrdiff_t loc_next);

int cxTreeRemoveNode(CxTree *tree, void *node,
        cx_tree_relink_func relink_func);

void cxTreeRemoveSubtree(CxTree *tree, void *node);

The low-level function cx_tree_unlink() removes the specified node, as well as the entire subtree beneath that node, from its parent. The high-level counterpart is cxTreeRemoveSubtree().

The function cxTreeRemoveNode(), on the other hand, only removes the node from the tree, links all children of node to the former parent of node, and calls the optional relink_func for every former child of node which will be relinked to a new parent. Therefore, calling cxTreeRemoveNode() on the root node is an error and the function returns non-zero. In all other cases, the function returns zero.

Dispose

#include <cx/tree.h>

int cxTreeDestroyNode(CxTree *tree, void *node,
        cx_tree_relink_func relink_func);

void cxTreeDestroySubtree(CxTree *tree, void *node);

void cxTreeClear(CxTree *tree);

void cxTreeFree(CxTree *tree);

The function cxTreeDestroyNode() first removes the node from the tree, like cxTreeRemoveNode(), and then invokes the destructor functions. It is guaranteed, that the simple destructor is called before the advanced destructor. If no destructor function is registered at all, the behavior is identical to cxTreeRemoveNode(). That means, this function does not deallocate the memory for the node on its own and leaves that entirely to the destructor functions.

The function cxTreeDestroySubtree() performs the above actions for the entire subtree starting with node. The order in which the destructor functions for the nodes of the subtree are called are determined by a tree iterator.

The function cxTreeClear() is a shorthand for invoking cxTreeDestroySubtree() on the root node of the tree.

The function cxTreeFree() behaves like cxTreeClear() and then deallocates the memory for the CxTree structure.

Last modified: 06 April 2025