FreshBI

View Original

DAX-JUNGLE: PATH

It’s a jungle out there

Back in the day- when I was stuck on a DAX problem, I used to toggle through the IntelliSense in PowerBI one letter at a time. I’ve learned much since then and in this blog I’d like to share my experience with using PATH in Dax.

A:

  • ABS

  • ACOS

  • ACOSH

B:

  • BETA.DIST

  • BETA.INV

  • BLANK

Etc….

Hours wasted..

Mistakes were made

A MUCH better use of my time would have been reviewing quality solutions to real world problems.

So that’s what we’ll do. Real problems, real DAX.


The Basics: PATH

The PATH function is unique in DAX, it solves a simple problem very easily. Lets have a look at the example in Microsoft’s Documentation:

Example from docs.microsoft.com

We can see from just two columns the PATH function gets the traversal through employee "parents”. You can think of parent employees as managers, for example. How can we visualize what is going on here?

Enter: MATHEMATICS

My background is in Math, and I have a really simple tool to help understand what is happening in the example above. In Mathematics “Graph Theory” is a tool used to understand graphs… Specifically, relationship-graphs, or collections of nodes and their connections to each other.

We can construct a graph out of the example above. We will create a “node (dot)” for each employee, and a "relationship (arrow)” from a node’s parent to its “child”. Using this process yields the following:

Visualizing a “parent-child” relationship table in this way helps us understand what is going on. I’ve highlighted the the route from 112 to 11. Compare this to the Path value “112|14|3|11” for EmployeeKey 11 in the table from before. The path function returns the unique path from the root parent to the node evaluated against.

What did we just learn?

We found a simple way to visualize the PATH function on a parent child relationship table utilizing Graph Theory, but we have one major restriction:

Each node can only appear once in the “node” column. Each node can have multiple children, but not multiple parents.

We can apply the PATH function to graphs, but currently, only Hierarchical Tree Graphs:

Can we push it further? Workflow Diagrams (Almost):

As a math guy, I wanted to see if we can push this comparison further. Are there other types of graphs we can use which provide expanded use cases?

Have a look at the workflow below:

This is a very basic example of a log in workflow. AND it is a graph! Each node is a step in the process, and each relationship shows potential next steps. What are some properties we can observe.

  • This is not a tree: There are two “enclosed” spaces in this workflow.

  • Some nodes have multiple parents: “Profile” can be reached from “Log In” and “Register”.

  • This has more applications than just hierarchical trees!

  • This has no loops. Workflow diagrams can have loops but this one doesn’t. We are going to steer clear of those for today.

Punchline: Can we use PATH to find ALL paths to ALL nodes in a workflow diagram without loops (Directed Acyclic Graph, in math terms)? YES!


Example:

Now that we have seen a brief overview, we are going to walk through an example of the logical process needed to decompose a Directed Acyclic Graph (DAG) into something that PATH can be used on. Let’s put our problem solving hats on!


Lets look at the following example:

What problems do we have? Recall the PATH function has a MAJOR requisite:

Each node must be unique in the “Node” column.

How can we remedy this? Let’s try and see if we can transform our graph into something that has our desired property, but satisfies the requirement above.

Lets “split” our duplicate nodes into copies for each parent. And enumerate them (give them a happy little digit, to make them feel special and unique). Have a look at how this looks in our graph:

Can we programmatically apply this same update to our matrix? Well, of course we can!

First, enumerate the Nodes in our table to make each unique. This is easy!

Next, for each parent, we must update their nodes. This is a bit more tricky.

Fore each parent node, we have to join the new, enumerated node(s). As shown below:

UHOH! b is a parent, and it was split into multiple nodes. Have another look at our updated graph. d2 does indeed have two parents. So we must duplicate that row, once for each of d2’s parents. Viola! We’ve done it?…

Our duplication of d2 has created duplicates of d2! What do we do now? We still need unique nodes in the Node Column.

Turns out we are in the same place as before, just different, so… recursion! We repeat the above process as many times as needed to result in unique nodes! Have a look at the end result. I wont go through a formal proof that this is equivalent, but take a moment and try to convince yourself that unique paths are conserved throughout this process.

We now have unique values in each of our nodes. And take a closer look… Look familiar? Let me rearrange.

In trying to solve the problem of getting unique nodes into our “Node” column, we’ve build an algorithm that “unpacks” a DAG into a tree with all possible paths to nodes represented as paths to the new, enumerated nodes!

Power BI:

This hasn’t been much about Power BI yet… Lets change that.

First of all, how do we enter data to get a matrix?

  • Step 1: Enter your graph’s Node - Parent relationship table in Power Query.

  • Step 2: Run this Power Query function below…

……

The contents of this is definitely outside the scope of this blog, so here it is!

This recursively applies the algorithm we discovered earlier in this blog. Feel free to contact me if you have any questions.

Let’s see this in Power BI, shall we! We’ll start with a new Directed Acyclic Graph node - parent relation ship matrix.

We have one key observation to make. Notice, that a and b are their own parents. We must have a and b present in the node list, despite them not having any parents. This is a very easy workaround.

Next, apply the “Unwrapper” Power Query function above.

Cool! Our function had to go through 4 recursions to decompose our DAG.

There is one more step we’ve got to do before we head to the report canvas. We’ll add two columns that get the “BaseNodes” for parents and Nodes. This simply extracts the node “name” without the enumerations.

One more step! We’ll make a new calculated column in our Unwrapped DAG table.

PATH = PATH('Unwrapped DAG'[Node],'Unwrapped DAG'[Parent])

Now we can place these into some visuals.

This isn’t quite ideal, we can wrap the PATH function in a few substitutes.

See this content in the original post

Very cool!


Summary

Functions like PATH seem niche, but they can have more applications than meets the eye. I had fun exploring Directed Acyclic Graphs in Power BI, and I hope this exploration opened your eyes to some of the possibilities hidden in these obscure functions.

There are many ways we find applications of this even further. By having a “cost” associated with each node, we can apply this same algorithm to show the “cost” of each path by adding up the cost of crossing each node in a path.

There is one final observation I want to make on this journey. The PATH function didn’t solve a problem on it’s own, but rather inspired a new and more in-depth problem that we could grasp. We solved this problem exclusively with logic (math), with the goal of using the PATH function at the end.

Keep looking for problems that push Power BI to places that don’t seem likely. It’s worth the journey!

Let’s Chat

At FreshBI, we’ve lowered the barriers, risk and cost of developing world-class Power BI dashboards so that you can unlock the value in your data.

Contact us through the scheduling app to start a conversation about how our data visualization consultants can design your best Power BI dashboards today.


Our Latest Blogs

See this gallery in the original post

About FreshBI

Operating throughout USA and Canada, FreshBI is a boutique Microsoft Partner using Power BI to unlock the value trapped in your data so that your business can succeed.

See this social icon list in the original post