Simplify the bitemporal Neo4J query with a user-defined function
In the previous post we saw how to translate the logic for correctly describing historical events, introduced in the first post, into a Cypher query.
To remind you, it looks like this:
It is certainly a bit unwieldy, especially since you may want historical data not just from a single node but from several others, too. A good idea seems to be to let the query writer pick out the nodes that require consideration and then have a function that contains the logic for picking the correct ones. Let us do just that.
In Neo4J you can also write “stored procedures” and “user-defined functions” (UDF). Depending on your upbringing you may dislike the idea of putting “logic” into the database engine. However, there are two differences from stored procedures and functions as we know them from Oracle and SQL Server:
- We can write the Neo4J code to be run as a procedure / function in Java (or some other JVM language). Depending on how you write the other parts of your application, this may considerably help in translating your skills to the DB.
- Neo4J provides infrastructure that allows you to fire up an embedded instance and use it within your tests to actually test your procedures / functions close to real-life scenarios.
With that in mind you may feel that not much speaks against encapsulating some technical aspect of your DB model into a function.
The implementation of the UDF is pretty straightforward:
Of note is the `UserFunction` annotation, which did not exist until recent versions of Neo4J. Before, you had to define your user procedures as, well, procedures, making the call of your procedures from within Cypher a little bit more involved (`CALL...YIELD`).
Also remember that all numbers defined on nodes / relationships will appear as Longs.
Finally, there is a `sort` method on the `List` interface; however, the implementation that Neo4J passes to you appears to not implement it, hence the external sort via the `Stream` utilities.
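To illustrate the last two points without a running database, here is a minimal sketch in plain Java. Plain maps stand in for node properties, and the class name `SortSketch`, the helper `asLong` and the property key you pass in are made up for this example; they are not taken from the UDF itself.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class SortSketch {

    // Numeric properties arrive boxed; going through Number avoids a
    // ClassCastException if a value is not the boxed type you expected.
    static long asLong(Object value) {
        return ((Number) value).longValue();
    }

    // Returns a NEW list sorted from highest to lowest value of `key`.
    // The input list is never modified, so it does not matter whether
    // its own sort method is supported.
    static List<Map<String, Object>> newestFirst(List<Map<String, Object>> input, String key) {
        return input.stream()
            .sorted(Comparator
                .comparingLong((Map<String, Object> n) -> asLong(n.get(key)))
                .reversed())
            .collect(Collectors.toList());
    }
}
```

Because the stream collects into a fresh list, this also works when the incoming list is read-only, for example one wrapped with `Collections.unmodifiableList`.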
The function assumes that you send in data of a certain shape. It will then be able to sort the nodes correctly and dismiss those nodes that are “invisible” within the given set.
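What "sort and dismiss" could look like is sketched below in plain Java, again with maps standing in for nodes. Everything specific here is an assumption made for illustration: the property names `recorded` and `actual`, the class name `BitempSketch`, and the visibility rule itself, namely that a node is invisible when something recorded later takes effect at the same time or earlier.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class BitempSketch {

    // Hypothetical property names, chosen for this sketch only.
    static final String RECORDED = "recorded";
    static final String ACTUAL = "actual";

    static long num(Map<String, Object> node, String key) {
        return ((Number) node.get(key)).longValue();
    }

    // Assumed rule: walk from the most recently recorded entry to the oldest
    // and keep an entry only if it takes effect strictly earlier than every
    // entry that was recorded after it.
    static List<Map<String, Object>> project(List<Map<String, Object>> nodes) {
        List<Map<String, Object>> newestFirst = nodes.stream()
            .sorted(Comparator
                .comparingLong((Map<String, Object> n) -> num(n, RECORDED))
                .reversed())
            .collect(Collectors.toList());

        List<Map<String, Object>> visible = new ArrayList<>();
        long earliestActual = Long.MAX_VALUE;
        for (Map<String, Object> node : newestFirst) {
            long actual = num(node, ACTUAL);
            if (actual < earliestActual) {
                visible.add(node);       // not overridden by a later recording
                earliestActual = actual;
            }
        }
        return visible;
    }
}
```

Under this rule the result is ordered newest to oldest by both dates. For entries (recorded 1, actual 10), (recorded 2, actual 30) and (recorded 3, actual 20), the second one is dismissed because the entry recorded at 3 already takes effect at 20.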
By compiling our UDF and adding the resulting jar to Neo4J’s plugins folder, we can start using it:
As you can see we have a nice distribution of responsibilities: Query the correct set of data you are interested in, and use the function to project them correctly.
For completeness’ sake we can also provide a function that gives us the current state for some query:
This function uses the previously established UDF as well as a node labelled “Today” which should contain the numeric value of today (this can obviously be solved differently, typically by using something like Epoch in seconds).
Since the data already comes out newest to oldest from the `bitempProjection` method, we can just pick the first item where the actual date is smaller than today.
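The "pick the first item" step can be sketched in a few lines of plain Java. As before, this is an illustration under assumptions: maps stand in for nodes, the property name `actual` and the class name `CurrentStateSketch` are hypothetical, and today's value is passed in as a parameter instead of being read from a “Today” node.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

class CurrentStateSketch {

    // Expects `projected` to be ordered newest to oldest already.
    // Returns the first entry whose actual date is smaller than today,
    // or an empty Optional if every entry lies in the future.
    static Optional<Map<String, Object>> current(List<Map<String, Object>> projected, long today) {
        return projected.stream()
            .filter(n -> ((Number) n.get("actual")).longValue() < today)
            .findFirst();
    }
}
```

The comparison is strict to match the wording above; if an entry that takes effect today should already count as current, `<=` would be the variant to use.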
With that we can get the current state of our person: