The Family Tree Software Series

10 most used functions in Gremlin #3

An article explaining the meaning of the most used commands

gksriharsha

--

Building blocks to derive meaning

This article is the third in a series that explores building software on a graph database. The previous articles discussed the motivation and design aspects of building a family tree. In this article, I will go over the most used commands in Gremlin.

Gremlin is a graph traversal language and virtual machine developed by Apache TinkerPop of the Apache Software Foundation.

As an analogy, Gremlin is to a graph database what SQL is to a relational database. In other words, it is a query language that we can use to explore and retrieve data from the database. This is the OLTP (Online Transaction Processing) mode of operation. Gremlin also has an OLAP (Online Analytical Processing) mode, in which the same traversals are executed by an analytics engine such as Apache Spark. The advantage is that the same language can be reused without learning a separate one for analytics. This mode is out of scope for this application, as we are not using any analytics engine.

Gremlin traversal commands

Gremlin's traversal language has a structure to it. A query is only executed when it ends with a terminal function such as next(), hasNext(), or iterate(). Along with these, there are other functions such as addV(), repeat(), etc. that are frequently used inside a query. In this article, I will explore some of the basic functions that I have used throughout the application development. The official documentation for this can be found here.

next(), hasNext(), iterate()

next() is a function used at the end of a query to retrieve the “next” node in the result. If the query has only one result, calling next() retrieves it. If the query has multiple results, calling next() returns the first of them. iterate() also executes the query, but it does not return anything, so it is meant for queries that are run only for their effect, such as adding or dropping data. To get all the nodes from the result at once, use toList() instead.

If next() is called when the result of the query is empty, an error is thrown (StopIteration when the query is sent from Python). To prevent this, call hasNext() first: if it returns true, next() can safely be applied to the same query.

Note: The Gremlin shell iterates the results automatically, so a query submitted without next() still gets executed as expected. Everywhere else, for example when queries are sent from a language such as Python, the query has to end with one of these functions. It is therefore good practice to get used to writing queries with next().
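
To make the behaviour of these functions concrete, here is a minimal sketch in plain Python. It does not talk to a Gremlin server: the ToyTraversal class, its method names, and the sample names are hypothetical stand-ins that only mimic how hasNext(), next(), toList(), and iterate() behave when called from Python, as far as I understand them.

```python
class ToyTraversal:
    """Hypothetical stand-in for a query result; not part of gremlinpython."""

    def __init__(self, results):
        self._results = iter(results)
        self._peeked = []  # holds at most one result fetched by has_next()

    def has_next(self):
        # Mimics hasNext(): True if at least one more result is available.
        if not self._peeked:
            try:
                self._peeked.append(next(self._results))
            except StopIteration:
                return False
        return True

    def next(self):
        # Mimics next(): returns one result, raises StopIteration when empty.
        if self._peeked:
            return self._peeked.pop()
        return next(self._results)

    def to_list(self):
        # Mimics toList(): returns every remaining result.
        remaining = self._peeked + list(self._results)
        self._peeked = []
        return remaining

    def iterate(self):
        # Mimics iterate(): runs through all results and discards them.
        self.to_list()
        return None


people = ToyTraversal(["Asha", "Ravi", "Meena"])
first = people.next()        # only the first result
rest = people.to_list()      # everything that is left

empty = ToyTraversal([])
# Guarding next() with has_next() avoids StopIteration on an empty result.
safe = empty.next() if empty.has_next() else None
```

The guard on the last line is the pattern worth remembering: check first, then fetch.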

addV(), addE(), drop()

To add vertices to the graph, we use the addV() function. Similarly, addE() adds a directed edge from one vertex to another. When the direction does not matter, the both() traversal function follows edges in either direction. The drop() function is used to delete edges or vertices.

In the family tree software, I have used these commands to create people (vertices) and the relations between them (edges). The people in the family tree may pass away, but the records of their existence are still kept in the database, so our software does not use the drop() command.
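
As a rough illustration of what these commands do to the data, the sketch below models a tiny graph in plain Python. ToyGraph, its methods, the "parent_of" edge label, and the sample names are all hypothetical and chosen for this example; this is not the gremlinpython API, only a model of adding vertices, adding directed edges, ignoring direction, and dropping.

```python
from itertools import count


class ToyGraph:
    """Hypothetical in-memory graph; only models addV(), addE(), both(), drop()."""

    def __init__(self):
        self._ids = count(1)
        self.vertices = {}  # vertex id -> {"label": ..., "name": ...}
        self.edges = []     # (from_id, edge_label, to_id)

    def add_v(self, label, name):
        # Like addV(): create a vertex and return its id.
        vertex_id = next(self._ids)
        self.vertices[vertex_id] = {"label": label, "name": name}
        return vertex_id

    def add_e(self, label, from_id, to_id):
        # Like addE(): a directed edge from one vertex to another.
        self.edges.append((from_id, label, to_id))

    def out(self, vertex_id):
        # Follow edges only in their stored direction.
        return [t for f, _, t in self.edges if f == vertex_id]

    def both(self, vertex_id):
        # Like both(): follow edges regardless of direction.
        incoming = [f for f, _, t in self.edges if t == vertex_id]
        return self.out(vertex_id) + incoming

    def drop_v(self, vertex_id):
        # Like drop() on a vertex: the vertex and its edges are removed.
        del self.vertices[vertex_id]
        self.edges = [e for e in self.edges if vertex_id not in (e[0], e[2])]


graph = ToyGraph()
parent = graph.add_v("person", "Lakshmi")
child = graph.add_v("person", "Kiran")
graph.add_e("parent_of", parent, child)

from_child_directed = graph.out(child)   # nothing: the edge points at the child
from_child_either = graph.both(child)    # reaches the parent
```

Starting from the child, the directed lookup finds nothing while the direction-agnostic one reaches the parent, which is why ignoring direction matters when walking a family tree.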

repeat(), until(), has(), dedup()

The repeat() function is analogous to a while loop, and the until() function is analogous to the condition that stops the loop. The commands passed as parameters to repeat() are repeated until the condition given to until() is met.

The has() function compares a vertex property to a value. It acts as a filter: a vertex continues through the query only if the comparison holds. Placed inside until(), it becomes the condition that stops the repeat() loop.

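dedup() is a function that removes duplicates from the results of a traversal. Used inside repeat(), it prevents vertices that were already visited from being explored again, which is how cyclic paths between the source node and the destination node are avoided. Without it, the traverser can get stuck going around a cycle and the traversal may never complete.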

In the family tree software, these functions are used for traversing from one person to another. If I want to know how a person is related to me, I can write a query using all these commands to get the appropriate traversal.
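
The way these four functions work together can be sketched in plain Python as a breadth-first walk. This is only a loose model of a query shaped like repeat(both().dedup()).until(has('name', ...)), not the query the application actually runs; find_relation_path, the edge list, and the names are hypothetical.

```python
from collections import deque


def find_relation_path(edges, names, start, target_name):
    """Walk outward from `start` until a person named `target_name` is found."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, []).append(b)
        neighbours.setdefault(b, []).append(a)  # both(): ignore direction

    seen = {start}            # dedup(): never expand the same person twice
    queue = deque([[start]])  # each entry is the path walked so far
    while queue:              # repeat(): keep stepping outward
        path = queue.popleft()
        if names[path[-1]] == target_name:  # until(has('name', target_name))
            return [names[v] for v in path]
        for nxt in neighbours.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # every reachable person was visited without a match


names = {1: "Kiran", 2: "Lakshmi", 3: "Mohan", 4: "Arjun"}
# Edges 1-2, 1-3 and 2-3 form a cycle, so the walk could circle without `seen`.
edges = [(2, 1), (3, 1), (2, 3), (3, 4)]

relation = find_relation_path(edges, names, 1, "Arjun")
no_relation = find_relation_path(edges, names, 1, "Nobody")
```

Because of the seen set, the search for a name that does not exist still terminates even though the sample data contains a cycle. That is the role dedup() plays in the traversal.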

Summary

In this article, I have shared the top 10 Gremlin commands that I have used during the development of this project. If you find this article informative, please clap for the article. In the next article, I will be sharing some details on graph database design.
