In my previous post about graph databases, we went through introductions to NoSQL and Neo4j, tried to understand what graph storage is, and saw how it makes our lives a little easier. We also had a sneak peek into the workings of SQL through the friends-of-friends scenario and looked at how Cypher, the query language for the Neo4j database, makes things simpler by easing the syntax, making your queries more readable and expressive at the same time.
Now, before we dive deep, let’s skim the surface for a little while, shall we? Learning the syntax of Cypher is pretty straightforward; applying it in real-life situations and tailoring it to your needs is the part that needs more attention.
You might be wondering whether you can just go through the Cypher documentation and gather all you need from there.
Well, yes and no. Yes, because frankly, the Cypher documentation is one of the most comprehensive and detailed sets of documentation you will come across. If you go through it chapter by chapter and example by example, it will most definitely lay down solid groundwork for you to become a Neo4j veteran in no time. And no, because some of the examples are disjoint (according to me, at least), i.e. if example A shows you how to construct a query using START, it will most probably not tell you how to use the built-in functions or how to create collections. You get the drift.
Now with all that in mind, let’s commence operation “Neo4j and Cypher” with a few definitions to help get you in the groove as soon as possible:
- Neo4j is a graph database management system and the subject of this post. It draws on the mathematics of graphs to store information as nodes and inter-node relationships, which is what makes fast information retrieval possible.
- Cypher: What SQL is to relational database management systems like MySQL, Microsoft SQL Server etc., Cypher is to Neo4j: the language in which all Neo4j queries are written.
Now you might also wonder: What exactly would I gain from learning about Neo4j and Cypher? To that, I say, “Excellent question!” If you want to learn the intricacies of how graph databases work and how they can help you in ways SQL never could (decreased retrieval time being one among many), and would like to know how that is achieved with the help of Cypher, then learning Neo4j might lead you to that goal.
In this post, we will look at how to install your own local instance of Neo4j, take a more detailed look at the syntax of Cypher, see how information retrieval and modification are achieved using it, and, as promised, introduce a scenario that will most probably make the learning process fun and seamless. So, here we go!
Installation and the Cypher syntax
As was touched upon in the previous article, Cypher lets you work with a database based on graph storage, i.e. one that manages information in the form of nodes and relationships. Consider the following example:
Ms. Alpha Abbott works in the same company (RandomGreekAlphabets Corp.) as Mr. Lambda Longbottom. Both are supervised by Ms. Gamma Greyback, who has been working in the company since 2002.
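To make the idea of nodes and relationships concrete, here is a minimal sketch of how this situation might be written in Cypher once Neo4j is installed. It is not the script used for this post: the labels (Person, Company), the property keys (name, since) and the relationship types (WORKS_IN, SUPERVISES) are all assumptions chosen purely for illustration.

```cypher
// Hypothetical labels, property keys and relationship types;
// chosen for illustration only.
CREATE (company:Company {name: 'RandomGreekAlphabets Corp.'}),
       (alpha:Person {name: 'Alpha Abbott'}),
       (lambda:Person {name: 'Lambda Longbottom'}),
       (gamma:Person {name: 'Gamma Greyback'}),
       // All three people work in the same company.
       (alpha)-[:WORKS_IN]->(company),
       (lambda)-[:WORKS_IN]->(company),
       // Gamma has been with the company since 2002.
       (gamma)-[:WORKS_IN {since: 2002}]->(company),
       // Gamma supervises both Alpha and Lambda.
       (gamma)-[:SUPERVISES]->(alpha),
       (gamma)-[:SUPERVISES]->(lambda);
```

Storing the year on the WORKS_IN relationship rather than on the person is one possible design choice: it keeps the fact attached to the employment itself instead of to Ms. Greyback in general.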
The next image is a correct representation of the aforementioned situation:
The screenshot above has been grabbed from the local Neo4j Webadmin’s Data browser view. How did I do this, you ask? Well, here’s the modus operandi:
- Go to http://neo4j.com/download/other-releases/ and select either a *.zip archive or an executable.
- Once downloaded, run the file as an Administrator, agree to the user license, select the destination directory and voila! Neo4j is installed on your system.
- Now, go to $destination_directory\Neo4j Community\bin and execute the file
- The Neo4j Community dialog box which opens shows the database location, which we are going to work with later on. This is where Neo4j stores all the nodes, the inter-node relationships, the indices etc.
- Click the Start button to fire up the Neo4j server, which listens on port 7474, and wait while it starts up. If a Windows Security Alert window appears, grant it the access you are comfortable with.
- Once that is done, you will see a green status strip informing you that Neo4j is ready.
- Click on the link provided: http://localhost:7474/. Check that the browser view looks like, or at least similar to, the screenshot displayed below. If yes, congratulations! If no, retrace your steps to see if something inadvertently got missed.
- Enter the password and click Connect.
- The browser is quite user-friendly. Play around with it for some time to get the hang of it. You might want to go through some tutorials and walk-throughs in the process, at whatever pace you find most comfortable. There’s a left-side panel where you can find most of the options, like system configuration, creating and accessing nodes in a jiffy, the REST API and information about styling your graphs.
- The textarea at the top is where all the queries, which we are going to go through in a while, go. Type or copy them in and you will see the magic unfurl before you. Bon voyage! Remember to stop the server by clicking the Stop button in the Neo4j Community dialog box once you are finished.
One last thing: the startup window won’t look the same the next time you start the Neo4j instance; it may look something like the following:
You might also want to check how it currently looks by visiting the Neo4j docs page.
This view is called the browser view. There is also a more technically oriented view of Neo4j, called the Webadmin view, which may be accessed in one of the following two ways:
- Open http://localhost:7474/webadmin.
- Click the information button, (i), on the left-side panel to reveal the Information Menu, and click the hyperlink “Webadmin” present at the bottom.
In the Webadmin view you will find many options which we can safely ignore until we’re administering an actual production database. This console displays information such as the number of nodes created to date and the memory being utilized, along with a graph which updates itself every 3 seconds to reflect the changes brought about by your data manipulation queries.
There are two main panels which we are going to discuss next, apart from the Dashboard, displayed above: the Data Browser panel and the Console panel.
The Data browser looks something like the following:
It is where you can get graph snapshots of your data, like the one shown earlier in this post. And the following image is of the Console panel, where you can write and execute queries, old-school style!
The Cypher script used to generate the above storage structure, with helpful comments, can be found right here: query0.cql.
You can copy it either into the browser view textarea which we discussed a while back or into the Console view above. Wait for the query to execute. Once that is done, run the following query to see the output:
MATCH (n) RETURN n;
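The query above returns every node in the database. As a small, hedged sketch of how the same pattern can be varied, the two queries below use no labels or property names, so they should not depend on what the creation script actually contains. The LIMIT value of 25 is an arbitrary choice. Run them one at a time.

```cypher
// Return relationships together with their start and end nodes,
// capped at 25 rows so the output stays readable.
MATCH (a)-[r]->(b)
RETURN a, r, b
LIMIT 25;

// Count the nodes instead of returning them.
MATCH (n)
RETURN count(n);
```

The first query is handy when you want to see how the nodes are connected, not just which nodes exist; the second is a quick sanity check that the creation script ran.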
Now, just as there are two ways to execute your queries (the Neo4j browser view and the Console view), there are two ways of viewing the results of the above query in graphical form: the Neo4j browser itself and the Data Browser.
- In case you execute the previous two queries on the browser, click the “Graph” button on the left hand side of the query output sub-panel, which should give you something like the following:
- In case you executed the query in the Webadmin Console view, head over to the Data Browser panel, type the query below in the textarea at the top, and click the search button. This reveals a graph display whose style can be customized to suit your needs.
START root=node(0) MATCH n RETURN n
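The START clause in the query above binds root to the node with internal id 0. As a sketch only, a lookup by internal id can also be written with MATCH and the id() function, which may be the more portable form given that START is a legacy clause. This assumes a node with internal id 0 actually exists in your database, and it returns only that node, not the whole graph.

```cypher
// Look up a single node by its internal id
// (assumes a node with id 0 exists).
MATCH (root)
WHERE id(root) = 0
RETURN root;
```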
Now that we have gone through the installation and a basic introduction to node and relationship creation, we are finally ready to look at how information retrieval, the area in which graphs are supposed to score immensely over other methods, takes place.