Graph Databases 101

James Rowlands / @jrowlands

What are we going to cover?

  • What is a Graph?
  • Why use a Graph DB?
  • Graph vs RDBMS
  • Using a Graph DB
  • Demo Time!

In mathematics, and more specifically in graph theory, a graph is a representation of a set of objects where some pairs of objects are connected by links.

http://en.wikipedia.org/wiki/Graph_%28mathematics%29
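To make the definition concrete, here is a minimal sketch of a graph as a set of objects plus links between some pairs. The people and the `neighbours` helper are made up for illustration.

```python
# A graph: a set of objects (vertices) and links (edges) between some pairs.
# All names here are illustrative, not taken from the talk's demo data.
vertices = {"James", "Richard", "Sarah", "Tom"}
edges = {("James", "Richard"), ("Richard", "Sarah")}


def neighbours(vertex, edges):
    """Return every vertex linked to `vertex`, treating links as undirected."""
    linked = set()
    for a, b in edges:
        if a == vertex:
            linked.add(b)
        elif b == vertex:
            linked.add(a)
    return linked


print(sorted(neighbours("Richard", edges)))  # ['James', 'Sarah']
print(neighbours("Tom", edges))              # set() because Tom has no links
```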

Property Graphs

  • Graph made up of nodes and relationships
  • Relationships have a name
  • Both can have properties
  • Nodes can have labels
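The four points above can be sketched as two small data classes. This is only an illustration of the property graph model, not how any particular database stores it; the class names and the `since` property are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set = field(default_factory=set)          # nodes can have labels
    properties: dict = field(default_factory=dict)    # nodes can have properties


@dataclass
class Relationship:
    name: str                                         # relationships have a name
    start: Node
    end: Node
    properties: dict = field(default_factory=dict)    # and properties of their own


james = Node(labels={"Person"}, properties={"name": "James"})
richard = Node(labels={"Person"}, properties={"name": "Richard"})
# "since" is a made-up property, used only to show a property on a relationship.
friendship = Relationship("FRIENDS_WITH", james, richard, {"since": 2010})

print(friendship.start.properties["name"], friendship.name,
      friendship.end.properties["name"])
```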

What is a graph DB?

Optimized to store and traverse graphs

Stores elements with direct pointers to adjacent elements

Avoids index lookups
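A simplified picture of "direct pointers to adjacent elements": each element holds references to its neighbours, so a traversal follows references instead of searching a table of pairs for every hop. `LinkedNode`, `reachable` and `reachable_via_table` are hypothetical names, and real storage engines are considerably more involved.

```python
class LinkedNode:
    """An element that stores direct references to its adjacent elements."""

    def __init__(self, name):
        self.name = name
        self.outgoing = []

    def link_to(self, other):
        self.outgoing.append(other)


a, b, c = LinkedNode("a"), LinkedNode("b"), LinkedNode("c")
a.link_to(b)
b.link_to(c)


def reachable(start, hops):
    """Names of nodes exactly `hops` steps away, following references only."""
    frontier = [start]
    for _ in range(hops):
        frontier = [nxt for node in frontier for nxt in node.outgoing]
    return [node.name for node in frontier]


# For contrast: the same links kept as rows in a table of pairs.
# Every hop has to search the whole table (or an index over it).
link_table = [("a", "b"), ("b", "c")]


def reachable_via_table(start_name, hops):
    frontier = [start_name]
    for _ in range(hops):
        frontier = [dst for name in frontier
                    for src, dst in link_table if src == name]
    return frontier


print(reachable(a, 2))              # ['c']
print(reachable_via_table("a", 2))  # ['c']
```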

What are they good for?

  • Friends of Friends
  • Recommendations
  • People affected by outage
  • Route finding
  • Fraud Detection
  • Forcing Prime Ministers to Resign

Graph DB vs the Others

Key Value Store   Document Store             RDBMS                             Graph DB
Redis             MongoDB                    SQL Server                        Neo4j
Caching           Store/Retrieve Documents   Calculations e.g. Average Sales   Linking multiple entity types

Why should you use it?

  • Speed
  • Ease of use
  • ACID compliant
  • Additive Modelling
  • Did I mention speed?

Graph Databases

Source: http://db-engines.com/en/ranking/graph+dbms

Talking Graphically

  • Cypher
  • SPARQL (W3C)
  • Gremlin

Introducing Neo4j

Features

  • Dual Licensing
  • Easy to install and develop
  • Schema optional
  • Stable
  • Enterprise Features:
      ACID Compliant
      Transactions
      High Availability
      Indexes on node properties

How do I get it?

Download from www.neo4j.org

How do I talk to Neo4j?

  • REST API
  • Console
  • Language Libraries
  • CSV Import

Let's Get Technical

Create Node

Cypher:

CREATE (n:Person {name:'James'}) RETURN n;

REST API:

POST http://localhost:7474/db/data/node
{ "name" : "James" }
POST http://localhost:7474/db/data/node/{node-id}/labels 
"Person"

Create Relationship

Cypher:

MATCH (n1:Person), (n2:Person)
WHERE n1.name = 'James' AND n2.name = 'Richard'
CREATE (n1)-[:FRIENDS_WITH]->(n2);

REST API:

POST http://localhost:7474/db/data/node/{from-node-id}/relationships
{
  "to" : "http://localhost:7474/db/data/node/{to-node-id}",
  "type" : "FRIENDS_WITH"
}
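As a rough sketch, the URL and JSON body above could be assembled like this. `relationship_request` is a hypothetical helper, the node ids are placeholders, and nothing is sent over the network.

```python
import json


def relationship_request(base_url, from_id, to_id, rel_type):
    """Build the URL and JSON body for the relationship call shown above."""
    url = f"{base_url}/node/{from_id}/relationships"
    body = json.dumps({
        "to": f"{base_url}/node/{to_id}",
        "type": rel_type,
    })
    return url, body


# Node ids 1 and 2 are placeholders for whatever ids the server returned.
url, body = relationship_request(
    "http://localhost:7474/db/data", 1, 2, "FRIENDS_WITH")
print(url)
print(body)
```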

Retrieve Data

Cypher:

MATCH (n:Person {name:'James'})-[r:FRIENDS_WITH*0..2]->(f) RETURN n,r,f;

REST API:

POST http://localhost:7474/db/data/cypher
{
  "query" : "MATCH (n:Person {name: {name}})-[r:FRIENDS_WITH*0..2]->(f) RETURN n,r,f",
  "params" : {
    "name" : "James"
  }
}
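What does `*0..2` mean? The pattern follows between zero and two FRIENDS_WITH hops, so with a lower bound of 0 James himself is matched as well. A minimal in-memory sketch of that idea, using made-up friendships; note that Cypher returns one row per matching path, whereas this helper only collects the distinct people reached.

```python
from collections import deque

# Hypothetical directed FRIENDS_WITH links: person -> people they point at.
friends_with = {
    "James": ["Richard"],
    "Richard": ["Sarah"],
    "Sarah": ["Tom"],
    "Tom": [],
}


def within_hops(graph, start, min_hops, max_hops):
    """Breadth-first walk returning nodes reached in min_hops..max_hops steps."""
    found = []
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth >= min_hops:
            found.append(node)
        if depth < max_hops:
            for friend in graph[node]:
                if friend not in seen:
                    seen.add(friend)
                    queue.append((friend, depth + 1))
    return found


print(within_hops(friends_with, "James", 0, 2))  # ['James', 'Richard', 'Sarah']
print(within_hops(friends_with, "James", 1, 2))  # ['Richard', 'Sarah']
```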

Other Cypher Features

  • Union
  • Aggregation
  • Order By
  • Limit
  • Explain
  • Map Reduce

Finding Somewhere To Eat

Places I have rated

MATCH (james:Person)-[my:RATED]->(restaurant:Restaurant)
WHERE james.name = 'James'
RETURN james.name, my.rating, restaurant.name;

Other people who have rated same place

MATCH (james:Person {name:'James'})-[my:RATED]->
    (restaurant:Restaurant)<-[their:RATED]-(person:Person)
RETURN person.name, their.rating, restaurant.name 
ORDER BY person.name;

Similar Ratings to me

MATCH (james:Person {name:'James'})-[my:RATED]->
    (restaurant:Restaurant)<-[their:RATED]-(person:Person)
WHERE their.rating >= (my.rating - 1) 
    AND their.rating <= (my.rating + 1) 
WITH COUNT(person) as number_ratings, person 
WHERE number_ratings > 1 
RETURN person.name, number_ratings;
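The same logic in plain Python may make the query easier to follow: count, per person, the shared restaurants where their rating is within one point of mine, and keep people with more than one such match. The ratings and restaurant names are invented, and `similar_raters` is a hypothetical helper.

```python
from collections import Counter

# Invented data: person -> {restaurant: rating}
ratings = {
    "James":   {"Bistro": 5, "Diner": 4, "Grill": 2},
    "Richard": {"Bistro": 4, "Diner": 5, "Noodle Bar": 5},
    "Sarah":   {"Bistro": 2, "Grill": 3, "Noodle Bar": 4},
    "Tom":     {"Diner": 4},
}


def similar_raters(ratings, me, tolerance=1, min_shared=2):
    """People with at least `min_shared` ratings within `tolerance` of mine."""
    mine = ratings[me]
    counts = Counter()
    for person, theirs in ratings.items():
        if person == me:
            continue
        for restaurant, their_rating in theirs.items():
            if (restaurant in mine
                    and abs(their_rating - mine[restaurant]) <= tolerance):
                counts[person] += 1
    # number_ratings > 1 in the query corresponds to min_shared=2 here
    return {person: n for person, n in counts.items() if n >= min_shared}


print(similar_raters(ratings, "James"))  # {'Richard': 2}
```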

Their ratings of restaurants I haven't rated

MATCH (james:Person {name:'James'})-[my:RATED]->
    (restaurant:Restaurant)<-[their:RATED]-(person:Person)
WHERE their.rating >= (my.rating - 1)
    AND their.rating <= (my.rating + 1)
WITH COUNT(person) as number_ratings, person, james
MATCH (person)-[rates:RATED]-(newRestaurant)
WHERE number_ratings > 1 AND NOT (james)-[:RATED]->(newRestaurant)
RETURN person, newRestaurant, rates.rating;

Which restaurant should we go to?

MATCH (james:Person {name:'James'})-[my:RATED]->
    (restaurant:Restaurant)<-[their:RATED]-(person:Person)
WHERE their.rating >= (my.rating - 1)
    AND their.rating <= (my.rating + 1)
WITH COUNT(person) as number_ratings, person, james
MATCH (person)-[new_rating:RATED]-(newRestaurant)
WHERE number_ratings > 1 AND NOT (james)-[:RATED]->(newRestaurant)
WITH AVG(new_rating.rating) as avg_rating,
    COUNT(new_rating) as c, newRestaurant
WHERE c > 1
RETURN c, avg_rating, newRestaurant.name
ORDER BY avg_rating DESC;
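A rough Python equivalent of the recommendation steps: find people who rate like me, gather their ratings of places I have not rated, then rank places with more than one such rating by average. All data is invented and `recommend` is a hypothetical helper, not part of the demo.

```python
# Invented data: person -> {restaurant: rating}
ratings = {
    "James":   {"Bistro": 5, "Diner": 4},
    "Richard": {"Bistro": 4, "Diner": 5, "Noodle Bar": 5, "Grill": 2},
    "Sarah":   {"Bistro": 5, "Diner": 3, "Noodle Bar": 4, "Grill": 3},
    "Tom":     {"Diner": 4, "Noodle Bar": 1},
}


def recommend(ratings, me, tolerance=1, min_shared=2, min_votes=2):
    """Rank restaurants I have not rated by the average rating of similar people."""
    mine = ratings[me]

    def shared_close(theirs):
        # How many of their ratings are within `tolerance` of mine?
        return sum(1 for place, value in theirs.items()
                   if place in mine and abs(value - mine[place]) <= tolerance)

    similar = [person for person, theirs in ratings.items()
               if person != me and shared_close(theirs) >= min_shared]

    votes = {}
    for person in similar:
        for place, value in ratings[person].items():
            if place not in mine:              # NOT (james)-[:RATED]->(place)
                votes.setdefault(place, []).append(value)

    ranked = [(sum(v) / len(v), len(v), place)
              for place, v in votes.items() if len(v) >= min_votes]
    return sorted(ranked, reverse=True)        # highest average first


for avg_rating, count, place in recommend(ratings, "James"):
    print(place, avg_rating, count)
```

Tom is left out because only one of his ratings is close to James's, so his low score for the Noodle Bar does not affect the average.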

Need Directions?

MATCH p=(james:Person {name:'James'})-[my:RATED]->(restaurant)
    <-[their:RATED]-(person),
    (start_location:Station {name:'Liverpool St'})
WHERE their.rating >= (my.rating - 1)
    AND their.rating <= (my.rating + 1)
WITH COUNT(person) as number_ratings, person, james, start_location
MATCH (person)-[new_rating:RATED]-(newRestaurant)
WHERE number_ratings > 1 AND NOT (james)-[:RATED]->(newRestaurant)
WITH AVG(new_rating.rating) as avg_rating,
    COUNT(new_rating) as c, newRestaurant, start_location
MATCH route=(start_location)-[tube:ROUTE*0..6]->
    (location)<-[:LOCATED_AT]-(newRestaurant)
WHERE c > 1
RETURN avg_rating, newRestaurant.name,
    EXTRACT(rd in nodes(route) | rd.name) as stations,
    EXTRACT(rd in tube | rd.Line) as Lines,
    REDUCE(distance = 0, rd in tube | distance + rd.Distance) as distance
ORDER BY distance ASC;
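EXTRACT and REDUCE behave much like a list comprehension and a fold. A small sketch of the three expressions in the RETURN clause, over a made-up two-hop route; the station names after Liverpool St, the line names and the distances are placeholders.

```python
from functools import reduce

# A made-up route: each hop stands for one ROUTE relationship.
route = [
    {"start": "Liverpool St", "end": "Station B", "Line": "Line 1", "Distance": 2},
    {"start": "Station B", "end": "Station C", "Line": "Line 2", "Distance": 3},
]

# EXTRACT(rd in nodes(route) | rd.name): the name of every node on the path
stations = [route[0]["start"]] + [hop["end"] for hop in route]

# EXTRACT(rd in tube | rd.Line): the Line property of every relationship
lines = [hop["Line"] for hop in route]

# REDUCE(distance = 0, rd in tube | distance + rd.Distance): a running total
distance = reduce(lambda total, hop: total + hop["Distance"], route, 0)

print(stations)  # ['Liverpool St', 'Station B', 'Station C']
print(lines)     # ['Line 1', 'Line 2']
print(distance)  # 5
```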

Summary

  • Graphs are made up of nodes and relationships (paths)
  • Graph DB is optimised for traversing graphs
  • Speed and Simplicity over RDBMS
  • Use when relationships are important

Thank You

Resources

http://neo4j.org - free book

GraphGist

Neo4j London Meetup Group

https://neo4j.com/blog/analyzing-panama-papers-neo4j/

Tweet Me: @jrowlands

Slides: http://kiwiwebdeveloper.com/talks/

 

Image Sources

Line Chart: http://en.wikipedia.org/wiki/Graph_%28mathematics%29
Red Cross: https://pixabay.com/en/delete-remove-cross-red-cancel-156119/
Panama City: https://upload.wikimedia.org/wikipedia/commons/2/26/Panama_City-3.jpg
Euler: https://commons.wikimedia.org/wiki/File:Leonhard_Euler.jpg
Königsberg: https://commons.wikimedia.org/wiki/File:Konigsberg_bridges.png