Running Neo4j in Docker with the Graph Data Science library

The Neo4j Graph Data Science (GDS) library is a graph-native machine learning extension for the Neo4j graph database. By implementing a range of graph algorithms, this plugin makes it possible to harness the predictive power of relationships and network structures that exist in graph data.

GDS ships implementations of community detection, centrality, similarity, link prediction, path finding, and node embedding algorithms. These algorithms enable reasoning on graphs and play an essential role in many advanced analytics workflows.

Running Neo4j in Docker

The official Neo4j image available on Docker Hub provides a standard, ready-to-run configuration of Neo4j. This image can be used to start a Neo4j instance using the following command:

docker run \
  -p 7474:7474 \
  -p 7687:7687 \
  -v <host data directory>:/data \
  neo4j

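The above command publishes port 7474 (HTTP, used by Neo4j Browser) and port 7687 (Bolt, used by drivers and cypher-shell), and bind-mounts <host data directory> to the /data directory inside the container so that the database files are persisted outside the container.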
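
As a minimal sketch of the same command with the placeholder filled in, the snippet below assumes a hypothetical host directory ($HOME/neo4j/data) and a hypothetical container name (neo4j-docker); neither is required by the image. The command is wrapped in a function so that nothing starts until the function is called.

```shell
#!/usr/bin/env bash
# Hypothetical host location for the database files; adjust as needed.
NEO4J_DATA_DIR="${NEO4J_DATA_DIR:-$HOME/neo4j/data}"
# Hypothetical container name, chosen here only for illustration.
NEO4J_CONTAINER="${NEO4J_CONTAINER:-neo4j-docker}"

start_neo4j() {
  # Create the host directory up front rather than leaving it to Docker.
  mkdir -p "$NEO4J_DATA_DIR"
  # Same ports and /data mount as the command above, plus a container name.
  docker run \
    --name "$NEO4J_CONTAINER" \
    -p 7474:7474 \
    -p 7687:7687 \
    -v "$NEO4J_DATA_DIR":/data \
    neo4j
}
```

Calling start_neo4j runs the container in the foreground; stopping and restarting it with the same data directory should bring back the same database files.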

Enabling the Graph Data Science library in the Neo4j Docker container

Technically speaking, the Graph Data Science library is a plugin for the Neo4j graph database. As with all Neo4j plugins, the GDS library needs to be installed into the database before it can be used.

An easy way to achieve this for a Docker-based Neo4j installation is to download the Graph Data Science library from the Neo4j Download Center into a directory on the host and bind-mount that directory to the /plugins directory inside the Neo4j container. Note that the whitelist setting used here was renamed to dbms.security.procedures.allowlist in Neo4j 4.2, so newer images expect that name instead. Additionally, two dbms.security.procedures configuration entries need to be set in order to allow the GDS library to access low-level components of Neo4j:

docker run \
  -p 7474:7474 \
  -p 7687:7687 \
  -v <host data directory>:/data \
  -v <host plugins directory>:/plugins \
  -e NEO4J_dbms_security_procedures_unrestricted=gds.* \
  -e NEO4J_dbms_security_procedures_whitelist=gds.* \
  neo4j
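
To check that the plugin was picked up, a sketch along the following lines may help. It makes several assumptions that are not part of the command above: the container was started with a name (the hypothetical neo4j-gds, via --name), the library sits in the plugins directory as a .jar file whose name starts with neo4j-graph-data-science (an archive from the Download Center would need extracting first), and cypher-shell is available inside the container. Both helper functions are hypothetical names introduced for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical defaults, used only for illustration.
GDS_CONTAINER="${GDS_CONTAINER:-neo4j-gds}"
GDS_USER="${GDS_USER:-neo4j}"

# Succeeds if the given host plugins directory contains a GDS jar.
# Assumes the file name starts with "neo4j-graph-data-science".
# Usage: has_gds_jar <host plugins directory>
has_gds_jar() {
  ls "$1"/neo4j-graph-data-science*.jar >/dev/null 2>&1
}

# Asks the running database for the installed GDS version.
# Usage: check_gds <password>
check_gds() {
  docker exec "$GDS_CONTAINER" \
    cypher-shell -u "$GDS_USER" -p "$1" \
    "RETURN gds.version() AS version;"
}
```

For example, has_gds_jar "$HOME/neo4j/plugins" can be run before starting the container, and check_gds <password> once the database is up. If the query fails with an unknown-function error, the jar was most likely not loaded.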

Made by Anton Vasetenkov.