[Neo] neo4j for the sysadmin?

Zach White zach at box.net
Tue Apr 27 08:38:56 CEST 2010

Hi Everyone,

I would find it very helpful if there were some documentation targeted at sysadmins: something that gives us a brief overview of what neo4j is (keeping in mind that most of us have not done any Java programming, even if we have experience working with and deploying Java apps) and gives us hints on finding the information that we're looking for. I would say there are two basic (but very different) needs that the average admin is trying to fill:

1. Download and install the software for a user that asks for it

I've spent the last 2 hours poking around the site and the wiki, but until I found "Getting Started REST" I had no inkling that there was a standalone server that could be set up. A user who has asked their sysadmin to "install neo4j" is most likely to want the RESTful server to query against, since they would otherwise just download the class files themselves.

2. Learn about neo4j prior to a deployment so they can support an engineering dept that has decided to use neo4j, or so they can sign off on a decision to use neo4j.

It is this need that I am trying to fill. 

I spent some time looking over the site, but was only successful in finding bits and pieces of what I wanted. I finally asked on IRC, and was pointed to the Performance Guide and Configuration Settings page (thanks, thobe), which helped too, but I'm still feeling a bit lost.

Here's a list of the types of questions I'm trying to answer (presumably about the REST API):

* Is there a preferred way to package this, or am I rolling my own RPM/debs?
* Can I make the app fit into the FHS, or is running it in a self-contained directory my only reasonable option? (Reasonable means I don't have to patch the source code.)
* How do I scale this up? How much RAM will it take before CPU or Disk I/O are my main bottlenecks? Do I ever have to worry about CPU, or will I run out of RAM and Disk I/O long before I could think about using all the processing power in a modern multi-core hyperthreaded processor?
* Can I change configuration options without doing a full restart?
* What are the replication options? Do any of them handle having the databases in separate geographical areas (say, 80-250 ms from each other)? What about periods where connectivity is broken: how does neo4j handle that situation?
* How do I upgrade neo4j without downtime for end users? (This implies working master/master replication or the ability to promote a slave gracefully.)
* What happens if the app dies in a non-clean fashion? (kill -9, OOM, power lost, SAN catches fire, whatever) 

That covers the "sysadmin" side of the equation, but treats neo4j like a black box. Most sysadmins will want to interact with the database a little, since they will likely be asked to look things up or make minor modifications. The REST API documentation is clear enough to me, but I am not the average sysadmin. Most of the admins I work with need something a little more howto and a little less reference.

A short howto-style narrative explaining how to use curl or poster to query the root node, follow links, and view properties would go a long way toward giving sysadmins some basic visibility into the DB. Following that up with some simple examples showing how to pull some basic stats (number of nodes, number of relationships, size of db, or whatever is exposed) would cover 80% of what most sysadmins need to get out of the database.
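
To make that request concrete, here is a rough sketch of the kind of example I have in mind, written in Python rather than curl so it can follow links programmatically. Everything specific in it is an assumption to check against the REST documentation for your server version: the base URL and port, the /node/0 path for the root node, and the JSON field names "data", "outgoing_relationships", "type", and "end". The helper names fetch_json and describe_node are made up for illustration.

```python
import json
from urllib.request import Request, urlopen

# Assumption: where the standalone REST server listens. Adjust host, port, and path.
BASE_URL = "http://localhost:9999"


def fetch_json(url):
    """GET a URL and decode the JSON response body."""
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def describe_node(node_url, fetch=fetch_json):
    """Return (properties, links) for one node.

    properties is the node's property map; links is a list of
    (relationship type, end node URL) pairs for outgoing relationships.
    The field names used here are assumptions about the JSON layout.
    """
    node = fetch(node_url)
    properties = node.get("data", {})

    links = []
    relationships_url = node.get("outgoing_relationships")
    if relationships_url:
        # Follow the link the server handed us instead of building URLs by hand.
        for relationship in fetch(relationships_url):
            links.append((relationship.get("type"), relationship.get("end")))
    return properties, links
```

Calling describe_node(BASE_URL + "/node/0") against a running server would return the root node's properties plus the URLs of its neighbours, and each of those URLs can be passed back to describe_node to keep walking the graph. The fetch parameter exists only so the function can be tried out without a server.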

Finally, in an attempt to bring this too-long email to a close, allow me to doff my Professional SysAdmin hat and put on my hobbyist programmer fedora.

I've actually briefly explored using neo4j in a personal project I was doing, but did not spend long considering it. At the time there was no AJAX component and I brushed it off with, "just another java database, man I hate java." (Hey, I said hobbyist, not professional. ;)

Now that I know about the existence of the AJAX component and the python module, and have spent a little time looking, the project interests me a lot more. I could see myself using neo4j as the backend for some of my toy projects. The same information that you would be providing for sysadmins, based on what I wrote above, would also be helpful in educating the hypothetical hobbyist who doesn't know about neo4j yet.

Perhaps what's needed to assist both groups of users is a FAQ that contains some of this information and is easier to find. (To find the FAQ I first looked on the front page of neo4j.org, went to "Documentation," tried the wiki because nothing else looked right, and finally found the FAQ buried in the middle of the page.)

Thanks for reading and considering my suggestion.

