[Neo4j] Connecting 2 separate graphs
Marko Rodriguez
okrammarko at gmail.com
Thu Mar 10 18:09:37 CET 2011
Hi Hemant,
> I thought Rexter project under tinkerpop would be a good home for things like this since it is using blueprints and any graph engine that supports blueprints should be able to connect with another blueprint enabled graph (both neo4j in my case). Another possibility is if both graphs were indexed for the common node types then they can be referenced independent of the graph they come from. Any other ideas/suggestions/comments?
Rexster allows you to wrap any Blueprints-enabled graph (e.g., a graph database or an RDF store) and expose it over HTTP as a RESTful service, using either the standalone Grizzly web server or Tomcat. Rexster supports any number of graphs exposed through the same service. E.g.
http://localhost/graph1/
http://localhost/graph2/
For your problem, you want to expose two different graphs, but I assume you want them on different machines. If this is the case, I would worry less about which web wrapper to use and more about your data architecture.
Your model seems to fall into place with the Web of Data (Linked Data) paradigm. http://linkeddata.org/ ... With this model, you connect isolated graphs by using a URI scheme for the unique identifiers of your elements. Thus, the ids of your vertices denote their physical location. As such, you can merge two graphs (and traverse over two graphs) where the only overlap of data is the point of merger.
Graph1 machine: http://localhost/graph1/vertex/1 --- knows ---> http://remotehost/graph2/vertex/2
Graph2 machine: http://remotehost/graph2/vertex/2 --- worksFor ---> http://remotehost/graph2/vertex/3
So forth and so on. There is much on the Web of Data so you could steal some design choices from them. Here is a short paper that is relatively easy to consume and explains the basics of the Web of Data architecture:
http://arxiv.org/abs/0908.0373
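The two-machine example above can be sketched in a few lines of Python. This is only a toy under stated assumptions: the two dictionaries stand in for the graphs on localhost and remotehost, and the names STORES, out_edges and walk are made up for illustration (they are not part of Rexster or Blueprints). In a real deployment the lookup inside out_edges would presumably be an HTTP request to whichever machine the URI names.

```python
from urllib.parse import urlparse

# Hypothetical in-memory stand-ins for the two machines. Each maps a
# vertex URI to its outgoing (edge label, target vertex URI) pairs.
GRAPH1 = {
    "http://localhost/graph1/vertex/1": [
        ("knows", "http://remotehost/graph2/vertex/2"),
    ],
}
GRAPH2 = {
    "http://remotehost/graph2/vertex/2": [
        ("worksFor", "http://remotehost/graph2/vertex/3"),
    ],
    "http://remotehost/graph2/vertex/3": [],
}

# Assumed routing table: (host, graph name) -> the store owning those vertices.
STORES = {
    ("localhost", "graph1"): GRAPH1,
    ("remotehost", "graph2"): GRAPH2,
}


def out_edges(vertex_uri):
    """Return the outgoing edges of a vertex, found via its own URI."""
    parsed = urlparse(vertex_uri)
    # The first path segment names the graph, e.g. "/graph1/vertex/1" -> "graph1".
    graph_name = parsed.path.strip("/").split("/")[0]
    store = STORES[(parsed.hostname, graph_name)]
    return store.get(vertex_uri, [])


def walk(start_uri, labels):
    """Follow one edge label per step, switching graphs whenever a URI says so."""
    frontier = [start_uri]
    for label in labels:
        frontier = [
            target
            for uri in frontier
            for edge_label, target in out_edges(uri)
            if edge_label == label
        ]
    return frontier


if __name__ == "__main__":
    print(walk("http://localhost/graph1/vertex/1", ["knows", "worksFor"]))
```

The point of the design is that the traversal code never needs to know which machine holds a vertex; the id itself carries that information, and the only shared data is the vertex at the point of merger.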
If you find your mind attracted to this idea, then you might want to move more into RDF as you can use Blueprints Sail [ https://github.com/tinkerpop/blueprints/wiki/Sail-Implementation ] and Gremlin to do stuff like this:
https://github.com/tinkerpop/gremlin/wiki/LinkedData-Sail
Thus, search your Web of Data as if it's a single repository, a single graph.
Finally, the reason I say RDF is because you may find it cumbersome to deal with indices in graph databases using this distributed data model. In RDF, there is no such thing as an index, as everything is directly addressable by its URI (even literals). In the end, it's up to you... feel free to ask more questions.
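To make the "single graph" idea concrete, here is a rough Python sketch that treats each graph as a set of (subject, predicate, object) triples. It is not how Blueprints Sail or the LinkedData Sail is implemented; the predicate URIs under example.org and the helper names match and employers_of_acquaintances are assumptions for illustration only. Because every identifier is a global URI, merging the two graphs is plain set union, and a query over the merged set crosses the machine boundary without any extra lookup step.

```python
# Illustrative predicate URIs (assumed; not taken from any real vocabulary).
KNOWS = "http://example.org/knows"
WORKS_FOR = "http://example.org/worksFor"

# Each graph is a set of (subject, predicate, object) triples.
graph1_triples = {
    ("http://localhost/graph1/vertex/1", KNOWS,
     "http://remotehost/graph2/vertex/2"),
}
graph2_triples = {
    ("http://remotehost/graph2/vertex/2", WORKS_FOR,
     "http://remotehost/graph2/vertex/3"),
}

# URIs are globally unique, so merging two graphs is just set union.
merged = graph1_triples | graph2_triples


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def employers_of_acquaintances(triples, person):
    """Two-step query: person -knows-> friend -worksFor-> employer."""
    return [
        employer
        for _, _, friend in match(triples, s=person, p=KNOWS)
        for _, _, employer in match(triples, s=friend, p=WORKS_FOR)
    ]


if __name__ == "__main__":
    print(employers_of_acquaintances(merged, "http://localhost/graph1/vertex/1"))
```

Run against graph1_triples alone, the same query finds nothing, since the worksFor statement lives in the other graph; run against the merged set, it returns vertex 3.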
Good luck,
Marko.
http://markorodriguez.com
http://tinkerpop.com