[Neo] Sharding

Emil Eifrem emil at neotechnology.com
Tue Jan 26 20:53:47 CET 2010


On Tue, Jan 26, 2010 at 20:32, Rick Whitesel (rwhitese)
<rwhitese at cisco.com> wrote:
> Hi:
>
> Potentially stupid questions follow: In looking at how to add sharding
> to Neo4j, I was wondering if it made any sense to put Neo4j on top of
> Cassandra or maybe a distributed B+ tree system? I love the relationship
> modeling in Neo4j but I need the scalability of sharding, preferably not
> done at the client.

Hi Rick --

Worry not, that's not a stupid question at all. The problem with just
putting the Neo4j API on top of something like Cassandra is that it
doesn't really solve the problem. The challenge with auto-sharding a
graph isn't the engineering of writing a distributed system. It's the
science of efficiently partitioning a dynamic graph.

Cassandra shards everything by a defined key. That will lead to an
inefficient sharding scheme if you have a graph-like connected data
structure that you want to be able to traverse in an ad-hoc manner,
because nodes that are neighbors in the graph will often end up on
different shards.
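
To make that concrete, here is a rough sketch (not how Cassandra or
Neo4j actually work internally) that places nodes on shards by hashing
their key and counts how many relationships end up spanning two shards.
The CRC32 hash is only a stand-in for a key-based partitioner, and the
chain-shaped graph and all function names are made up for illustration.

```python
import zlib
from typing import Callable, Iterable, Tuple

Edge = Tuple[str, str]


def hash_placement(num_shards: int) -> Callable[[str], int]:
    """Return a function mapping a node key to a shard by hashing the key.

    CRC32 is an assumption used for a stable, dependency-free hash; a real
    key-based store would use its own partitioner.
    """
    def place(node_key: str) -> int:
        return zlib.crc32(node_key.encode("utf-8")) % num_shards
    return place


def cross_shard_edges(edges: Iterable[Edge], place: Callable[[str], int]) -> int:
    """Count relationships whose two end nodes live on different shards."""
    return sum(1 for a, b in edges if place(a) != place(b))


# Hypothetical graph: a simple chain n0 - n1 - ... - n1000.
edges = [(f"n{i}", f"n{i + 1}") for i in range(1000)]

for shards in (1, 2, 8):
    hops = cross_shard_edges(edges, hash_placement(shards))
    print(f"{shards} shards: {hops} of {len(edges)} hops leave the shard")
```

If the hash spreads keys evenly and independently of the graph
structure, roughly 1 - 1/k of the relationships cross a shard boundary
with k shards, so a traversal pays a network hop on most steps.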

Do you know any invariants about the domain, like "entity of type X
will NEVER be connected to entity of type Y"?
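
The reason such an invariant helps: if some groups of entity types are
never connected to each other, each group can live on its own shard and
no relationship ever crosses a shard boundary. A minimal sketch of that
idea, with made-up type names and sample data, and assuming the
invariant really does hold for all data:

```python
from typing import Dict, Iterable, List, Set, Tuple


def type_groups(edges: Iterable[Tuple[str, str]],
                node_type: Dict[str, str]) -> List[Set[str]]:
    """Group entity types that are ever connected to each other.

    Types in different groups never share a relationship, so each group
    can be placed on a separate shard. Uses a small union-find over types.
    """
    parent: Dict[str, str] = {}

    def find(t: str) -> str:
        parent.setdefault(t, t)
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # path halving
            t = parent[t]
        return t

    for a, b in edges:
        root_a, root_b = find(node_type[a]), find(node_type[b])
        if root_a != root_b:
            parent[root_a] = root_b
    for t in node_type.values():
        find(t)  # register types that have no relationships at all

    groups: Dict[str, Set[str]] = {}
    for t in list(parent):
        groups.setdefault(find(t), set()).add(t)
    return list(groups.values())


# Hypothetical domain: users and products are connected, sensors never
# connect to either of them.
node_type = {"u1": "User", "u2": "User", "p1": "Product",
             "s1": "Sensor", "s2": "Sensor"}
edges = [("u1", "u2"), ("u1", "p1"), ("s1", "s2")]

groups = type_groups(edges, node_type)
shard_of_type = {t: i for i, group in enumerate(groups) for t in group}
print(shard_of_type)
```

This only splits the data as finely as the invariants allow; a single
large connected group still has to be partitioned some other way.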

Cheers,

-- 
Emil Eifrém, CEO [emil at neotechnology.com]
Neo Technology, www.neotechnology.com
Cell: +46 733 462 271 | US: 206 403 8808
http://blogs.neotechnology.com/emil
http://twitter.com/emileifrem

