[Neo] Sharding

Rick Whitesel (rwhitese) rwhitese at cisco.com
Tue Jan 26 21:20:01 CET 2010


The cornerstone of this data will be an identity, with which we will want to associate information. The problem is that sometimes we want to see what is related to an identity, and other times we want to see which identities are related to some identity attribute, kind of LinkedIn-ish.
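
To make the two access patterns concrete, here is a minimal sketch in Python. The class name `IdentityIndex`, its methods, and the sample data are all hypothetical and only illustrate the two lookup directions; this is not a Neo4j API.

```python
from collections import defaultdict


class IdentityIndex:
    """Hypothetical in-memory model of the two lookups described above."""

    def __init__(self):
        # identity -> attributes related to it
        self._by_identity = defaultdict(set)
        # attribute -> identities related to it (the reverse direction)
        self._by_attribute = defaultdict(set)

    def relate(self, identity, attribute):
        # Record the relationship in both directions so either lookup is cheap.
        self._by_identity[identity].add(attribute)
        self._by_attribute[attribute].add(identity)

    def attributes_of(self, identity):
        # "What is related to this identity?"
        return set(self._by_identity.get(identity, ()))

    def identities_with(self, attribute):
        # "Which identities are related to this attribute?"
        return set(self._by_attribute.get(attribute, ()))


if __name__ == "__main__":
    index = IdentityIndex()
    index.relate("rick", "employer:cisco")
    index.relate("emil", "employer:neo")
    index.relate("rick", "list:neo-user")
    index.relate("emil", "list:neo-user")
    print(index.attributes_of("rick"))
    print(index.identities_with("list:neo-user"))
```

In a graph these are the same relationship walked from either end, which is why neither direction can be favored when choosing a shard key.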


-----Original Message-----
From: user-bounces at lists.neo4j.org [mailto:user-bounces at lists.neo4j.org] On Behalf Of Emil Eifrem
Sent: Tuesday, January 26, 2010 2:54 PM
To: Neo user discussions
Subject: Re: [Neo] Sharding

On Tue, Jan 26, 2010 at 20:32, Rick Whitesel (rwhitese)
<rwhitese at cisco.com> wrote:
> Hi:
> Potentially stupid questions follow: In looking at how to add sharding
> to Neo4j, I was wondering if it made any sense to put Neo4j on top of
> Cassandra or maybe a distributed B+ tree system? I love the relationship
> modeling in Neo4j but I need the scalability of sharding, preferably not
> done at the client.

Hi Rick --

Worry not, that's not a stupid question at all. The problem with just
putting the Neo4j API on top of something like Cassandra is that it
doesn't really solve the problem. The challenge with auto-sharding a
graph isn't the engineering of writing a distributed system. It's the
science of efficiently partitioning a dynamic graph.

Cassandra shards everything by a defined key. That will lead to an
inefficient sharding scheme if you have a graph-like connected data
structure that you want to be able to traverse in an ad-hoc manner.
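
A minimal sketch of why key-based placement fits a graph poorly, assuming a plain hash-modulo scheme. The function names and the sample edges are hypothetical, and the CRC32 hash stands in for whatever partitioner a real key-value store uses; it is not Cassandra's actual partitioner.

```python
import zlib


def shard_of(key, num_shards):
    # Place a node purely by hashing its key, ignoring its relationships.
    # CRC32 is used only because it is deterministic across runs.
    return zlib.crc32(key.encode("utf-8")) % num_shards


def cross_shard_edges(edges, num_shards):
    # Count relationships whose two endpoints land on different shards.
    # Each one is a network hop whenever a traversal follows it.
    return sum(
        1
        for source, target in edges
        if shard_of(source, num_shards) != shard_of(target, num_shards)
    )


if __name__ == "__main__":
    # Hypothetical friend-of-a-friend style relationships.
    edges = [
        ("alice", "bob"),
        ("bob", "carol"),
        ("carol", "dave"),
        ("dave", "alice"),
        ("alice", "carol"),
    ]
    for num_shards in (1, 2, 4, 8):
        cut = cross_shard_edges(edges, num_shards)
        print(f"{num_shards} shards: {cut} of {len(edges)} edges cross shards")
```

Because the hash knows nothing about which nodes are connected, the fraction of edges that cross shards tends to grow with the number of shards, and an ad-hoc traversal pays for each crossing.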

Do you know any invariants about the domain, like "entity of type X
will NEVER be connected to entity of type Y"?
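
To illustrate why such an invariant helps, here is a rough sketch under the assumption that the invariant splits the data into disconnected groups. If it does, each group can live on one shard and no relationship crosses a shard boundary. The function names and sample data are hypothetical, and this is an illustration, not how Neo4j partitions data.

```python
def connected_components(nodes, edges):
    # Union-find: nodes joined by any chain of relationships share a root.
    parent = {node: node for node in nodes}

    def find(node):
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for source, target in edges:
        parent[find(source)] = find(target)

    groups = {}
    for node in nodes:
        groups.setdefault(find(node), []).append(node)
    return list(groups.values())


def assign_shards(components, num_shards):
    # Greedy placement: put each component, largest first, on the
    # currently least-loaded shard. A component is never split.
    placement = {}
    loads = [0] * num_shards
    for component in sorted(components, key=len, reverse=True):
        target = loads.index(min(loads))
        for node in component:
            placement[node] = target
        loads[target] += len(component)
    return placement


if __name__ == "__main__":
    # Hypothetical data: X entities only ever connect to other X entities,
    # and Y entities only to other Y entities.
    nodes = ["x1", "x2", "x3", "y1", "y2"]
    edges = [("x1", "x2"), ("x2", "x3"), ("y1", "y2")]
    placement = assign_shards(connected_components(nodes, edges), 2)
    print(placement)
```

This only works when the invariant really does disconnect the graph. A single relationship between the groups merges them into one component, which is the hard case for a dynamic graph.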


Emil Eifrém, CEO [emil at neotechnology.com]
Neo Technology, www.neotechnology.com
Cell: +46 733 462 271 | US: 206 403 8808