[Neo] indexing relationships

Tobias Ivarsson tobias.ivarsson at neotechnology.com
Tue Jul 7 10:48:38 CEST 2009


On Tue, Jul 7, 2009 at 10:12 AM, Symeon (Akis) Papadopoulos
<papadop at iti.gr>wrote:

> Perhaps the lookup method should have an additional argument specifying
> whether the relationship is directed or not.


Good point, but I think it's better to add a specific method for that kind
of lookup:

interface RelationshipIndex {
    void index(Relationship relationship);
    Relationship directionalLookup(RelationshipType type, Node start, Node
end);
    Relationship undirectionalLookup(RelationshipType type, Node n1, Node
n2);
    void remove(Relationship relationship);
}

This is probably a nice convenience method even if the implementation would
just be:

Relationship undirectionalLookup(RelationshipType type, Node n1, Node n2) {
    Relationship result = this.directionalLookup(type, n1, n2);
    if (result == null) {
        result = this.directionalLookup(type, n2, n1);
    }
    return result;
}

Depending on the actual index implementation, this implementation of
undirectionalLookup could actually be the best performing implementation.
But that's enough getting ahead of ourselves, having it in the interface is
a good idea. Now all that is left is for this to get high enough in the
priority queue and for someone to have enough time for this to get
implemented...


>
> My major problem (stemming from some preliminary tests I've done) is
> performance and scalability. I've implemented a simple indexing scheme,
> where an edge between two nodes with string indices n1_idx and n2_idx is
> indexed simply by the concatenation between them (the underlying index
> structure was a B-tree). When I attempted to index all edges of a
> moderately sized graph (~120k nodes, ~2M edges) the indexing process
> took quite a lot (several hours) which I consider as unacceptable. Any
> suggestions for improving on the efficiency and scalability of such an
> index are more than welcome.
>

Sorry, I don't have a silver bullet solution for this. Anyone else? (see for
example Mattias email that arrived while I was typing this)

Johan: Is there perhaps a batch index creator?
Akis: This (if available) would help you if most of your data is created up
front.

I remember being particularly irritated by this problem during my CS
education:
"given n in N and m in N, and finding out if (n,m) in E is an O(1) operation
you said? - No way that is true for a large graph."

Cheers,
-- 
Tobias Ivarsson <tobias.ivarsson at neotechnology.com>
Hacker, Neo Technology
www.neotechnology.com
Cellphone: +46 706 534857


More information about the User mailing list