[Neo] indexing relationships

Mattias Persson mattias at neotechnology.com
Tue Jul 7 10:39:24 CEST 2009


2009/7/7 Symeon (Akis) Papadopoulos <papadop at iti.gr>:
>
>> Was that what you meant?
>>
>> My interpretation of Symeon's request was an index from
>> (RelationshipType,Node,Node) to Relationship, which in my opinion would be
>> much more useful than a simple index from (String,primitive) to
>> Relationship, which is how the node indexes work.
>>
> That was exactly what I meant.
>
>
>> Such an index would be useful in places where you have many
>> relationships from a node and wish to quickly determine whether there is a
>> relationship from that node to a given node.
>>
> Actually, my original thought was to index all relationships of a graph,
> but selective indexing based on node degree sounds like a smart way to
> save disk space and index update time.
>
>> The benefit of having this be part of index-util (as opposed to internally
>> in neo-core) is obvious. Keeping an index like this updated is expensive.
>> You as an application developer know where in your graph you need it and
>> thus where you are prepared to pay the extra overhead for insertion (an
>> overhead that pays back on lookup). If we instead were to add something like
>> this to neo-core we would add that overhead to all relationship creations,
>> and that would be very undesirable.
>>
> That is obviously understandable.
>>
>> To further clarify the distinction between an index like this and the
>> currently existing indexes, here is a quick definition of the interface for
>> it:
>>
>> interface RelationshipIndex {
>>     void index(Relationship relationship);
>>     Relationship lookup(RelationshipType type, Node start, Node end);
>>     void remove(Relationship relationship);
>> }
>>
> Perhaps the lookup method should have an additional argument specifying
> whether the relationship is directed or not.
>
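
For illustration, here is a minimal in-memory sketch of the interface
quoted above, extended with the direction flag suggested here. It assumes
stand-in types (nodes as long ids, relationship types as strings, a tiny
Rel class) instead of the real neo-core classes, and a HashMap instead of
a persistent index. Every name in it is hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: stand-in types, not the neo-core API.
final class RelationshipIndexSketch {

    // Stand-in for a relationship: a type name plus start and end node ids.
    static final class Rel {
        final String type;
        final long start;
        final long end;

        Rel(String type, long start, long end) {
            this.type = type;
            this.start = start;
            this.end = end;
        }
    }

    // Maps (type, start, end) to the relationship; in-memory only.
    private final Map<List<Object>, Rel> byKey = new HashMap<>();

    private static List<Object> key(String type, long start, long end) {
        return Arrays.<Object>asList(type, start, end);
    }

    void index(Rel rel) {
        byKey.put(key(rel.type, rel.start, rel.end), rel);
    }

    // With directed == false the reverse orientation is tried as well.
    Rel lookup(String type, long start, long end, boolean directed) {
        Rel found = byKey.get(key(type, start, end));
        if (found == null && !directed) {
            found = byKey.get(key(type, end, start));
        }
        return found;
    }

    void remove(Rel rel) {
        byKey.remove(key(rel.type, rel.start, rel.end));
    }
}
```

Like the quoted interface, this keeps at most one relationship per
(type, start, end) triple; a later index() call for the same triple
replaces the earlier entry.
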
>
> My major problem (stemming from some preliminary tests I've done) is
> performance and scalability. I've implemented a simple indexing scheme,
> where an edge between two nodes with string indices n1_idx and n2_idx is
> indexed simply by the concatenation of the two (the underlying index
> structure was a B-tree). When I attempted to index all edges of a
> moderately sized graph (~120k nodes, ~2M edges) the indexing process
> took a very long time (several hours), which I consider unacceptable. Any
> suggestions for improving the efficiency and scalability of such an
> index are more than welcome.

Grouping more operations into each transaction will increase indexing
speed. Also, if we were to write such a RelationshipIndex on top of Lucene
it would be much faster, at least if many operations are grouped per
transaction.
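
As a rough sketch of the grouping idea: the helper below collects index
entries and hands them over in batches, so that one commit covers many
operations. The commitBatch callback is a hypothetical stand-in for "open
a transaction, write the entries, commit"; it is not a Neo or Lucene API,
and the class name and batch sizes are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: the commit callback stands in for a real transaction.
final class BatchedIndexer<T> {
    private final int batchSize;
    private final Consumer<List<T>> commitBatch;
    private final List<T> pending = new ArrayList<>();
    private int commits = 0;

    BatchedIndexer(int batchSize, Consumer<List<T>> commitBatch) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.batchSize = batchSize;
        this.commitBatch = commitBatch;
    }

    // Queue one entry; commit once a full batch has accumulated.
    void add(T entry) {
        pending.add(entry);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    // Commit whatever is pending; call once more after the last add.
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        commitBatch.accept(new ArrayList<>(pending));
        pending.clear();
        commits++;
    }

    // Number of batches committed so far.
    int commits() {
        return commits;
    }
}
```

With roughly 2M edges and a batch size of 10,000, this would mean about
200 commits in total. The right batch size depends on available memory
and has to be found by measurement.
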

>
> Best regards,
> Akis
>
>> Cheers,
>> Tobias
>>
>> On Mon, Jul 6, 2009 at 7:27 PM, Peter Neubauer <neubauer.peter at gmail.com> wrote:
>>
>>
>>> Hi Symeon,
>>> so, what you are saying is that you would like to have the possibility
>>> to set indexes on relationships and their properties just like on the
>>> nodes as in http://components.neo4j.org/index-util/ ?
>>>
>>> I guess that would be easy to do, or you could do it yourself by
>>> looking at the index-util package, but I guess being able to treat
>>> Relationships as first-class indexing citizens is a good idea ... any
>>> others that have opinions on that?
>>>
>>> /peter
>>>
>>> GTalk:      neubauer.peter
>>> Skype       peter.neubauer
>>> Phone       +46 704 106975
>>> LinkedIn   http://www.linkedin.com/in/neubauer
>>> Twitter      http://twitter.com/peterneubauer
>>>
>>> http://www.neo4j.org     - New Energy for Data - The Graph Database.
>>> http://www.ops4j.org     - New Energy for OSS Communities - Open
>>> Participation Software.
>>> http://www.oredev.org   - Where Good Geeks Grok.
>>>
>>>
>>>
>>> On Mon, Jul 6, 2009 at 10:38 AM, Symeon (Akis)
>>> Papadopoulos <papadop at iti.gr> wrote:
>>>
>>>> Hi,
>>>>
>>>> I have been experimenting with incremental building of large graphs and
>>>> I realized that the graph update process would be much faster if it
>>>> were possible to index relationships between pairs of nodes, allowing
>>>> quick lookups. Currently, each time I want
>>>> to create or update a relationship between two nodes (say n1 and n2), I
>>>> first need to iterate through all relationships (of the particular type)
>>>> of n1 and check whether the other node is n2. Is there any effort within
>>>> neo4j.util.index to provide such functionality or should I try to
>>>> develop something on my own?
>>>>
>>>> Best regards,
>>>> Symeon (Akis)
>>>> _______________________________________________
>>>> Neo mailing list
>>>> User at lists.neo4j.org
>>>> https://lists.neo4j.org/mailman/listinfo/user
>>>>
>>>>
>>
>>
>>
>>
>



-- 
Mattias Persson, [mattias at neotechnology.com]
Neo Technology, www.neotechnology.com

