[Neo4j] Lucene Index on Relationships

Rick Bullotta rick.bullotta at burningskysoftware.com
Mon Jun 21 16:35:05 CEST 2010


I think the combination of relationship type + relevant property value(s) is
a more appropriate context for an index than "all relationships in the
graph".

FWIW, we achieve this today in Neo directly, using the concept of "bucket"
nodes.  Instead of creating a different relationship type for each range of
values, as Craig has suggested, we get a similar result with a set of
intermediate "bucket" nodes, each of which has a relationship to a "bucket
collection" node.  Individual nodes are attached via a common relationship
type to the appropriate "bucket", based on one or more values in the node.

This gives us a fast way to narrow down the number of candidate nodes,
without the need for an external index.
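
To make the bucket idea concrete, here is a minimal sketch in plain Python
rather than against the Neo4j API. Everything in it is an assumption made for
illustration: the toy Graph class, the relationship names HAS_BUCKET and
CONTAINS, the bucket naming scheme, and the choice of 100 equal-width buckets
over [0, 1] (borrowed from Craig's suggestion). It is not our actual
implementation.

```python
from collections import defaultdict

NUM_BUCKETS = 100  # assumption: 100 equal-width buckets over [0, 1]


def bucket_of(value, num_buckets=NUM_BUCKETS):
    """Map a float in [0, 1] to a bucket number in 0..num_buckets-1."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("value must be in [0, 1]")
    # 1.0 would land in bucket num_buckets, so clamp it into the last one.
    return min(int(value * num_buckets), num_buckets - 1)


class Graph:
    """Toy directed graph: node ids are strings, edges are (type, target)."""

    def __init__(self):
        self.out_edges = defaultdict(list)

    def relate(self, source, rel_type, target):
        self.out_edges[source].append((rel_type, target))

    def neighbours(self, source, rel_type):
        return [t for (r, t) in self.out_edges[source] if r == rel_type]


def attach(graph, collection, node, value):
    """Attach node to the bucket node for value, creating it if needed."""
    bucket = f"{collection}/bucket-{bucket_of(value)}"
    if bucket not in graph.neighbours(collection, "HAS_BUCKET"):
        graph.relate(collection, "HAS_BUCKET", bucket)
    graph.relate(bucket, "CONTAINS", node)


def candidates(graph, collection, low, high):
    """Return the nodes in every bucket that overlaps [low, high].

    Only the relevant bucket nodes are visited. The first and last bucket
    may hold values just outside the range, so callers still need to check
    exact values there.
    """
    found = []
    for n in range(bucket_of(low), bucket_of(high) + 1):
        found.extend(graph.neighbours(f"{collection}/bucket-{n}", "CONTAINS"))
    return found
```

A lookup such as candidates(graph, "docs", 0.5, 0.75) then walks at most 26
bucket nodes instead of every node hanging off the collection.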

Just a thought.


-----Original Message-----
From: user-bounces at lists.neo4j.org [mailto:user-bounces at lists.neo4j.org] On
Behalf Of Mattias Persson
Sent: Monday, June 21, 2010 8:36 AM
To: Neo4j user discussions
Subject: Re: [Neo4j] Lucene Index on Relationships

Hi,

How do you guys expect indexing for relationships to work? Would it be
an index just as for nodes... or per node? I often hear that it'd
speed up traversals if a node has many, many neighbours. But if the
relationship index were for the entire graph (not per node), that
wouldn't really help, would it?
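
The difference between the two options can be sketched with plain Python
dictionaries. This is only an illustration of the question, not how Neo4j or
Lucene index anything; the relationship tuples and both index layouts are
hypothetical.

```python
from collections import defaultdict

# Hypothetical relationships: (start_node, end_node, similarity).
RELATIONSHIPS = [
    ("doc1", "doc2", 0.75),
    ("doc1", "doc3", 0.25),
    ("doc4", "doc5", 0.75),
    ("doc4", "doc6", 0.75),
]


def build_global_index(relationships):
    """One index for the whole graph, keyed by property value only."""
    index = defaultdict(list)
    for rel in relationships:
        index[rel[2]].append(rel)
    return index


def build_per_node_index(relationships):
    """One entry per (start node, property value) pair."""
    index = defaultdict(list)
    for rel in relationships:
        index[(rel[0], rel[2])].append(rel)
    return index


global_index = build_global_index(RELATIONSHIPS)
per_node_index = build_per_node_index(RELATIONSHIPS)

# Graph-wide lookup: hits come from every node and must still be filtered
# down to the node the traversal is standing on.
global_hits = global_index[0.75]
from_doc1 = [rel for rel in global_hits if rel[0] == "doc1"]

# Per-node lookup: the start node is part of the key, so no filtering.
direct = per_node_index[("doc1", 0.75)]
```

Both lookups end up with the same relationship for doc1, but the graph-wide
one first collects the matches from doc4 as well.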

2010/6/21 Craig Taverner <craig at amanzi.com>:
> A side comment, since I think indexing relationships with lucene might be
> good, but think there might be alternatives for your current example.
>
> You said that the relationship property is a float from 0 to 1, so you
> cannot use relationship types, but actually, when you consider that any
> index is usually created by breaking data ranges (continuous or discrete)
> into fewer, more discrete ranges, you can use a relationship type to
> represent a range of floats. For example, if you have roughly even
> distribution of floats between 0 and 1, try dividing that into 100 parts
> (0%-100%, or 0.01 to 1.00), and make a relationship type for each. This
> would certainly facilitate traversing relationships of specific float values
> (at least improve the performance dramatically, as in an index).
>
> Of course, this example focuses on traversing from a particular document. If
> you are searching for all relationships in the entire database with
> particular float values, then a separate index would be better.
>
> On Mon, Jun 21, 2010 at 2:11 PM, Marius Kubatz <marius.kubatz at udo.edu> wrote:
>
>> Hello guys, hello community!
>>
>> I'm currently evaluating neo4j for my thesis and have a wish :)
>> I have already opened a ticket for this
>> ( https://trac.neo4j.org/ticket/241 ), but
>> I would like to hear what you guys think about it.
>>
>> Basically it just involves the ability to index Neo4j Relationships with
>> Lucene Index.
>>
>> Neo4j works great on sparse graphs, but what happens when you have a very
>> tight graph with several thousands of neighbors to one node?
>> Additionally, as soon as you store information on Relationships you will
>> get into trouble, because you will have to iterate through all those edges
>> to find the properties you seek.
>>
>> If this sounds far-fetched, please take a look at this example where one
>> might need properties on Relationships:
>> One "Document" node is related to another Document node by a similarity
>> function which is stored in the Relationships between those document nodes.
>> Let's just say that we save a float between 0 and 1 on those relationships,
>> which makes it impossible to create RelationshipTypes for every value.
>>
>> Using an index to fetch Relationships by their indexed properties would
>> greatly speed up the process and increase the attractiveness of using
>> properties on Relationships. I would love to have quick access to
>> Relationship properties where I could add and implement fuzzy logic,
>> probabilities, Bayesian networks, similarities, ranking ... and so on ...
>> As said, thank you for Relationship properties, they are great and
>> already there, but what I miss is quick access to them.
>>
>> Thank you very much and best regards!
>>
>> Marius
>>
>> --
>> "Programs must be written for people to read, and only incidentally for
>> machines to execute."
>>
>> - Abelson & Sussman, SICP, preface to the first edition
>> _______________________________________________
>> Neo4j mailing list
>> User at lists.neo4j.org
>> https://lists.neo4j.org/mailman/listinfo/user
>>



-- 
Mattias Persson, [mattias at neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com


