[Neo4j] Lucene Index on Relationships

Craig Taverner craig at amanzi.com
Mon Jun 21 17:49:18 CEST 2010


You got me again, Rick.

I have not (yet) used my idea of ranged relationship types, and still use
"buckets", or intermediate nodes (all over the place!). However, I am
thinking of using a combination of the two approaches for my "composite"
index. I have deviated from the classic binary tree because the total number
of nodes created is unnecessarily high for an index (don't want the index to
exceed the original data in size). Making fewer buckets, and compensating
using relationship types, leads to a better balance (IMHO).
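
To make the trade-off concrete, here is a minimal sketch of one way the two approaches might be combined. It is a plain in-memory Python model, not Neo4j API code and not my actual implementation. The names (`BUCKETS`, `TYPES_PER_BUCKET`, `locate`, `CompositeIndex`, `RANGE_n`) and the 10 x 10 split are assumptions chosen purely for illustration.

```python
from collections import defaultdict

BUCKETS = 10           # hypothetical number of intermediate "bucket" nodes
TYPES_PER_BUCKET = 10  # hypothetical number of ranged relationship types per bucket


def locate(value):
    """Return (bucket index, relationship type name) for a float in [0, 1]."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("value must be in [0, 1]")
    cells = BUCKETS * TYPES_PER_BUCKET
    # Clamp so that 1.0 lands in the last cell instead of one past the end.
    cell = min(int(value * cells), cells - 1)
    bucket, sub = divmod(cell, TYPES_PER_BUCKET)
    return bucket, f"RANGE_{sub}"


class CompositeIndex:
    """In-memory stand-in for the graph: bucket -> relationship type -> items."""

    def __init__(self):
        self.buckets = defaultdict(lambda: defaultdict(list))

    def add(self, item, value):
        # Attach the item to its bucket via the relationship type for its sub-range.
        bucket, rel_type = locate(value)
        self.buckets[bucket][rel_type].append((value, item))

    def lookup(self, value):
        """Return the items stored in the same cell as `value`."""
        bucket, rel_type = locate(value)
        if bucket not in self.buckets:
            return []
        return [item for _, item in self.buckets[bucket].get(rel_type, [])]
```

With these assumed sizes, 10 bucket nodes plus 10 relationship types give the same 100-way resolution that would otherwise need 100 buckets (or 100 relationship types on a single node), which is the kind of balance I am after.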

On Mon, Jun 21, 2010 at 4:35 PM, Rick Bullotta <
rick.bullotta at burningskysoftware.com> wrote:

> I think the combination of relationship type + relevant property value(s) is
> a more appropriate context for an index, as opposed to for "all
> relationships in the graph".
>
> FWIW, we achieve this today with Neo directly using the concept of "bucket"
> nodes.  Instead of having to create different relationship types for each
> range of values, as Craig has suggested, we achieve a similar result by a
> set of intermediate nodes that all have a relationship to a "bucket
> collection" node, and individual nodes are attached via a common
> relationship type to the appropriate "bucket" based on one or more values in
> the node.
>
> This gives us a fairly fast way to reduce the # of candidate nodes,
> without the need for an external index.
>
> Just a thought.
>
>
> -----Original Message-----
> From: user-bounces at lists.neo4j.org [mailto:user-bounces at lists.neo4j.org] On
> Behalf Of Mattias Persson
> Sent: Monday, June 21, 2010 8:36 AM
> To: Neo4j user discussions
> Subject: Re: [Neo4j] Lucene Index on Relationships
>
> Hi,
>
> how do you guys expect indexing for relationships to work? Would it be
> an index just as for nodes... or per node? I often hear that it'd
> speed up traversals if a node has many, many neighbours. But if the
> relationship index would be for the entire graph (not per node) that
> wouldn't really help, would it?
>
> 2010/6/21 Craig Taverner <craig at amanzi.com>:
> > A side comment, since I think indexing relationships with lucene might be
> > good, but think there might be alternatives for your current example.
> >
> > You said that the relationship property is a float from 0 to 1, so you
> > cannot use relationship types, but actually, when you consider that any
> > index is usually created by breaking data ranges (continuous or discrete)
> > into fewer, more discrete ranges, you can use a relationship type to
> > represent a range of floats. For example, if you have a roughly even
> > distribution of floats between 0 and 1, try dividing that into 100 parts
> > (0%-100%, or 0.01 to 1.00), and make a relationship type for each. This
> > would certainly facilitate traversing relationships of specific float
> > values (at least improve the performance dramatically, as in an index).
> >
> > Of course, this example focuses on traversing from a particular document.
> > If you are searching for all relationships in the entire database with
> > particular float values, then a separate index would be better.
> >
> > On Mon, Jun 21, 2010 at 2:11 PM, Marius Kubatz <marius.kubatz at udo.edu> wrote:
> >
> >> Hello guys, hello community!
> >>
> >> I'm currently evaluating neo4j for my thesis and have a wish :)
> >> I have already opened a ticket for this
> >> ( https://trac.neo4j.org/ticket/241 ), but
> >> I would like to hear what you guys think about it.
> >>
> >> Basically it just involves the ability to index Neo4j Relationships with
> >> Lucene Index.
> >>
> >> Neo4j works great on sparse graphs, but what happens when you have a very
> >> tight graph with several thousands of neighbors to one node?
> >> Additionally, as soon as you store information on Relationships you will
> >> get into trouble, because you will have to iterate through all those edges
> >> to find the properties you seek.
> >>
> >> If this sounds far-fetched, please take a look at this example where one
> >> might need properties on Relationships:
> >> One "Document" node is related to another Document node by a similarity
> >> function whose value is stored in the Relationships between those document
> >> nodes. Let's just say that we save a float in [0, 1] on those relationships,
> >> which makes it impossible to create RelationshipTypes for every value.
> >>
> >> Using an index to fetch Relationships by their indexed properties would
> >> greatly speed up the process and increase the attractiveness of using
> >> properties on Relationships. I would love to have quick access to
> >> Relationship properties where I could add and implement fuzzy logic,
> >> probabilities, Bayesian networks, similarities, ranking ... and so on ...
> >> As said, thank you for Relationship properties, they are great and
> >> already there, but what I miss is quick access to them.
> >>
> >> Thank you very much and best regards!
> >>
> >> Marius
> >>
> >> --
> >> "Programs must be written for people to read, and only incidentally for
> >> machines to execute."
> >>
> >> - Abelson & Sussman, SICP, preface to the first edition
> >> _______________________________________________
> >> Neo4j mailing list
> >> User at lists.neo4j.org
> >> https://lists.neo4j.org/mailman/listinfo/user
> >>
> >
>
>
>
> --
> Mattias Persson, [mattias at neotechnology.com]
> Hacker, Neo Technology
> www.neotechnology.com
