[Neo] How to efficiently query in Neo4J?

Alastair James al.james at gmail.com
Wed Apr 7 18:19:36 CEST 2010


Hmmmm...

I am guessing the most efficient way might be to have a two stage return
evaluator.

E.g. the custom return evaluator class holds a hash table of <node id> =>
<count> pairs. Each time 'isReturnableNode' is called, it increments the
count for that node id in the hash. If the count reaches <total number of
tags to check for>, it returns the node.

Thus, if you run a traversal starting from each tag using the same instance
of the custom return evaluator, none of the traversals will emit any nodes
(good for memory!) until the last one, where the ones matching all the tags
will be emitted.
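
To make the idea concrete, here is a rough sketch in plain Java. It is not
the Neo4j API: the class 'TagCountEvaluator' and the method names are made up
for illustration, and each traversal is simulated by a plain list of post
node ids. In a real evaluator the counting logic would sit inside
'isReturnableNode'. It assumes each traversal visits a given node at most
once.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the shared return evaluator; not the Neo4j API.
class TagCountEvaluator {
    private final int requiredTags;
    // <node id> => <number of traversals that have reached this node so far>
    private final Map<Long, Integer> seen = new HashMap<Long, Integer>();

    TagCountEvaluator(int requiredTags) {
        this.requiredTags = requiredTags;
    }

    // Increments the count for the node and reports whether the node has now
    // been reached by as many traversals as there are tags to check for.
    boolean isReturnable(long nodeId) {
        Integer previous = seen.get(nodeId);
        int count = (previous == null) ? 1 : previous + 1;
        seen.put(nodeId, count);
        return count >= requiredTags;
    }

    // Simulates one traversal per tag, all sharing one evaluator instance.
    // Each inner list holds the post ids one traversal would visit.
    static List<Long> matchAll(List<List<Long>> postsPerTag) {
        TagCountEvaluator evaluator = new TagCountEvaluator(postsPerTag.size());
        List<Long> result = new ArrayList<Long>();
        for (List<Long> posts : postsPerTag) {
            for (long id : posts) {
                if (evaluator.isReturnable(id)) {
                    result.add(id);
                }
            }
        }
        return result;
    }
}
```

Only the last traversal can push a count up to the number of tags, so nothing
is emitted before it. The hash still grows with the number of posts reached
by the earlier traversals, so the memory saving is in the result lists, not
in the bookkeeping.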

Does that sound like it would work?

Al


On 7 April 2010 16:20, Max De Marzi Jr. <maxdemarzi at gmail.com> wrote:

> I've had similar issues and the way I've done it (which may not be the
> right way) is to run the first traversal and store the returned nodes.
> Then run the second traversal and return a node only if it is contained
> in the set of nodes returned by the first traversal.
>
> The traversals hit each node only once, and since we want to return only if
> they are found twice, I don't think there is a clean way to do it in a
> single traversal.
>
> On Wed, Apr 7, 2010 at 9:53 AM, Alastair James <al.james at gmail.com> wrote:
>
> > Hi there...
> >
> > I am looking at moving a website to a model based on Neo4J, however, I am
> > having trouble seeing how to optimise the 'main query' type for Neo4J.
> >
> > Briefly, the site consists of posts, each tagged with various attributes,
> > e.g. (it's a travel site) location, theme, cost etc... Also the tags
> > are hierarchical. So, for location we have (say) 'tuscany' inside 'italy'
> > inside 'europe'. For theme we have (say) 'cycling' inside 'activity'.
> >
> > I can beautifully model the parent-child relationships of these 'tags'
> > using a graph DB as objects with a 'CHILD_OF' relationship type. Then the
> > posts go in and are connected to their 'tags' with a 'TAGGED_WITH'
> > relationship type. Fine.
> >
> > However, here is the complex bit (well, to me): I need to be able to find
> > all posts tagged with a set of tags (an AND operation, so each post must
> > be tagged with each tag).
> >
> > What's more, these queries need to respect the parent relationships in the
> > tags. So if I search for 'activity' in 'france' it needs to traverse the
> > CHILD_OF relationships to find things tagged with any child of 'activity'
> > AND any child of 'france'.
> >
> > It seems pretty easy to write a traversal class that would find all posts
> > in any child of either 'tag' node: simply follow 'CHILD_OF' and
> > 'TAGGED_WITH' backwards and return nodes of type post.
> >
> > However, how to do the AND bit? The only way I can see is to return both
> > lists and intersect them in user code... however, that seems inelegant and
> > may not scale brilliantly.
> >
> > Any ideas how to optimise?
> >
> > Al
> > _______________________________________________
> > Neo mailing list
> > User at lists.neo4j.org
> > https://lists.neo4j.org/mailman/listinfo/user
> >



-- 
Dr Alastair James
CTO James Publishing Ltd.
http://www.linkedin.com/pub/3/914/163

www.worldreviewer.com

WINNER Travolution Awards Best Travel Information Website 2009
WINNER IRHAS Awards, Los Angeles, Best Travel Website 2008
WINNER Travolution Awards Best New Online Travel Company 2008
WINNER Travel Weekly Magellan Award 2008
WINNER Yahoo! Finds of the Year 2007

"Noli nothis permittere te terere!"

