[Neo] How to efficiently query in Neo4J?

Rick Bullotta rick.bullotta at burningskysoftware.com
Wed Apr 7 17:24:54 CEST 2010


Here's a rough description of how we're handling something similar:

- Create a "tag query collection" consisting of the node(s) for each tag in
your query (TQNodes)

- Given the tags to match, determine the "specificity" of each tag (e.g.
depth in the tag hierarchy)

- For the most specific tag, using relationships, traverse applicable
nodes/posts to first reduce the valid set of nodes to a smaller number.
We'll call this collection PostNodes

- For each node/post (PostNode) in the returned list (PostNodes), get its
tag relationships, grab the node IDs of those tags, and put them in a
"tag post collection" (TPNodes)

- For each tag node (TPNode) in TPNodes, check for a match in your "tag
query collection" (TQNodes).  If TPNode has no matches, get the parent of
TPNode and re-check if it is in the TQNodes.  Continue until you reach the
root node in the tag hierarchy or find a match.  If you reach the root node
in the tag hierarchy, you do not have a match, so try the next PostNode in
PostNodes.  If you find a match for TPNode, try the next TPNode in TPNodes.
If all the nodes in TPNodes have a match, include this PostNode in your
results
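
The steps above can be sketched roughly as follows. This is only a toy
in-memory model in Python, not Neo4j API code: the PARENT and TAGGED_WITH
dictionaries, the sample posts and all function names are made up for
illustration. One deliberate deviation, which is an assumption on my part:
instead of requiring every tag on a post to match, the sketch records which
query tags were reached while walking up the hierarchy and keeps the post
only if all of them were reached, so that the result lines up with the AND
semantics asked about in the original question.

```python
# Toy stand-in for the graph (hypothetical data, not Neo4j API).
# CHILD_OF: tag -> parent tag; None marks the top of a hierarchy.
PARENT = {
    "europe": None, "italy": "europe", "tuscany": "italy", "france": "europe",
    "activity": None, "cycling": "activity",
}
# TAGGED_WITH: post -> tags attached directly to it.
TAGGED_WITH = {
    "post1": {"tuscany", "cycling"},
    "post2": {"france", "cycling"},
    "post3": {"tuscany"},
}


def depth(tag):
    """Specificity of a tag: number of CHILD_OF hops up to the top."""
    hops = 0
    while PARENT[tag] is not None:
        tag = PARENT[tag]
        hops += 1
    return hops


def covered_query_tags(tag, tq_nodes):
    """Walk up CHILD_OF from `tag`; return the query tags met on the way."""
    hits = set()
    while tag is not None:
        if tag in tq_nodes:
            hits.add(tag)
        tag = PARENT[tag]
    return hits


def find_posts(query_tags):
    tq_nodes = set(query_tags)
    # Start from the most specific (deepest) query tag to shrink the candidates.
    most_specific = max(tq_nodes, key=depth)
    # A real graph would follow relationships down from the tag; the toy
    # model simply scans every post.
    post_nodes = [
        post for post, tags in TAGGED_WITH.items()
        if any(covered_query_tags(t, {most_specific}) for t in tags)
    ]
    results = []
    for post in post_nodes:
        hits = set()
        for tp_node in TAGGED_WITH[post]:
            hits |= covered_query_tags(tp_node, tq_nodes)
        if hits == tq_nodes:  # every query tag was reached (assumed AND check)
            results.append(post)
    return sorted(results)
```

For example, find_posts({"italy", "activity"}) starts from 'italy' (the
deeper tag), narrows the candidates to the two posts tagged 'tuscany', and
keeps only the one that is also tagged under 'activity'.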

I'd love to know if there's a more effective way to do this as well!

Rick

-----Original Message-----
From: user-bounces at lists.neo4j.org [mailto:user-bounces at lists.neo4j.org] On
Behalf Of Alastair James
Sent: Wednesday, April 07, 2010 10:54 AM
To: Neo user discussions
Subject: [Neo] How to efficiently query in Neo4J?

Hi there...

I am looking at moving a website to a model based on Neo4J, however, I am
having trouble seeing how to optimise the 'main query' type for Neo4J.

Briefly, the site consists of posts, each tagged with various attributes,
e.g. (it's a travel site) location, theme, cost etc. Also, the tags
are hierarchical. So, for location we have (say) 'tuscany' inside 'italy'
inside 'europe'. For theme we have (say) 'cycling' inside 'activity'.

I can beautifully model the parent-child relationships of these 'tags' in
a graph DB as objects with a 'CHILD_OF' relationship type. Then the posts go
in and are linked to their 'tags' with a 'TAGGED_WITH' relationship
type. Fine.

However, here is the complex bit (well to me): I need to be able to find all
posts tagged with a set of tags (AND operation so each post must be tagged
with each tag).

What's more, these queries need to respect the parent relationships in the
tags. So if I search for 'activity' in 'france' it needs to traverse the
CHILD_OF relationships to find things tagged with any child of 'activity'
AND any child of 'france'.

It seems pretty easy to write a traversal class that would find all posts in
any child of either 'tag' node: simply follow 'CHILD_OF' and 'TAGGED_WITH'
backwards and return nodes of type post.

However, how to do the AND bit? The only way I can see is to return both
lists and intersect them in user code... however that seems inelegant and may
not scale brilliantly.
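
To make the 'return both lists and combine them' idea concrete, here is a
minimal sketch in plain Python (not Neo4j code; the CHILDREN and POSTS_BY_TAG
dictionaries and the function names are hypothetical stand-ins for following
'CHILD_OF' and 'TAGGED_WITH' backwards). It collects the set of posts under
each tag and keeps only the posts present in every set, starting from the
smallest set so the working result never grows.

```python
# Toy stand-in for the graph (hypothetical data, not Neo4j API).
# Reverse CHILD_OF: tag -> its direct child tags.
CHILDREN = {
    "europe": ["italy", "france"], "italy": ["tuscany"],
    "tuscany": [], "france": [],
    "activity": ["cycling"], "cycling": [],
}
# Reverse TAGGED_WITH: tag -> posts tagged directly with it.
POSTS_BY_TAG = {
    "tuscany": {"post1", "post3"},
    "france": {"post2"},
    "cycling": {"post1", "post2"},
}


def posts_under(tag):
    """All posts tagged with `tag` or with any descendant of `tag`."""
    found = set(POSTS_BY_TAG.get(tag, ()))
    for child in CHILDREN.get(tag, ()):
        found |= posts_under(child)
    return found


def posts_matching_all(query_tags):
    """Posts that fall under every tag in `query_tags` (AND semantics)."""
    # Smallest set first, so each intersection works on as few posts as possible.
    post_sets = sorted((posts_under(t) for t in query_tags), key=len)
    if not post_sets:
        return set()
    result = set(post_sets[0])
    for other in post_sets[1:]:
        result &= other
    return result
```

With this sample data, posts_matching_all(["activity", "france"]) keeps only
the post tagged with both 'cycling' and 'france'.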

Any ideas how to optimise?

Al
_______________________________________________
Neo mailing list
User at lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


