[Neo] How to efficiently query in Neo4J?
rick.bullotta at burningskysoftware.com
Wed Apr 7 17:24:54 CEST 2010
Here's a rough description of how we're handling something similar:
- Create a "tag query collection" consisting of the node(s) for each tag in
your query (TQNodes)
- Given the tags to match, determine the "specificity" of each tag (e.g.
depth in the tag hierarchy)
- For the most specific tag, using relationships, traverse applicable
nodes/posts to first reduce the valid set of nodes to a smaller number.
We'll call this collection PostNodes
- For each node/post (PostNode) in the returned list (PostNodes), get a list
of tag relationships, grab the node IDs for these tags, and put them in a
"tag post collection" (TPNodes)
- For each tag node (TPNode) in TPNodes, check for a match in your "tag
query collection" (TQNodes). If TPNode has no match, get the parent of
TPNode and check whether it is in TQNodes. Continue until you find a match
or reach the root node in the tag hierarchy. If you reach the root node,
you do not have a match, so try the next PostNode in PostNodes. If you find
a match for TPNode, try the next TPNode in TPNodes.
If all the nodes in TPNodes have a match, include this PostNode in your
results.
I'd love to know if there's a more effective way to do this as well!
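A rough sketch of the approach described above, in plain Python rather than the Neo4j API. The dictionaries, tag names, post IDs, and function names are all made up for illustration. One deliberate difference: the steps as written check each of a post's tags against the query, whereas this sketch checks each query tag against the post's tags, which is my reading of the AND requirement in the original question.

```python
# Hypothetical in-memory stand-in for the graph; not the Neo4j API.
PARENT = {  # CHILD_OF: tag -> parent tag (None marks a root)
    "europe": None, "italy": "europe", "france": "europe", "tuscany": "italy",
    "activity": None, "cycling": "activity",
}
POST_TAGS = {  # TAGGED_WITH: post -> tags
    "p1": {"tuscany", "cycling"},
    "p2": {"france", "cycling"},
    "p3": {"tuscany"},
}


def depth(tag):
    """Specificity of a tag: number of CHILD_OF hops up to its root."""
    hops = 0
    while PARENT[tag] is not None:
        tag = PARENT[tag]
        hops += 1
    return hops


def is_under(tag, ancestor):
    """True if tag is the ancestor itself or one of its descendants."""
    while tag is not None:
        if tag == ancestor:
            return True
        tag = PARENT[tag]
    return False


def find_posts(query_tags):
    # Start from the most specific query tag to shrink the candidate set.
    most_specific = max(query_tags, key=depth)
    # Scanning every post here stands in for a relationship traversal.
    candidates = [
        post for post, tags in POST_TAGS.items()
        if any(is_under(t, most_specific) for t in tags)
    ]
    remaining = [q for q in query_tags if q != most_specific]
    # Keep a candidate only if every remaining query tag is covered
    # by at least one of its tags (walking up the hierarchy).
    return sorted(
        post for post in candidates
        if all(any(is_under(t, q) for t in POST_TAGS[post]) for q in remaining)
    )


print(find_posts(["activity", "italy"]))
```

The point of starting with the deepest tag is that the expensive per-post check only runs on a small candidate list.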
From: user-bounces at lists.neo4j.org [mailto:user-bounces at lists.neo4j.org] On
Behalf Of Alastair James
Sent: Wednesday, April 07, 2010 10:54 AM
To: Neo user discussions
Subject: [Neo] How to efficiently query in Neo4J?
I am looking at moving a website to a model based on Neo4J, however, I am
having trouble seeing how to optimise the 'main query' type for Neo4J.
Briefly, the site consists of posts, each tagged with various attributes,
e.g. (it's a travel site) location, theme, cost, etc. Also, the tags
are hierarchical. So, for location we have (say) 'tuscany' inside 'italy'
inside 'europe'. For theme we have (say) 'cycling' inside 'activity'.
I can model the parent-child relationships of these 'tags' neatly in a
graph DB, as nodes linked by a 'CHILD_OF' relationship type. The posts then
go in and are linked to their 'tags' with a 'TAGGED_WITH' relationship.
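To make the model concrete, here is a minimal sketch of that structure using plain Python objects instead of Neo4j nodes. The class names, the `path_to_root` helper, and the example post are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Tag:
    name: str
    parent: Optional["Tag"] = None  # stands in for a CHILD_OF relationship


@dataclass
class Post:
    title: str
    tags: list = field(default_factory=list)  # stands in for TAGGED_WITH


europe = Tag("europe")
italy = Tag("italy", europe)
tuscany = Tag("tuscany", italy)
activity = Tag("activity")
cycling = Tag("cycling", activity)

# Hypothetical post tagged with one location and one theme.
post = Post("example post", [tuscany, cycling])


def path_to_root(tag):
    """Names from a tag up through its ancestors, following CHILD_OF."""
    names = []
    while tag is not None:
        names.append(tag.name)
        tag = tag.parent
    return names


print(path_to_root(tuscany))
```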
However, here is the complex bit (well, to me): I need to be able to find all
posts tagged with a set of tags (AND operation so each post must be tagged
with each tag).
What's more, these queries need to respect the parent relationships in the
tags. So if I search for 'activity' in 'france', it needs to traverse the
CHILD_OF relationships to find things tagged with any child of 'activity'
AND any child of 'france'.
It seems pretty easy to write a traversal class that would find all posts in
any child of either 'tag' node: simply follow 'CHILD_OF' and 'TAGGED_WITH'
backwards and return nodes of type post.
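As a sketch of that single-tag traversal, assuming the relationships are held as plain Python edge lists rather than in Neo4j (all names and sample data here are hypothetical):

```python
from collections import defaultdict

# Hypothetical edge lists standing in for the graph relationships.
CHILD_OF = [("italy", "europe"), ("france", "europe"),
            ("tuscany", "italy"), ("cycling", "activity")]
TAGGED_WITH = [("p1", "tuscany"), ("p1", "cycling"),
               ("p2", "france"), ("p2", "cycling"), ("p3", "tuscany")]

# Reverse indexes, so both relationships can be followed backwards.
children = defaultdict(set)
for child, parent in CHILD_OF:
    children[parent].add(child)

posts_by_tag = defaultdict(set)
for post, tag in TAGGED_WITH:
    posts_by_tag[tag].add(post)


def posts_under(tag):
    """Posts tagged with the given tag or with any of its descendants."""
    found, seen, stack = set(), set(), [tag]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        found |= posts_by_tag.get(current, set())
        stack.extend(children.get(current, ()))
    return found


print(sorted(posts_under("italy")))
```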
However, how to do the AND bit? The only way I can see is to return both
lists and intersect them in user code. That seems inelegant, though, and may
not scale brilliantly.
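For reference, the "combine in user code" option amounts to a set intersection. A minimal sketch, assuming each per-tag traversal has already produced a set of post IDs (the function name and sample sets are hypothetical):

```python
def intersect_all(post_sets):
    """Posts present in every set; the smallest set is processed first."""
    ordered = sorted(post_sets, key=len)
    if not ordered:
        return set()
    result = set(ordered[0])
    for posts in ordered[1:]:
        result &= posts
        if not result:
            break  # nothing can match once the intersection is empty
    return result


# Hypothetical results of two per-tag traversals.
under_activity = {"p1", "p2"}
under_italy = {"p1", "p3"}

print(intersect_all([under_activity, under_italy]))
```

Starting from the smallest set keeps the intermediate result small, but every per-tag list still has to be built in full first, which is the scaling concern.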
Any ideas how to optimise?