[Neo4j] Cypher Aggregation functions - specifically SUM()

n_aschbacher nathan at cellsixtyone.com
Wed Oct 26 01:56:23 CEST 2011


Hi,

Thanks for considering my input and getting back to me on this issue.  I'm
glad to hear that this kind of functionality is being thought out, because
its addition to Cypher will significantly enhance its usefulness as a robust
graph traversal/query language.


Andres Taylor wrote:
> 
>    1. Cypher needs a way to turn an iterable of nodes to an iterable of
>    numeric values. Something like Scala collection's map method. It could
>    look something like this: RETURN MAP(x in NODES(p) : x.votes)
> 

It's funny you use that notation, because I tried several different forms of
that after I saw its use in the predicate functions.  Except I'm not sure I
see how that notation deals with the aggregation issue.  It would seem to
still need an aggregation applied to the elements of the List collection
like:

RETURN SUM(MAP(x in NODES(p) : x.votes))
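
To spell out what I'd expect that expression to compute for a single row,
here is a minimal Python sketch.  The dict-based nodes, the vote values and
the helper name sum_votes are all made up for illustration; none of this is
Cypher or a Neo4j API.

```python
# Hypothetical stand-in for one matched path: a list of nodes,
# each node being a dict of properties.
path = [{"votes": 3}, {"votes": 5}, {"votes": 1}]


def sum_votes(nodes):
    # MAP(x in NODES(p) : x.votes) -> one numeric value per node
    votes = [x["votes"] for x in nodes]
    # SUM(...) -> collapse that list into a single number for this row
    return sum(votes)


total = sum_votes(path)
```

The point is that the map step alone still leaves a list per row, so some
reduction over that list is needed to get one number per path.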

However, the alternative would be to allow nesting when the result set is
comprised of nodes.  As in, if you could use:

START n = node(START i = node(0) MATCH p = i--() RETURN NODES(p)) RETURN
SUM(n.votes)

It's cumbersome, but being able to nest query start conditions might be more
useful generally.  It resolves the ambiguity you highlight below by forcing
the user to choose explicitly.  I see two problems with this: you can't
aggregate relationship properties the same way, and it's almost certainly
less efficient than it could be, because you've broken up what should be two
interleaved processes (caching totals and aggregating at the next traversal
step) into two serially dependent ones.
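
By "interleaved" I mean something like the following rough Python sketch,
where the running total travels with the traversal instead of being computed
in a second pass.  The tree layout, the node ids and the function name
branch_totals are hypothetical; I'm only assuming a tree of posts and replies
with a votes value on each node.

```python
# Hypothetical post/reply tree: node id -> (votes, child ids).
tree = {
    "root": (2, ["a", "b"]),
    "a": (4, ["c"]),
    "b": (1, []),
    "c": (7, []),
}


def branch_totals(tree, start):
    """Yield (path, running_total) for every path that begins at start."""
    # Each stack entry carries the total accumulated so far, so the sum is
    # extended at each traversal step instead of being recomputed afterwards.
    stack = [([start], tree[start][0])]
    while stack:
        path, total = stack.pop()
        yield path, total
        for child in tree[path[-1]][1]:
            stack.append((path + [child], total + tree[child][0]))


totals = {"/".join(p): t for p, t in branch_totals(tree, "root")}
best = max(totals, key=totals.get)  # branch with the highest vote total
```

Each node's votes are added exactly once per path, at the moment the path is
extended, which is the saving the nested form would give up.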

The SUM(MAP(<Iterable> : _._)) expression makes a lot more sense. 


Andres Taylor wrote:
> 
>    2. Aggregate functions need to be able to work on iterables, and not
>    just on multiple subgraphs. The problem here is how to make it obvious
>    which one you are trying to use, e.g.
>    RETURN foo.bar, COUNT(NODES(path))
>    Does that mean aggregate on foo.bar and return the number of paths, or
>    does it mean that you want to know the number of nodes in path?
> 

I'm a little confused by what you mean.  Isn't there always only one path
returned per row when you provide additional columns like "foo.bar"?  I
mean "RETURN foo.bar, path" will always produce one Node (or its property in
this case) and the one path traversed to reach that node per row.

So by default it would have to mean "return the number of nodes in path"
(for this return row).
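
As I understand them, the two readings differ like this.  This is only a
Python sketch over made-up result rows; the row contents and variable names
are hypothetical.

```python
from collections import Counter

# Hypothetical result rows: (foo.bar value, nodes on the path for that row).
rows = [
    ("x", ["n1", "n2"]),
    ("x", ["n1", "n2", "n3"]),
    ("y", ["n4"]),
]

# Reading 1: aggregate across rows, grouping on foo.bar -> number of paths.
paths_per_key = Counter(key for key, _ in rows)

# Reading 2: apply COUNT to the collection in each row -> nodes in that path.
nodes_per_row = [(key, len(nodes)) for key, nodes in rows]
```

The first reading collapses rows that share a foo.bar value; the second keeps
one result per row, which is the behaviour I was assuming.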


Andres Taylor wrote:
> 
> If/when this is done, your query would look something like this:
> 
> RETURN SUM(MAP(x in NODES(path) : x.votes))
> 

I should read more carefully before I start typing.  :-)  Yes.  This makes
the most sense to me.





On Tue, Oct 25, 2011 at 12:02 PM, n_aschbacher <nathan@> wrote:

> Hi,
>
> You're correct in one sense.  If I remove the path, or other columns, from
> the RETURN statement then I can get a single SUM value back for all the
> properties in the entire tree below my starting node.
>
> My problem is that I want to return multiple rows, a row for each path
> through the graph, with the SUM of the properties of the nodes traversed so
> far on that single path.
>
> The idea is that I want to know which branch in the tree of posts and
> replies has the highest total vote count.
>
> Removing other columns from the RETURN statement as you suggest will only
> give me the SUM of votes for the whole tree, not per branch.
>
> Cheers!
>
> --
> View this message in context:
> http://neo4j-community-discussions.438527.n3.nabble.com/Cypher-Aggregation-functions-specifically-SUM-tp3450203p3450996.html
> Sent from the Neo4j Community Discussions mailing list archive at
> Nabble.com.
> _______________________________________________
> Neo4j mailing list
> User at .neo4j
> https://lists.neo4j.org/mailman/listinfo/user
>



--
View this message in context: http://neo4j-community-discussions.438527.n3.nabble.com/Cypher-Aggregation-functions-specifically-SUM-tp3450203p3453162.html
Sent from the Neo4j Community Discussions mailing list archive at Nabble.com.

