[Neo4j] graph-matching from web application

David Montag david.montag at neotechnology.com
Wed Jul 21 22:41:19 CEST 2010

Hi Jonathan,

On Wed, Jul 21, 2010 at 10:45 AM, Jonathan Marten <Gurkensalat at gmx.de> wrote:

> Hi David,
> thanks a lot for your answers! They were very helpful.

Happy to be of service.

> >I'm not sure I understand your setup. Could you describe a, b and c in more
> >detail? What do you mean by a subgraph in this case? What makes it a
> >subgraph, i.e. what is the greater graph?
> My setup is similar to someone looking for a chemical molecule. My database
> would hold the structure of 200 million molecules and the user wants to get
> information on every molecule that contains a certain structure. He can
> construct this structure using an HTML form and then we search in neo4j for
> all matching molecules (i.e. all that contain a CH2-CH=O or whatever people
> can think of). Neo4j returns the IDs of these molecules and then we use our
> existing Perl/PHP Scripts to retrieve more information from the relational
> database, visualize, and so on.

Interesting. Is it open source?

> >Well, this depends on how you roll it. If you have a separate database, then
> >you will have to access it via e.g. REST or using the remote graph db API.
> >But you can also have it embedded in your application, running in the
> >webapp. But you might not be using a Java webapp?
> Right, I'm not using a Java webapp. So my solution will probably be to
> implement a simple multi-threaded server in Java (for instance like the one
> at the end of this page:
> http://download.oracle.com/docs/cd/E17409_01/javase/tutorial/networking/sockets/clientServer.html)
> and then query that server from CGI-scripts on the webserver running the web
> application.

Will that server then expose highly specific operations for your domain, and
do all the processing locally before returning the result?
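
To make the question concrete, here is a rough sketch of what such a server
could look like. Everything in it is an assumption for illustration: the class
name MatchServerSketch, the one-request-per-line text protocol, and the handler
function that stands in for the actual graph matching. It only shows the shape
of "domain-specific operation in, result out", with one worker thread per
connection taken from a small pool.

```java
import java.io.BufferedReader;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

// Hypothetical line-based server: each request line is passed to a handler,
// and the handler's answer is written back as one response line.
class MatchServerSketch implements Closeable {
    private final ServerSocket serverSocket;
    private final ExecutorService pool = Executors.newFixedThreadPool(4);
    private final Function<String, String> handler;

    // Port 0 asks the operating system for any free port.
    MatchServerSketch(int port, Function<String, String> handler) throws IOException {
        this.serverSocket = new ServerSocket(port);
        this.handler = handler;
        Thread acceptor = new Thread(this::acceptLoop, "acceptor");
        acceptor.setDaemon(true);
        acceptor.start();
    }

    int port() {
        return serverSocket.getLocalPort();
    }

    private void acceptLoop() {
        while (!serverSocket.isClosed()) {
            try {
                Socket client = serverSocket.accept();
                pool.execute(() -> serve(client));
            } catch (IOException | RuntimeException e) {
                // accept() fails once the socket is closed; the loop condition
                // then ends this thread.
            }
        }
    }

    private void serve(Socket client) {
        try (Socket s = client;
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
             PrintWriter out = new PrintWriter(
                     new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true)) {
            String line;
            while ((line = in.readLine()) != null) {
                // All domain processing happens here, on the server side.
                out.println(handler.apply(line));
            }
        } catch (IOException e) {
            // The client disconnected; nothing more to do for this connection.
        }
    }

    @Override
    public void close() throws IOException {
        serverSocket.close();
        pool.shutdownNow();
    }
}
```

A CGI script would then open a socket to the server, write one line describing
the structure to search for, and read back one line with the matching molecule
IDs. The handler is where the graph matching would be plugged in.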

> >I don't understand what you mean. Please clarify. For example, you can't
> >attach properties to the graph. Only to nodes and relationships.
> That was exactly what I meant. I wanted to store the ID only once per
> molecule and not on every node.

You could keep a relationship from each node that belongs to a molecule to a
reference node for that molecule, e.g. node[element="sodium"] --PART_OF-->
node[id=123, name="salt molecule"], or something like that.
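
As a minimal sketch of that layout, here is the idea in plain Java. This is
not the Neo4j API: the Node class and the relateTo, follow and moleculeIdOf
helpers are made-up stand-ins for a property graph, only meant to show that
the ID lives on one molecule node and each atom reaches it through a single
PART_OF relationship.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java stand-in for a property graph (hypothetical, not the Neo4j API).
class MoleculeGraphSketch {
    static final class Node {
        final Map<String, Object> props = new HashMap<>();
        final Map<String, List<Node>> out = new HashMap<>();

        Node set(String key, Object value) {
            props.put(key, value);
            return this;
        }

        // Adds a directed relationship of the given type to the target node.
        void relateTo(String type, Node target) {
            out.computeIfAbsent(type, t -> new ArrayList<>()).add(target);
        }

        // Returns the targets of all outgoing relationships of the given type.
        List<Node> follow(String type) {
            return out.getOrDefault(type, Collections.emptyList());
        }
    }

    // Finds the molecule ID of an atom by following its PART_OF relationship.
    static Object moleculeIdOf(Node atom) {
        List<Node> molecules = atom.follow("PART_OF");
        if (molecules.isEmpty()) {
            throw new IllegalStateException("atom is not part of any molecule");
        }
        return molecules.get(0).props.get("id");
    }

    public static void main(String[] args) {
        // The ID is stored once, on the molecule's reference node.
        Node salt = new Node().set("id", 123).set("name", "salt molecule");
        Node sodium = new Node().set("element", "sodium");
        Node chlorine = new Node().set("element", "chlorine");
        sodium.relateTo("PART_OF", salt);
        chlorine.relateTo("PART_OF", salt);

        System.out.println(moleculeIdOf(sodium));   // prints 123
        System.out.println(moleculeIdOf(chlorine)); // prints 123
    }
}
```

The cost of this design is one extra relationship per atom, in exchange for
not repeating the ID property on every node.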

> >But to answer your questions: I think you always need to do matching
> >starting from a node. You can match subgraphs with properties using the
> >addPropertyConstraint method on PatternNode and PatternRelationship. You can
> >match relationships too using the PatternRelationship class.
> Thanks a lot, I thought it would be like that from reading the
> documentation, but I wasn't sure. What I wanted to do is sort of "abstract
> matching", i.e. I wanted to retrieve a structure like N--rel1-->N--rel2-->N
> anywhere in the database without knowing anything about the nodes and only
> knowing the relationship types. I have found a way of doing something like
> that by changing my database design. It will not be very efficient, but it
> should work.

If you'd like, you can post your proposed solution here so that others can
read it and get inspired by it. As for the matching problem, Paul makes a good
point in his e-mail regarding finding a good starting point for the matching
(choose your start node so that there are as few potential matches as
possible). And as for the text search, it won't work for sub-molecule
matching, I think.
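
To illustrate the starting-point idea for a chain like
N--rel1-->N--rel2-->N, here is a small sketch in plain Java. It does not use
the Neo4j matching classes; ChainMatchSketch, addEdge and matchChain are
hypothetical names, and the graph is just an in-memory edge list. The sketch
assumes relationships are indexed by type, picks the rarest relationship type
in the pattern as the seed, and grows each seed edge backward and forward
along the remaining types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory graph that matches chains of relationship types.
class ChainMatchSketch {
    static final class Edge {
        final int from; final String type; final int to;
        Edge(int from, String type, int to) { this.from = from; this.type = type; this.to = to; }
    }

    private final Map<String, List<Edge>> byType = new HashMap<>();
    private final Map<Integer, List<Edge>> outgoing = new HashMap<>();
    private final Map<Integer, List<Edge>> incoming = new HashMap<>();

    void addEdge(int from, String type, int to) {
        Edge e = new Edge(from, type, to);
        byType.computeIfAbsent(type, k -> new ArrayList<>()).add(e);
        outgoing.computeIfAbsent(from, k -> new ArrayList<>()).add(e);
        incoming.computeIfAbsent(to, k -> new ArrayList<>()).add(e);
    }

    private List<Edge> edgesOfType(String type) {
        return byType.getOrDefault(type, Collections.emptyList());
    }

    // Returns every node sequence n0 --types[0]--> n1 --types[1]--> ... in the graph.
    List<List<Integer>> matchChain(List<String> types) {
        List<List<Integer>> results = new ArrayList<>();
        if (types.isEmpty()) return results;
        // Seed on the rarest relationship type so that few candidates get expanded.
        int seed = 0;
        for (int i = 1; i < types.size(); i++) {
            if (edgesOfType(types.get(i)).size() < edgesOfType(types.get(seed)).size()) seed = i;
        }
        for (Edge e : edgesOfType(types.get(seed))) {
            for (List<Integer> head : backward(e.from, types, seed - 1)) {
                for (List<Integer> tail : forward(e.to, types, seed + 1)) {
                    List<Integer> path = new ArrayList<>(head);
                    path.addAll(tail);
                    results.add(path);
                }
            }
        }
        return results;
    }

    // Node sequences starting at 'node' that follow types[i], types[i+1], ... forward.
    private List<List<Integer>> forward(int node, List<String> types, int i) {
        List<List<Integer>> paths = new ArrayList<>();
        if (i == types.size()) {
            paths.add(new ArrayList<>(Collections.singletonList(node)));
            return paths;
        }
        for (Edge e : outgoing.getOrDefault(node, Collections.emptyList())) {
            if (!e.type.equals(types.get(i))) continue;
            for (List<Integer> rest : forward(e.to, types, i + 1)) {
                rest.add(0, node);
                paths.add(rest);
            }
        }
        return paths;
    }

    // Node sequences ending at 'node' that have followed types[0..i]; i == -1 means none.
    private List<List<Integer>> backward(int node, List<String> types, int i) {
        List<List<Integer>> paths = new ArrayList<>();
        if (i < 0) {
            paths.add(new ArrayList<>(Collections.singletonList(node)));
            return paths;
        }
        for (Edge e : incoming.getOrDefault(node, Collections.emptyList())) {
            if (!e.type.equals(types.get(i))) continue;
            for (List<Integer> head : backward(e.from, types, i - 1)) {
                head.add(node);
                paths.add(head);
            }
        }
        return paths;
    }

    public static void main(String[] args) {
        ChainMatchSketch g = new ChainMatchSketch();
        g.addEdge(1, "rel1", 2);
        g.addEdge(2, "rel2", 3);
        g.addEdge(4, "rel1", 5);
        g.addEdge(5, "rel1", 6);
        // rel2 occurs once and rel1 three times, so matching is seeded on rel2.
        System.out.println(g.matchChain(Arrays.asList("rel1", "rel2"))); // prints [[1, 2, 3]]
    }
}
```

In this toy graph only one edge is expanded instead of three. With 200
million molecules, the same choice decides how many candidate matches have to
be visited at all.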


> Thanks again for your help, I think I know what to do now.
> Best regards,
> Jonathan