[Neo] Import/export

Rick Bullotta rick.bullotta at burningskysoftware.com
Tue Jan 19 12:26:55 CET 2010


Yes, there is a need, and to make it work, we needed to work out the technical details!

It was sheer coincidence, but I wrote code for this exact use case yesterday.  I'm using an XML file as the source for prepopulating our Neo structures.  It is very domain-specific, however, so much of the relationship logic is in the code rather than in the XML file.

In any case, I agree with your general view that import and export are of value. 

 

-----Original Message-----
From: David Montag <david at montag.se>
Date: Tue, 19 Jan 2010 08:55:29 
To: Neo user discussions<user at lists.neo4j.org>
Subject: Re: [Neo] Import/export

Hi,

Having read the replies and thought about it more, I think my initial e-mail
had a slightly wrong focus. The technical details that have surfaced so far
are interesting, and would definitely be relevant, should an implementation
be attempted.

However.

What I personally would like to know is, do you think there's a need for
initial data sets in the first place? Because that is the problem that I
initially set out to solve. Then I kind of got ahead of myself and started
thinking about the hows, and not the whats and whys. Simply zipping up a
couple of pre-populated stores with different graphs would actually solve
the problem. Maybe not in the most elegant and/or maintainable way, but
still. Export/import is a much broader feature.

Opinions? Don't get me wrong, I'm not trying to kill the tech discussion.
I'm just trying to solve the actual problem that I ran into. And if you
think export/import would be useful too, great! I'd be happy to continue
that discussion as well.

Also, let me make it clear that this (i.e. initial data sets) isn't
something I'm doing as a project for myself. I would expect it to be a
community effort, benefiting everyone. So I actually *want* to know if you
like the ideas or not, in addition to solutions. With the awesomeness that
is the Neo4j community, it shouldn't be a problem. :)

-David

On Tue, Jan 19, 2010 at 2:32 AM, Rick Bullotta <rick.bullotta at burningskysoftware.com> wrote:

> Actually, I think there's one other key "gotcha" to be aware of.
>
> Rewiring relationships when importing should not assume anything about the
> node IDs.  While the node IDs are a useful "unique identifier" in the export
> process, on import you'd want to create a HashMap or similar structure that
> you populate with the "old" and "new" node IDs as you create them in the
> first pass through (nodes/properties), then use the "old" node IDs referenced
> in the exported relationships as your lookup to get the "new" node IDs.
>
> Could be kinda memory intensive for really large graphs (since you'd have to
> keep a HashMap entry of Long/Long for each node), but probably manageable.
> In the worst case you could keep the translation table on disk and chunk it
> in as needed.
>
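A minimal sketch of the two-pass remapping described above, in Python for brevity (a plain dict plays the role of the HashMap). The export layout and the `InMemoryGraph` class are made up for illustration and are not Neo4j's API:

```python
# Hypothetical export: nodes and relationships keyed by the *old* node IDs.
exported_nodes = [
    {"id": 17, "props": {"name": "foo"}},
    {"id": 42, "props": {}},
]
exported_rels = [
    {"start": 17, "end": 42, "type": "bar", "props": {}},
]


class InMemoryGraph:
    """Stand-in for the target store; it hands out its own node IDs."""

    def __init__(self):
        self.nodes = {}    # new node ID -> properties
        self.rels = []     # (start ID, end ID, type, properties)
        self._next_id = 0

    def create_node(self, props):
        node_id = self._next_id
        self._next_id += 1
        self.nodes[node_id] = dict(props)
        return node_id

    def create_rel(self, start, end, rel_type, props):
        self.rels.append((start, end, rel_type, dict(props)))


def import_graph(graph, nodes, rels):
    # Pass 1: create the nodes and record old ID -> new ID.
    old_to_new = {}
    for node in nodes:
        old_to_new[node["id"]] = graph.create_node(node["props"])
    # Pass 2: rewire relationships through the translation table.
    for rel in rels:
        graph.create_rel(old_to_new[rel["start"]], old_to_new[rel["end"]],
                         rel["type"], rel["props"])
    return old_to_new


graph = InMemoryGraph()
mapping = import_graph(graph, exported_nodes, exported_rels)
print(mapping)      # {17: 0, 42: 1}
print(graph.rels)   # [(0, 1, 'bar', {})]
```

The translation table is only needed until the second pass finishes, so it can be dropped (or, for very large graphs, kept on disk) once the relationships are in place.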
> -----Original Message-----
> From: user-bounces at lists.neo4j.org [mailto:user-bounces at lists.neo4j.org] On Behalf Of Rob Challen
> Sent: Monday, January 18, 2010 6:25 PM
> To: Neo user discussions
> Subject: Re: [Neo] Import/export
>
> RDF seems a good candidate to me.
>
> Having said that it might just be pretty easy to write out the graph
> in a spreadsheet (nodes and properties in one tab and relationship
> triples and properties in another) and import that, as long as you
> aren't fussed about maintaining data types.
>
> Rob.
>
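Rob's two-tab layout could be approximated with two CSV tables, one for nodes and one for relationship triples. This is only a sketch: the column names and sample data are invented, and in-memory buffers stand in for the spreadsheet tabs. It also shows the data-type caveat, since everything comes back as a string:

```python
import csv
import io

# Invented sample data: node ID -> properties, and (start, type, end, props).
nodes = {1: {"name": "foo", "age": 30}, 2: {"name": "baz", "age": 41}}
rels = [(1, "bar", 2, {"since": 2009})]


def write_nodes(nodes, out):
    # "Tab" 1: one row per node, one column per property.
    writer = csv.writer(out)
    writer.writerow(["id", "name", "age"])
    for node_id, props in nodes.items():
        writer.writerow([node_id, props["name"], props["age"]])


def write_rels(rels, out):
    # "Tab" 2: one row per relationship triple plus its properties.
    writer = csv.writer(out)
    writer.writerow(["start", "type", "end", "since"])
    for start, rel_type, end, props in rels:
        writer.writerow([start, rel_type, end, props["since"]])


nodes_tab, rels_tab = io.StringIO(), io.StringIO()
write_nodes(nodes, nodes_tab)
write_rels(rels, rels_tab)

# Reading the node table back: the integer 30 is now the string "30".
nodes_tab.seek(0)
rows = list(csv.DictReader(nodes_tab))
print(type(rows[0]["age"]))   # <class 'str'>
```

An importer would have to guess or be told the types, for instance through a naming convention in the header row.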
> On 18/01/2010, Peter Neubauer <neubauer.peter at gmail.com> wrote:
> > Hi David,
> > one thing would be to provide example node spaces, maybe even as
> > Amazon EC2 AMIs, or downloadable nodespaces.
> >
> > Regarding XML formats, I think GraphML is the most standard thing
> > there. Gremlin already has a GraphML importer that can be used to
> > import data into Neo4j:
> > http://wiki.github.com/tinkerpop/gremlin/graphml-reader-and-writer-library
> > Probably not hard to write directly onto Neo4j.
> >
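For a feel of what a GraphML file involves, here is a small sketch that reads a minimal document with Python's standard library. It does not use the Gremlin importer mentioned above; the sample graph and the `read_graphml` helper are assumptions for illustration, and it ignores the declared attribute types:

```python
import xml.etree.ElementTree as ET

# Invented sample: two nodes, one directed edge, one property on each side.
GRAPHML = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key id="name" for="node" attr.name="name" attr.type="string"/>
  <key id="type" for="edge" attr.name="type" attr.type="string"/>
  <graph id="G" edgedefault="directed">
    <node id="1"><data key="name">foo</data></node>
    <node id="2"/>
    <edge source="1" target="2"><data key="type">bar</data></edge>
  </graph>
</graphml>
"""

NS = {"g": "http://graphml.graphdrawing.org/xmlns"}


def read_graphml(text):
    """Return (nodes, edges) from the first <graph> element."""
    root = ET.fromstring(text)
    graph = root.find("g:graph", NS)
    nodes = {}
    for node in graph.findall("g:node", NS):
        nodes[node.get("id")] = {
            d.get("key"): d.text for d in node.findall("g:data", NS)
        }
    edges = []
    for edge in graph.findall("g:edge", NS):
        props = {d.get("key"): d.text for d in edge.findall("g:data", NS)}
        edges.append((edge.get("source"), edge.get("target"), props))
    return nodes, edges


nodes, edges = read_graphml(GRAPHML)
print(nodes)   # {'1': {'name': 'foo'}, '2': {}}
print(edges)   # [('1', '2', {'type': 'bar'})]
```

The node IDs in the file are just labels, so a real importer would still need to map them to whatever IDs the target store assigns.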
> > Does anyone know of a good binary format?
> >
> > WDYT?
> >
> > Cheers,
> >
> > /peter neubauer
> >
> > COO and Sales, Neo Technology
> >
> > GTalk:      neubauer.peter
> > Skype       peter.neubauer
> > Phone       +46 704 106975
> > LinkedIn   http://www.linkedin.com/in/neubauer
> > Twitter      http://twitter.com/peterneubauer
> >
> > http://www.neo4j.org                - Your high performance graph database.
> > http://gremlin.tinkerpop.com    - PageRank in 2 lines of code.
> >
> >
> >
> > On Mon, Jan 18, 2010 at 8:37 PM, David Montag <david at montag.se> wrote:
> >> Hi,
> >>
> >> This weekend I was toying around with Neo4j. I wanted to do some indexing
> >> experiments. Unfortunately I found myself without a graph to work with.
> >> Sure, I could write some code to generate a graph for me, but it'd be a
> >> one-time thing. I wanted to get going *now*. That got me thinking about
> >> import/export functionality.
> >>
> >> I think a command-line import tool would be useful, accompanied by (and
> >> built on) a Java API. Both of them would be tied to a certain
> >> representation format. The export can be represented in different ways,
> >> two of which are:
> >> - State transfer: (node{id:1, name:foo}, node{id:2},
> >>   rel{start:1, end:2, type:bar}, ...)
> >> - Operation transfer: (id1 = create node, id2 = create node,
> >>   create rel id1->id2 type bar, ...)
> >>
> >> I guess the state transfer feels like the more straightforward one. The
> >> diff-style nature of the operation transfer might be useful in other
> >> cases.
> >>
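The two representations above can be put side by side in a short sketch. The tuple layout of the operation log and the `replay` helper are invented for illustration; the point is only that replaying the operations yields the same graph that the state transfer describes directly:

```python
# State transfer: the graph as it is.
state = {
    "nodes": [{"id": 1, "name": "foo"}, {"id": 2}],
    "rels": [{"start": 1, "end": 2, "type": "bar"}],
}

# Operation transfer: a log of the steps that build the same graph.
# Handles such as "id1" only have meaning within the log itself.
operations = [
    ("create_node", "id1", {"name": "foo"}),
    ("create_node", "id2", {}),
    ("create_rel", "id1", "id2", "bar"),
]


def replay(operations):
    """Apply an operation log to an empty graph and return its state."""
    handles = {}          # log handle -> node ID assigned during replay
    nodes, rels = [], []
    for op in operations:
        if op[0] == "create_node":
            _, handle, props = op
            node_id = len(nodes) + 1
            handles[handle] = node_id
            nodes.append({"id": node_id, **props})
        elif op[0] == "create_rel":
            _, start, end, rel_type = op
            rels.append({"start": handles[start],
                         "end": handles[end],
                         "type": rel_type})
        else:
            raise ValueError(f"unknown operation: {op[0]}")
    return {"nodes": nodes, "rels": rels}


assert replay(operations) == state
```

An operation log can also be applied on top of an existing graph, which is where its diff-style nature would pay off.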
> >> When I first thought of this, the target user was somebody who wanted to
> >> get started with a graph, and didn't want to write code to do an import
> >> "manually". Maybe the import/export can extend to other use cases, but
> >> this was the primary one. A possible workflow could be: db exported to
> >> file, file published, file downloaded, file imported into db.
> >>
> >> In the end, it would be great if new users could download sample data
> >> sets and import them into a Neo4j instance without writing a single line
> >> of code. Which also gets me thinking about a command-line tool to create
> >> an empty Neo4j instance to import into. The actual implementations of the
> >> tools are trivial. It's the discussion that leads to the implementation
> >> that's important.
> >>
> >> Does this sound like anything that would interest people? If so,
> >> (digging into details) what kind of representation do you guys think
> >> would be best? I was thinking XML, but a binary format might be better
> >> for performance (size/primitives ratio). Maybe both? Because I do like
> >> the idea of a human-readable (and editable) format. If you don't think
> >> it would be useful I would love to hear why.
> >>
> >> This is just a brain dump of my thoughts. Surely others have thought of
> >> this as well. I'm just getting the discussion started. WDYT?
> >>
> >> -David
> >> _______________________________________________
> >> Neo mailing list
> >> User at lists.neo4j.org
> >> https://lists.neo4j.org/mailman/listinfo/user
> >>
> >
>
>
_______________________________________________
Neo mailing list
User at lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user

