[Neo] Import/export

Craig Taverner craig at amanzi.com
Tue Jan 19 10:07:09 CET 2010


I was wondering whether the neo-shell, or neo4j.rb in IRB, would solve this
requirement (easily creating or loading some initial graph). I have not
played much with the shell, but I know it has commands for making nodes
and relationships. I think it is best suited to interactive work, though,
and not ideal for scripting. The Ruby API, on the other hand, provides a
command-line/scripting DSL for generating a graph, and since reading data
from a file is also very easy in Ruby, you could read the file and create
the graph in a language not entirely unlike your original 'javascript-like'
example.

While on that note, I said javascript-like, but if we focus on the 'state
transfer' example you gave, we're talking about JSON, and I think I might
get a few votes for suggesting that as a nicer alternative to XML for a
generic data structure.

So, if you use JSON, the Java API and a JSON library would suffice to build
the graph. If you are willing to deviate a little from the syntax, you could
have the format in Ruby and directly executable in neo4j.rb (which I think
is even cooler :-)
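
As a rough illustration of the JSON route, here is a minimal sketch of a
two-pass "state transfer" import. Everything in it is an assumption made for
illustration: the JSON layout (a "nodes" list and a "rels" list) is just one
possible spelling of the node{...}/rel{...} example from the original mail,
and TinyGraph is an in-memory stand-in, not the Neo4j API. The old-to-new id
map is the translation table Rick describes further down the thread.

```python
import json


class TinyGraph:
    """Hypothetical in-memory stand-in for a graph store (NOT the Neo4j API)."""

    def __init__(self):
        self.nodes = {}       # new node id -> properties
        self.rels = []        # (start id, type, end id, properties)
        self._next_id = 100   # start high so new ids visibly differ from exported ones

    def create_node(self, **props):
        node_id = self._next_id
        self._next_id += 1
        self.nodes[node_id] = props
        return node_id

    def create_rel(self, start, rel_type, end, **props):
        self.rels.append((start, rel_type, end, props))


def import_state(graph, text):
    """Import a state-transfer dump without assuming anything about node ids."""
    data = json.loads(text)
    old_to_new = {}

    # Pass 1: create nodes and remember which new id each exported id became.
    for node in data["nodes"]:
        props = {k: v for k, v in node.items() if k != "id"}
        old_to_new[node["id"]] = graph.create_node(**props)

    # Pass 2: rewire relationships through the translation table.
    for rel in data["rels"]:
        props = {k: v for k, v in rel.items() if k not in ("start", "end", "type")}
        graph.create_rel(old_to_new[rel["start"]], rel["type"],
                         old_to_new[rel["end"]], **props)
    return old_to_new


# Assumed export format, modelled on the example in the original mail.
EXPORT = """
{"nodes": [{"id": 1, "name": "foo"}, {"id": 2}],
 "rels":  [{"start": 1, "end": 2, "type": "bar"}]}
"""

graph = TinyGraph()
id_map = import_state(graph, EXPORT)
```

With a real store, the two create_* calls would be replaced by whatever the
chosen API offers; the two-pass structure and the id map would stay the same.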

I also personally think both XML and JSON represent implicit tree
structures, and so any XML or JSON dataset can be loaded as a tree graph
with generic code (and no need for hashes of node ids or any caches).
However, things get slightly tricky when we need to translate XML/JSON
closed graph constructs into the graph, but even that seems achievable (with
hashes/caches ;-)
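
The "generic tree loader" idea can be sketched as follows. This is only an
illustration under stated assumptions: the mapping rules (scalar values become
node properties, nested objects and list items become child nodes, and the
JSON key becomes the relationship type) are one possible convention, and the
plain lists used for nodes and relationships stand in for a real graph store.
Note that no id lookup table is needed, because every child is created while
its parent is at hand.

```python
import json


def load_tree(doc):
    """Load a parsed JSON value as a tree: returns (nodes, rels).

    nodes is a list of property dicts (the list index is the node id);
    rels is a list of (parent id, key, child id) tuples.
    """
    nodes, rels = [], []

    def visit(value, parent, key):
        if isinstance(value, list):
            # A list contributes one child per item, all under the same key.
            for item in value:
                visit(item, parent, key)
            return
        if isinstance(value, dict):
            # Scalar members become properties of this node.
            props = {k: v for k, v in value.items()
                     if not isinstance(v, (dict, list))}
        else:
            # A bare scalar (e.g. inside a list) becomes a one-property node.
            props = {"value": value}
        nodes.append(props)
        node_id = len(nodes) - 1
        if parent is not None:
            rels.append((parent, key, node_id))
        if isinstance(value, dict):
            # Nested objects and lists become children, named by their key.
            for k, v in value.items():
                if isinstance(v, (dict, list)):
                    visit(v, node_id, k)

    visit(doc, None, None)
    return nodes, rels


doc = json.loads("""
{"name": "root",
 "children": [{"name": "a"},
              {"name": "b", "owner": {"name": "c"}}]}
""")
nodes, rels = load_tree(doc)
```

Cross-references between branches are exactly where this breaks down: as soon
as one element must point at a node created elsewhere in the document, some
key-to-node lookup is needed again.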

On Tue, Jan 19, 2010 at 8:55 AM, David Montag <david at montag.se> wrote:

> Hi,
>
> Having read the replies and thought about it more, I think my initial e-mail
> had a slightly wrong focus. The technical details that have surfaced so far
> are interesting, and would definitely be relevant, should an implementation
> be attempted.
>
> However.
>
> What I personally would like to know is, do you think there's a need for
> initial data sets in the first place? Because that is the problem that I
> initially set out to solve. Then I kind of got ahead of myself and started
> thinking about the hows, and not the whats and whys. Simply zipping up a
> couple of pre-populated stores with different graphs would actually solve
> the problem. Maybe not in the most elegant and/or maintainable way, but
> still. Export/import is a much broader feature.
>
> Opinions? Don't get me wrong, I'm not trying to kill the tech discussion.
> I'm just trying to solve the actual problem that I ran into. And if you
> think export/import would be useful too, great! I'd be happy to continue
> that discussion as well.
>
> Also, let me make it clear that this (i.e. initial data sets) isn't
> something I'm doing as a project for myself. I would expect it to be a
> community effort, benefiting everyone. So I actually *want* to know if you
> like the ideas or not, in addition to solutions. With the awesomeness that
> is the Neo4j community, it shouldn't be a problem. :)
>
> -David
>
> On Tue, Jan 19, 2010 at 2:32 AM, Rick Bullotta <
> rick.bullotta at burningskysoftware.com> wrote:
>
> > Actually, I think there's one other key "gotcha" to be aware of.
> >
> > Rewiring relationships when importing should not assume anything about the
> > nodeID's.  While the nodeID's are a useful "unique identifier" in the export
> > process, on import, you'd want to create a HashMap or similar structure that
> > you populate with the "old" and "new" node ID's as you create them in the
> > first pass through (nodes/properties), then use the "old" nodeIDs referenced
> > in the exported relationships as your lookup to get the "new" nodeIDs.
> >
> > Could be kinda memory intensive for really large graphs (since you'd have to
> > keep a HashMap entry of Long/Long for each node), but probably manageable.
> > In the worst case you could keep the translation table on disk and chunk it
> > in as needed.
> >
> > -----Original Message-----
> > From: user-bounces at lists.neo4j.org [mailto:user-bounces at lists.neo4j.org] On Behalf Of Rob Challen
> > Sent: Monday, January 18, 2010 6:25 PM
> > To: Neo user discussions
> > Subject: Re: [Neo] Import/export
> >
> > RDF seems a good candidate to me.
> >
> > Having said that it might just be pretty easy to write out the graph
> > in a spreadsheet (nodes and properties in one tab and relationship
> > triples and properties in another) and import that, as long as you
> > aren't fussed about maintaining data types.
> >
> > Rob.
> >
> > On 18/01/2010, Peter Neubauer <neubauer.peter at gmail.com> wrote:
> > > Hi David,
> > > one thing would be to provide example node spaces, maybe even as
> > > Amazon EC2 AMIs, or downloadable nodespaces.
> > >
> > > Regarding XML format, I think GraphML is the most standard thing
> > > there. Gremlin already has a GraphML importer that can be used to
> > > import data into Neo4j:
> > > http://wiki.github.com/tinkerpop/gremlin/graphml-reader-and-writer-library
> > > Probably not hard to write directly onto Neo4j.
> > >
> > > Does anyone know of a good binary format?
> > >
> > > WDYT?
> > >
> > > Cheers,
> > >
> > > /peter neubauer
> > >
> > > COO and Sales, Neo Technology
> > >
> > > GTalk:      neubauer.peter
> > > Skype       peter.neubauer
> > > Phone       +46 704 106975
> > > LinkedIn   http://www.linkedin.com/in/neubauer
> > > Twitter      http://twitter.com/peterneubauer
> > >
> > > http://www.neo4j.org                - Your high performance graph database.
> > > http://gremlin.tinkerpop.com    - PageRank in 2 lines of code.
> > >
> > >
> > >
> > > On Mon, Jan 18, 2010 at 8:37 PM, David Montag <david at montag.se> wrote:
> > >> Hi,
> > >>
> > >> This weekend I was toying around with Neo4j. I wanted to do some indexing
> > >> experiments. Unfortunately I found myself without a graph to work with.
> > >> Sure, I could write some code to generate a graph for me, but it'd be a
> > >> one-time-thing. I wanted to get going *now*. That got me thinking about
> > >> import/export functionality.
> > >>
> > >> I think a command-line import tool would be useful, accompanied by (and
> > >> built on) a Java API. Both of them would be tied to a certain
> > >> representation format. The export can be represented in different ways,
> > >> where two possible ways are:
> > >> - State transfer: (node{id:1, name:foo}, node{id:2},
> > >>   rel{start:1, end:2, type:bar}, ...)
> > >> - Operation transfer: (id1 = create node, id2 = create node,
> > >>   create rel id1->id2 type bar, ...)
> > >>
> > >> I guess the state transfer feels like the more straightforward one. The
> > >> diff-style nature of the operation transfer might be useful in other
> > >> cases.
> > >>
> > >> When I first thought of this, the target user was somebody who wanted to
> > >> get started with a graph, and didn't want to write code to do an import
> > >> "manually". Maybe the import/export can extend to other use cases, but
> > >> this was the primary one. A possible workflow could be db exported to
> > >> file, file published, file downloaded, file imported into db.
> > >>
> > >> In the end, it would be great if new users could download sample data
> > >> sets and import them into a Neo4j instance without writing a single line
> > >> of code. Which also gets me thinking about a command-line tool to create
> > >> an empty Neo4j instance to import into. The actual implementations of the
> > >> tools are trivial. It's the discussion that leads to the implementation
> > >> that's important.
> > >>
> > >> Does this sound like anything that would interest people? If so, (digging
> > >> into details) what kind of representation do you guys think would be
> > >> best? I was thinking XML, but a binary format might be better for
> > >> performance (size/primitives ratio). Maybe both? Because I do like the
> > >> idea of a human-readable (and editable) format. If you don't think it
> > >> would be useful I would love to hear why.
> > >>
> > >> This is just a brain dump of my thoughts. Surely others have thought of
> > >> this as well. I'm just getting the discussion started. WDYT?
> > >>
> > >> -David
> > >> _______________________________________________
> > >> Neo mailing list
> > >> User at lists.neo4j.org
> > >> https://lists.neo4j.org/mailman/listinfo/user
> > >>

