[Neo] Import/export

Rick Bullotta rick.bullotta at burningskysoftware.com
Tue Jan 19 02:32:32 CET 2010


Actually, I think there's one other key "gotcha" to be aware of.  

Rewiring relationships on import should not assume anything about the
node IDs. While the node IDs are a useful "unique identifier" in the export
process, the importing database will assign its own. On import, you'd want
to create a HashMap or similar structure that you populate with the "old"
and "new" node IDs as you create nodes in the first pass (nodes/properties),
then use the "old" node IDs referenced in the exported relationships as your
lookup key to get the "new" node IDs.
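
To make the two passes concrete, here is a rough sketch of what I mean. It
is not tied to the real Neo4j API: TargetGraph, ExportedRel and importGraph
are made-up names, and TargetGraph is just an in-memory stand-in that hands
out its own IDs the way a real database would.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RemapImport {

    /** One exported relationship; start/end are node IDs from the SOURCE database. */
    static final class ExportedRel {
        final long oldStart;
        final long oldEnd;
        final String type;

        ExportedRel(long oldStart, long oldEnd, String type) {
            this.oldStart = oldStart;
            this.oldEnd = oldEnd;
            this.type = type;
        }
    }

    /** Hypothetical stand-in for the target database, which assigns its own IDs. */
    static final class TargetGraph {
        private long nextId = 1000; // deliberately unlike the exported IDs
        final List<String> relationships = new ArrayList<>();

        long createNode() {
            return nextId++;
        }

        void createRelationship(long start, long end, String type) {
            relationships.add(start + "-[" + type + "]->" + end);
        }
    }

    /** Imports nodes first, then rewires relationships through the old->new table. */
    static Map<Long, Long> importGraph(List<Long> oldNodeIds,
                                       List<ExportedRel> exportedRels,
                                       TargetGraph target) {
        Map<Long, Long> oldToNew = new HashMap<>();

        // Pass 1: create every node and remember which new ID replaced which old ID.
        for (long oldId : oldNodeIds) {
            oldToNew.put(oldId, target.createNode());
        }

        // Pass 2: translate both endpoints before creating each relationship.
        for (ExportedRel rel : exportedRels) {
            Long start = oldToNew.get(rel.oldStart);
            Long end = oldToNew.get(rel.oldEnd);
            if (start == null || end == null) {
                throw new IllegalStateException(
                        "relationship refers to a node that was never exported");
            }
            target.createRelationship(start, end, rel.type);
        }
        return oldToNew;
    }
}
```

The point is that nothing in pass 2 ever uses an exported ID directly; it
only ever goes through the table.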

This could be kinda memory-intensive for really large graphs (since you'd
have to keep a Long/Long HashMap entry for each node), but it's probably
manageable. In the worst case you could keep the translation table on disk
and read it in chunks as needed.
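
One way to cut the per-node cost before resorting to disk, sketched below,
is to swap the boxed Long/Long HashMap for two parallel long[] arrays and a
binary search. This is my own variation rather than anything Neo4j
provides, LongTranslationTable is a made-up name, and it assumes the export
lists nodes in ascending "old" ID order so entries can simply be appended.

```java
import java.util.Arrays;

class LongTranslationTable {
    private long[] oldIds = new long[16]; // kept sorted ascending
    private long[] newIds = new long[16]; // newIds[i] replaces oldIds[i]
    private int size = 0;

    /** Appends a mapping; old IDs must arrive in strictly ascending order. */
    void add(long oldId, long newId) {
        if (size > 0 && oldId <= oldIds[size - 1]) {
            throw new IllegalArgumentException(
                    "old IDs must be added in ascending order");
        }
        if (size == oldIds.length) {
            // Grow both arrays together so the indexes stay aligned.
            oldIds = Arrays.copyOf(oldIds, size * 2);
            newIds = Arrays.copyOf(newIds, size * 2);
        }
        oldIds[size] = oldId;
        newIds[size] = newId;
        size++;
    }

    /** Returns the new ID for an old ID, or -1 if the old ID is unknown. */
    long lookup(long oldId) {
        int i = Arrays.binarySearch(oldIds, 0, size, oldId);
        return i >= 0 ? newIds[i] : -1L;
    }

    int size() {
        return size;
    }
}
```

That stores two primitive longs per node (plus whatever slack the array
doubling leaves) at the price of an O(log n) lookup instead of a hash
lookup. If the export order isn't sorted, you'd have to sort the pairs
before the relationship pass.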

-----Original Message-----
From: user-bounces at lists.neo4j.org [mailto:user-bounces at lists.neo4j.org] On
Behalf Of Rob Challen
Sent: Monday, January 18, 2010 6:25 PM
To: Neo user discussions
Subject: Re: [Neo] Import/export

RDF seems like a good candidate to me.

Having said that it might just be pretty easy to write out the graph
in a spreadsheet (nodes and properties in one tab and relationship
triples and properties in another) and import that, as long as you
aren't fussed about maintaining data types.

Rob.

On 18/01/2010, Peter Neubauer <neubauer.peter at gmail.com> wrote:
> Hi David,
> one thing would be to provide example node spaces, maybe even as
> Amazon EC2 AMIs, or downloadable nodespaces.
>
> Regarding XML formats, I think GraphML is the most standard thing
> there. Gremlin already has a GraphML importer that can be used to
> import data into Neo4j:
> http://wiki.github.com/tinkerpop/gremlin/graphml-reader-and-writer-library
> It's probably not hard to write one directly against Neo4j.
>
> Does anyone know of a good binary format as an alternative?
>
> WDYT?
>
> Cheers,
>
> /peter neubauer
>
> COO and Sales, Neo Technology
>
> GTalk:      neubauer.peter
> Skype       peter.neubauer
> Phone       +46 704 106975
> LinkedIn   http://www.linkedin.com/in/neubauer
> Twitter      http://twitter.com/peterneubauer
>
> http://www.neo4j.org                - Your high performance graph database.
> http://gremlin.tinkerpop.com    - PageRank in 2 lines of code.
>
>
>
> On Mon, Jan 18, 2010 at 8:37 PM, David Montag <david at montag.se> wrote:
>> Hi,
>>
>> This weekend I was toying around with Neo4j. I wanted to do some indexing
>> experiments. Unfortunately I found myself without a graph to work with.
>> Sure, I could write some code to generate a graph for me, but it'd be a
>> one-time-thing. I wanted to get going *now*. That got me thinking about
>> import/export functionality.
>>
>> I think a command-line import tool would be useful, accompanied by (and
>> built on) a Java API. Both of them would be tied to a certain
>> representation format. The export can be represented in different ways,
>> where two possible ways are:
>> - State transfer: (node{id:1, name:foo}, node{id:2}, rel{start:1, end:2,
>> type:bar}, ...)
>> - Operation transfer: (id1 = create node, id2 = create node, create rel
>> id1->id2 type bar, ...)
>>
>> I guess the state transfer feels like the more straightforward one. The
>> diff-style nature of the operation transfer might be useful in other
>> cases.
>>
>> When I first thought of this, the target user was somebody who wanted to
>> get started with a graph, and didn't want to write code to do an import
>> "manually". Maybe the import/export can extend to other use cases, but
>> this was the primary one. A possible workflow could be db exported to
>> file, file published, file downloaded, file imported into db.
>>
>> In the end, it would be great if new users could download sample data
>> sets and import them into a Neo4j instance without writing a single line
>> of code. Which also gets me thinking about a command-line tool to create
>> an empty Neo4j instance to import into. The actual implementations of the
>> tools are trivial. It's the discussion that leads to the implementation
>> that's important.
>>
>> Does this sound like anything that would interest people? If so, (digging
>> into details) what kind of representation do you guys think would be
>> best? I was thinking XML, but a binary format might be better for
>> performance (size/primitives ratio). Maybe both? Because I do like the
>> idea of a human-readable (and editable) format. If you don't think it
>> would be useful I would love to hear why.
>>
>> This is just a brain dump of my thoughts. Surely others have thought of
>> this as well. I'm just getting the discussion started. WDYT?
>>
>> -David
>> _______________________________________________
>> Neo mailing list
>> User at lists.neo4j.org
>> https://lists.neo4j.org/mailman/listinfo/user
>>

-- 
Sent from my mobile device