[Neo] Batch Insert in Python

Jon Noronha cheeselord at gmail.com
Thu Apr 15 20:13:31 CEST 2010


Hello,

I'm wondering if neo4j.py has any way of running a batch insert.

I'm interested in building a DB of about 10 million nodes and 60
million edges. My initial approach was to add the nodes one at a time,
each in its own transaction. Each transaction would create the node,
then add each edge coming off it, creating the node at the other end
if it didn't exist yet. Nodes and edges each have a small number of
properties.
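
To make that concrete, here is roughly the shape of the first approach.
This is only a sketch: `ToyGraph` is a made-up in-memory stand-in, not
neo4j.py, and the `transaction()` and `get_or_create()` names are
hypothetical, so none of it should be read as the library's API.

```python
from contextlib import contextmanager


class ToyGraph:
    """Hypothetical in-memory stand-in for the database (not neo4j.py)."""

    def __init__(self):
        self.nodes = {}      # key -> property dict
        self.edges = []      # (source key, target key, property dict)
        self.commits = 0     # number of transactions committed

    @contextmanager
    def transaction(self):
        yield self
        self.commits += 1    # reached only if the block did not raise

    def get_or_create(self, key, **props):
        # "creating the other node if necessary"
        return self.nodes.setdefault(key, dict(props))


def load_one_by_one(graph, adjacency):
    """One transaction per source node: create it, then all its edges."""
    for source, targets in adjacency.items():
        with graph.transaction():
            graph.get_or_create(source)
            for target in targets:
                graph.get_or_create(target)
                graph.edges.append((source, target, {}))


graph = ToyGraph()
load_one_by_one(graph, {"a": ["b", "c"], "b": ["c"]})
```

With this pattern the number of commits equals the number of source
nodes, so a 10 million node load means on the order of 10 million
commits.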

This turned out to be far too slow: loading the whole DB would have
taken about 25 days, so now I'm trying to find a better way. One thing
I tried was bulk insertion: adding all the nodes first, then adding
the edges in batches of 100,000. This seems faster, but I still wonder
whether I'm wasting a lot of time on transaction overhead and the like.
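
The two-pass idea can be sketched as below. Again this is an
illustration under assumptions, not neo4j.py code: `write_batch` is a
hypothetical callback that, in a real loader, would open one
transaction, write the chunk, and commit. The sketch also batches the
node pass, which is my assumption; only the edge batching is what I
actually tried.

```python
from itertools import islice


def batched(items, size):
    """Yield lists of up to `size` items from any iterable."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def load_in_batches(nodes, edges, write_batch, batch_size=100_000):
    """Pass 1 writes all nodes, pass 2 writes all edges, one call per batch.

    `write_batch(kind, chunk)` is a hypothetical callback standing in
    for "one transaction per chunk". Returns the number of batches.
    """
    count = 0
    for kind, items in (("node", nodes), ("edge", edges)):
        for chunk in batched(items, batch_size):
            write_batch(kind, chunk)
            count += 1
    return count


# Tiny demonstration with a callback that just records batch sizes.
written = []
n = load_in_batches(
    range(5),
    [(0, 1), (1, 2), (2, 3)],
    lambda kind, chunk: written.append((kind, len(chunk))),
    batch_size=2,
)
```

At 100,000 edges per batch, 60 million edges come to 600 batches, so
the number of commits drops from millions to hundreds.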

How do you all recommend going about this?

Thanks,
Jon

PS: I'm brand new to Neo4j, so please take nothing for granted in my
understanding :)

