[Neo] Batch Insert in Python
Jon Noronha
cheeselord at gmail.com
Thu Apr 15 20:13:31 CEST 2010
Hello,
I'm wondering if neo4j.py has any way of running a batch insert.
I'm interested in building a DB of about 10 million nodes and 60
million edges. My initial approach was to add the nodes one at a time,
each in its own transaction. Each transaction would create the node, then add
each edge coming off it, adding the other node if necessary. Nodes and
edges each have a small number of properties.
This turned out to be far too slow: loading the whole DB would have
taken about 25 days, so now I'm looking for a better way. One thing I
tried was inserting in bulk, adding all the nodes first and then
adding the edges in batches of 100,000. This seems faster, but I still
wonder whether I'm wasting a lot of time on transactions and similar
overhead.
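The batching described above can be sketched roughly as follows. This is an illustration of the chunking pattern only, not actual neo4j.py code: `batched`, `insert_in_batches`, and the `write_batch` callback are hypothetical names. `write_batch` stands in for whatever code opens one transaction, creates every edge in the batch, and commits.

```python
from itertools import islice


def batched(items, batch_size):
    """Yield successive lists of at most batch_size items from any iterable."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def insert_in_batches(edges, write_batch, batch_size=100_000):
    """Call write_batch once per batch and return the number of batches.

    write_batch is a hypothetical callback (an assumption, not a neo4j.py
    function); in real use it would wrap one transaction around the batch.
    """
    count = 0
    for batch in batched(edges, batch_size):
        write_batch(batch)
        count += 1
    return count
```

Because `batched` pulls from an iterator, the 60 million edges never need to be held in memory at once; only the current batch is. The batch size then becomes a single knob to tune when measuring how much of the run time goes to per-transaction overhead.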
How do you all recommend going about this?
Thanks,
Jon
PS: I'm brand new to Neo4j, so please take nothing for granted in my
understanding :)