[Neo4j] Why node's index entry can not be removed from lucene index after the node is deleted from the graph

Luanne Misquitta LMisquitta at saba.com
Wed Dec 15 05:44:58 CET 2010


Hi,

Please see inline.


I indexed several properties on one node, but I forgot to remove one
property (say, property A) before calling node.delete().
In that case, it seems the index document is not deleted correctly.
[Luanne] I guess once you've lost reference to the node, you can no
longer delete from the index.

Although the original document is marked deleted in Luke, it is strange
that a new document with property A is created using the same _id_.
[Luanne] This is actually correct. Lucene does not support updating a
document in place. To update a field, you delete the document and add it
again (with the updated information). So what has happened is that the
first document is marked deleted, and a new document is created with the
same _id_ (which is the node ID).
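
The delete-then-add behaviour described above can be sketched with a toy
model. This is not the Lucene or Neo4j API: ToyDocument, ToyLuceneIndex and
all of their methods are hypothetical names invented for illustration, each
key holds a single value, and dropping a document once its last field is
removed is a simplification of this sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: NOT the Lucene or Neo4j API. All names are hypothetical.
class ToyDocument {
    final long id;                     // plays the role of _id_ (the node ID)
    final Map<String, String> fields;  // indexed key -> value
    boolean deleted = false;           // "marked deleted", as Luke shows it

    ToyDocument(long id, Map<String, String> fields) {
        this.id = id;
        this.fields = fields;
    }
}

class ToyLuceneIndex {
    final List<ToyDocument> segment = new ArrayList<>();

    // Returns the live (not deleted) document for this id, or null.
    ToyDocument live(long id) {
        for (ToyDocument d : segment) {
            if (d.id == id && !d.deleted) return d;
        }
        return null;
    }

    // No in-place update: mark the old document deleted, append a new one.
    private void rewrite(long id, Map<String, String> newFields) {
        ToyDocument old = live(id);
        if (old != null) old.deleted = true;
        // Toy choice: a document with no fields left is not re-added.
        if (!newFields.isEmpty()) segment.add(new ToyDocument(id, newFields));
    }

    void add(long id, String key, String value) {
        ToyDocument old = live(id);
        Map<String, String> fields = new LinkedHashMap<>();
        if (old != null) fields.putAll(old.fields);
        fields.put(key, value);
        rewrite(id, fields);
    }

    void remove(long id, String key) {
        ToyDocument old = live(id);
        if (old == null) return;
        Map<String, String> fields = new LinkedHashMap<>(old.fields);
        fields.remove(key);
        rewrite(id, fields);
    }

    // Rough analogue of Luke's "Optimize index": purge deleted documents.
    void optimize() {
        segment.removeIf(d -> d.deleted);
    }
}

class ToyIndexDemo {
    public static void main(String[] args) {
        ToyLuceneIndex index = new ToyLuceneIndex();
        index.add(7, "name", "Mattias Persson");
        index.add(7, "A", "some value");
        index.remove(7, "name");  // property A is forgotten before delete()
        for (ToyDocument d : index.segment) {
            System.out.println("_id_=" + d.id + " deleted=" + d.deleted
                    + " fields=" + d.fields);
        }
    }
}
```

In this model, removing only "name" leaves older documents marked deleted
plus one live document that still carries property A under the same _id_,
which matches what was seen in Luke.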

I further encountered a transaction exception when this node ID was
re-assigned to a new node...
[Luanne] I did not follow you on this one.

So the correct way to delete a node is to remove all of its index
entries before invoking delete(). Right?
[Luanne] Yes, I'd imagine so. Also, node IDs can be re-used, so if you
leave stuff hanging around in the index, indexing a new node (which
might re-use an older node ID) might actually just update the document
in Lucene for the older node. Best to clean up all references.

It *would* be nice however, to have a method on the Index API which just
accepts a Node, and deletes the Lucene document (so that we don't have
to iterate through all properties and delete).
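
Until such a method exists, one possible workaround is to remember every
key/value pair as it is indexed, so they can all be removed in one call
before the node is deleted. The sketch below is an assumption, not Neo4j
code: KeyValueIndex is a hypothetical stand-in shaped like the
add/remove(entity, key, value) calls discussed in this thread, and
InMemoryIndex and TrackingIndex are invented for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in shaped like the add/remove(entity, key, value)
// calls discussed in this thread. It is NOT the Neo4j Index interface.
interface KeyValueIndex<T> {
    void add(T entity, String key, Object value);
    void remove(T entity, String key, Object value);
}

// Trivial in-memory index so the sketch runs without any library.
class InMemoryIndex<T> implements KeyValueIndex<T> {
    final List<String> entries = new ArrayList<>();

    public void add(T entity, String key, Object value) {
        entries.add(entity + ":" + key + "=" + value);
    }

    public void remove(T entity, String key, Object value) {
        entries.remove(entity + ":" + key + "=" + value);
    }
}

// Wraps an index and remembers each (key, value) pair added per entity.
class TrackingIndex<T> implements KeyValueIndex<T> {
    private static final class Pair {
        final String key;
        final Object value;

        Pair(String key, Object value) {
            this.key = key;
            this.value = value;
        }
    }

    private final KeyValueIndex<T> delegate;
    private final Map<T, List<Pair>> tracked = new HashMap<>();

    TrackingIndex(KeyValueIndex<T> delegate) {
        this.delegate = delegate;
    }

    public void add(T entity, String key, Object value) {
        delegate.add(entity, key, value);
        tracked.computeIfAbsent(entity, e -> new ArrayList<>())
               .add(new Pair(key, value));
    }

    public void remove(T entity, String key, Object value) {
        delegate.remove(entity, key, value);
        List<Pair> pairs = tracked.get(entity);
        if (pairs != null) {
            pairs.removeIf(p -> p.key.equals(key) && p.value.equals(value));
        }
    }

    // Removes every tracked entry for the entity; call before deleting it.
    void removeAll(T entity) {
        List<Pair> pairs = tracked.remove(entity);
        if (pairs == null) return;
        for (Pair p : pairs) {
            delegate.remove(entity, p.key, p.value);
        }
    }
}

class TrackingIndexDemo {
    public static void main(String[] args) {
        InMemoryIndex<Long> backing = new InMemoryIndex<>();
        TrackingIndex<Long> persons = new TrackingIndex<>(backing);
        persons.add(1L, "name", "Mattias Persson");
        persons.add(1L, "A", "some value");
        persons.removeAll(1L);  // then delete the node itself
        System.out.println(backing.entries);
    }
}
```

Note that the tracking map in this sketch lives only in memory, so it would
not survive a restart; a real application would need to rebuild it or derive
the indexed keys from the node's own properties.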

Hope this helps a bit
Luanne
-----Original Message-----
From: user-bounces at lists.neo4j.org [mailto:user-bounces at lists.neo4j.org]
On Behalf Of Samuel Feng
Sent: Tuesday, December 14, 2010 3:44 PM
To: Neo4j user discussions
Subject: Re: [Neo4j] Why node's index entry can not be removed from
lucene index after the node is deleted from the graph

Thanks Luanne,

My case:

I indexed several properties on one node, but I forgot to remove one
property (say, property A) before calling node.delete().

In that case, it seems the index document is not deleted correctly.
Although the original document is marked deleted in Luke, it is strange
that a new document with property A is created using the same _id_.

I further encountered a transaction exception when this node ID was
re-assigned to a new node...

So the correct way to delete a node is to remove all of its index
entries before invoking delete(). Right?


2010/12/14 Luanne Misquitta <LMisquitta at saba.com>

> That's strange. I indexed a single property on two nodes, and then used
> the remove method on the index to remove that property. I then checked in
> Luke, and the documents appear to be there, but marked deleted. I could
> also undelete them via Luke. When I did a Luke->Optimize index, the
> documents were well and truly deleted. Not sure what the intended behavior
> of the index remove is now that you bring this up.
>
> Regards
>
> Luanne M.
> Tech Lead
>
> twitter / @luannem
> linkedin / http://in.linkedin.com/in/luannemisquitta
> skype / luanne.misquitta
> blog / http://thought-bytes.blogspot.com/
> Saba. Power Up Your People.
>
>
> -----Original Message-----
> From: user-bounces at lists.neo4j.org [mailto:user-bounces at lists.neo4j.org]
> On Behalf Of Samuel Feng
> Sent: Tuesday, December 14, 2010 12:31 PM
> To: Neo4j user discussions
> Subject: Re: [Neo4j] Why node's index entry can not be removed from
> lucene index after the node is deleted from the graph
>
> I checked index.remove(); it can only delete the key field from the
> node's entry in the index. (The node entry is a document in the Lucene
> index.)
>
> The node entry is still there....
>
> 2010/12/14 Luanne Misquitta <LMisquitta at saba.com>
>
> > You could perhaps also delete the data from the index using:
> > Index.remove(T entity, String key, Object value)
> >
> > However, I assume that if you've added X properties of the node to
> > the index, you'd have to remove all those X properties?
> >
> > Regards
> > Luanne M.
> >
> >
> > -----Original Message-----
> > From: user-bounces at lists.neo4j.org [mailto:user-bounces at lists.neo4j.org]
> > On Behalf Of Samuel Feng
> > Sent: Tuesday, December 14, 2010 12:11 PM
> > To: Neo4j user discussions
> > Subject: [Neo4j] Why node's index entry can not be removed from
> > lucene index after the node is deleted from the graph
> >
> > Dears,
> >
> > Index<Node> persons = graphDb.index().forNodes( "persons" );
> > Node firstPerson = graphDb.createNode();
> > Node secondPerson = graphDb.createNode();
> > persons.add( firstPerson, "name", "Mattias Persson" );
> >
> > Now I have to delete the firstPerson node from the graph,
> > i.e. firstPerson.delete().
> >
> > After the transaction is committed, the node is deleted from the graph.
> > However, when I then check the Lucene index using Luke, I can still
> > find the node's entry in it.
> >
> > I understand that this node entry will not affect Neo4j query results.
> > But if my graph has many deletions, this will produce a lot of
> > obsolete node entries in the Lucene index.
> >
> > Do you have any ideas on how I can delete these obsolete index entries,
> > or does Neo4j have some other mechanism to do this?
> > _______________________________________________
> > Neo4j mailing list
> > User at lists.neo4j.org
> > https://lists.neo4j.org/mailman/listinfo/user

