[Neo4j] Lucene sort with diacritic characters

Rick Bullotta rick.bullotta at thingworx.com
Fri Nov 11 15:33:44 CET 2011


You probably need to create a custom analyzer that uses one of Lucene's collation filters, and pass that analyzer as a parameter to the Neo4j index creation method.  Unfortunately, you can't apply a new analyzer "after the fact", so I think you'll need to delete and regenerate the index.  Lucene has some built-in, language-specific collation filters, and there is also a contributed package, ICUCollationKeyFilter, which may have some advantages in terms of performance.  I do not have direct experience with either, but hopefully this will help point you in the right direction.
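
To make the difference concrete, here is a minimal sketch that uses only the JDK and does not touch Lucene or Neo4j at all.  It compares plain string ordering with java.text.Collator, the JDK class that Lucene's non-ICU collation filter builds on.  The class and method names (CollationSortSketch, sampleNames, sortByCodeUnit, sortByCollator) are made up for illustration, and the choice of Locale.ENGLISH is an assumption; pick whichever locale matches the ordering you expect.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical demo class; not part of Neo4j or Lucene.
class CollationSortSketch {

    // Place names from the thread, written with Unicode escapes so the
    // source file stays ASCII-only.
    static List<String> sampleNames() {
        List<String> names = new ArrayList<String>();
        names.add("Z\u00FCrich");          // Zurich with u-umlaut
        names.add("\u00C5rhus");           // Arhus with A-ring
        names.add("\u017Du\u017Eemberk");  // Zuzemberk with Z-caron and z-caron
        names.add("\u1E28alab");           // Halab with H-cedilla
        return names;
    }

    // Plain String ordering compares UTF-16 code units, so any letter
    // outside ASCII sorts after 'Z'.
    static List<String> sortByCodeUnit(List<String> names) {
        List<String> copy = new ArrayList<String>(names);
        Collections.sort(copy);
        return copy;
    }

    // Collator ordering applies the collation rules of the given locale.
    static List<String> sortByCollator(List<String> names, Locale locale) {
        List<String> copy = new ArrayList<String>(names);
        Collator collator = Collator.getInstance(locale);
        Collections.sort(copy, collator);
        return copy;
    }

    public static void main(String[] args) {
        List<String> names = sampleNames();
        System.out.println("code unit order: " + sortByCodeUnit(names));
        System.out.println("collator order:  " + sortByCollator(names, Locale.ENGLISH));
    }
}
```

With plain ordering the names starting with Z come first, which matches the behaviour reported in the original question; with the English collator the A-ring name should sort ahead of the Z names.  Where the less common letters end up depends on the JDK's collation tables and on the locale, and a locale whose alphabet places such a letter at the end will order it that way.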

Rick



________________________________________
From: user-bounces at lists.neo4j.org [user-bounces at lists.neo4j.org] On Behalf Of Niels Hoogeveen [pd_aficionado at hotmail.com]
Sent: Friday, November 11, 2011 9:27 AM
To: user at lists.neo4j.org
Subject: Re: [Neo4j] Lucene sort with diacritic characters

anyone?

> From: pd_aficionado at hotmail.com
> To: user at lists.neo4j.org
> Date: Thu, 10 Nov 2011 20:20:46 +0100
> Subject: [Neo4j] Lucene sort with diacritic characters
>
>
> When retrieving items from a Lucene index using the sort method, the order doesn't seem to follow the proper rules for sorting diacritic characters.
> For example, Århus comes later in the list than Zürich and Ḩalab comes later than Žužemberk.
> Can someone help me solve this?
> Niels
> _______________________________________________
> Neo4j mailing list
> User at lists.neo4j.org
> https://lists.neo4j.org/mailman/listinfo/user
