[Neo4j] [bug?] Unicode node property not returned correctly from bulk REST index search

Peter Neubauer peter.neubauer at neotechnology.com
Fri Oct 21 10:26:11 CEST 2011


Thanks Nuo,

submitted as a bug at https://github.com/neo4j/community/issues/69

Feel free to track it - hope to get to it next week.

Cheers,

/peter neubauer

On Fri, Oct 21, 2011 at 10:07 AM, Nuo Yan <yan.nuo at gmail.com> wrote:
> Hey Peter, using the example from my original email, this is the corresponding
> JSON body I posted when I created the node:
>
> "{\"uid\":\"12345\",\"name\":\"\\u4f8b\\u5b50\"}"
>
> A single GET query returns the name data (as raw json) as:
>
> \"name\" : \"\xE4\xBE\x8B\xE5\xAD\x90\"\n
>
> which after JSON parse becomes the correct one:
>
> "name" => "例子"
>
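A quick way to confirm that the posted body and the single GET response carry the same string is to decode both. This is only a sketch using the values quoted above; the variable names are mine and not part of any Neo4j client:

```python
import json

# JSON body posted when the node was created (uses \u escapes),
# copied from the message above.
create_body = '{"uid":"12345","name":"\\u4f8b\\u5b50"}'

# Raw bytes of the "name" value as returned by the single GET.
get_name_bytes = b"\xE4\xBE\x8B\xE5\xAD\x90"

# Parsing the \u escapes and decoding the UTF-8 bytes should agree.
posted_name = json.loads(create_body)["name"]
returned_name = get_name_bytes.decode("utf-8")

print(posted_name == returned_name)
```

If the comparison holds, the single GET path is round-tripping the property correctly.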
> When sending the same request as a POST to /batch, the server returns the
> following raw json:
>
> \"name\" :
> \"\xEF\xBF\xA4\xEF\xBE\xBE\xEF\xBE\x8B\xEF\xBF\xA5\xEF\xBE\xAD\xEF\xBE\x90\"\n
>
>
> which after JSON parse becomes the busted result:
>
> "name"=>"¦ᄒヒ¥ᆳミ"
>
>
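For what it's worth, the batch bytes quoted above look consistent with each UTF-8 byte of the value being widened to a 16-bit character with sign extension (0xE4 turning into U+FFE4, and so on) and then re-encoded as UTF-8. That is only a guess at the mechanism based on the bytes in this message, not something confirmed in the Neo4j code. A minimal Python sketch that checks the pattern and reverses it on the client side (the `recover` helper is hypothetical):

```python
# Raw bytes of the "name" value as returned by /batch,
# copied from the message above.
batch_bytes = (
    b"\xEF\xBF\xA4\xEF\xBE\xBE\xEF\xBE\x8B"
    b"\xEF\xBF\xA5\xEF\xBE\xAD\xEF\xBE\x90"
)


def recover(garbled: str) -> str:
    """Keep the low byte of every code point and decode those bytes as UTF-8.

    Assumes the corruption pattern described above; it will raise or give
    nonsense if the input was damaged in some other way.
    """
    return bytes(ord(ch) & 0xFF for ch in garbled).decode("utf-8")


garbled = batch_bytes.decode("utf-8")

# Every code point sits in the U+FFxx range, with the original byte in the
# low 8 bits.
code_points = [f"U+{ord(ch):04X}" for ch in garbled]
recovered = recover(garbled)

print(code_points)
print(recovered)
```

This is a diagnostic and stopgap at best; the proper fix belongs on the server side of the batch endpoint.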
>
>
>
> On Thu, Oct 20, 2011 at 11:28 PM, Peter Neubauer <
> peter.neubauer at neotechnology.com> wrote:
>
>> Guys,
>> Do you have a JSON string you would use to set properties on a node? I can
>> then update a test with it and check.
>> On Oct 21, 2011 7:27 AM, "Nuo Yan" <yan.nuo at gmail.com> wrote:
>>
>> > Yea, I'm pretty sure it's not a client parse issue. The data is correct in
>> > the database, and a single GET query returns the right data; only when
>> > doing the same request as part of the bulk request does it return busted
>> > data.
>> >
>> > It can be reproduced using curl as well as a REST client. I'm using 1.4.2
>> > stable.
>> >
>> > Does anyone from the Neo4j team have any insight on this?
>> >
>> >
>> >
>> >
>> > On Thu, Oct 20, 2011 at 7:59 PM, Rick Bullotta
>> > <rick.bullotta at thingworx.com>wrote:
>> >
>> > > I doubt it, since a GET works fine.  It's probably an encoding issue
>> > > somewhere in the batch processing pipeline.
>> > >
>> > >
>> > > -----Original Message-----
>> > > From: user-bounces at lists.neo4j.org [mailto:user-bounces at lists.neo4j.org]
>> > > On Behalf Of Daniel Fitzpatrick
>> > > Sent: Thursday, October 20, 2011 10:37 PM
>> > > To: Neo4j user discussions
>> > > Subject: Re: [Neo4j] [bug?] Unicode node property not returned correctly
>> > > from bulk REST index search
>> > >
>> > > Possibly an issue with the client code not understanding Unicode. Is there
>> > > something you could use as a baseline to rule the database out, e.g. maybe
>> > > the web admin?
>> > >
>> > > On Oct 20, 2011 7:48 PM, "Nuo Yan" <yan.nuo at gmail.com> wrote:
>> > >
>> > > I have nodes with data properties containing Unicode (Chinese/Japanese)
>> > > characters, such as:
>> > >
>> > > {"uid" => "12345", "name" => "例子"}
>> > >
>> > > I index such nodes with their id, so that by doing this (where users_index
>> > > is the index, uid is the key, 12345 is the value):
>> > >
>> > > GET to /index/node/users_index/uid/12345
>> > >
>> > > I can get back the right result:
>> > >
>> > > {"indexed"=>"http://localhost:7474/db/data/index/node/users_node/uid/12345/6638",
>> > > "outgoing_relationships"=>"http://localhost:7474/db/data/node/6638/relationships/out",
>> > >
>> > > * "data"=>{"uid"=>"12345", "name"=>"例子"}, *
>> > >
>> > > "traverse"=>"http://localhost:7474/db/data/node/6638/traverse/{returnType}",
>> > > "all_typed_relationships"=>"http://localhost:7474/db/data/node/6638/relationships/all/{-list|&|types}",
>> > > "property"=>"http://localhost:7474/db/data/node/6638/properties/{key}",
>> > > "self"=>"http://localhost:7474/db/data/node/6638",
>> > > "properties"=>"http://localhost:7474/db/data/node/6638/properties",
>> > > "outgoing_typed_relationships"=>"http://localhost:7474/db/data/node/6638/relationships/out/{-list|&|types}",
>> > > "incoming_relationships"=>"http://localhost:7474/db/data/node/6638/relationships/in",
>> > > "extensions"=>{},
>> > > "create_relationship"=>"http://localhost:7474/db/data/node/6638/relationships",
>> > > "paged_traverse"=>"http://localhost:7474/db/data/node/6638/paged/traverse/{returnType}{?pageSize,leaseTime}",
>> > > "all_relationships"=>"http://localhost:7474/db/data/node/6638/relationships/all",
>> > > "incoming_typed_relationships"=>"http://localhost:7474/db/data/node/6638/relationships/in/{-list|&|types}"}
>> > >
>> > >
>> > > However, if I do the same search query as part of a bulk REST request:
>> > >
>> > > POST to /batch:
>> > >
>> > > [{"method" => "GET",
>> > > "to" => "/index/node/users_index/uid/12345",
>> > > "body" => {},
>> > > "id" => 0}]
>> > >
>> > > This returns the node in the body, but with bad characters in the data
>> > > field:
>> > >
>> > > [{"id"=>0, "body"=>[{"indexed"=>"http://localhost:7474/db/data/index/node/users_node/uid/12345/6638",
>> > > "outgoing_relationships"=>"http://localhost:7474/db/data/node/6638/relationships/out",
>> > >
>> > > *"data"=>{"uid"=>"12345", "name"=>"¥ᄂᄃ¥ツᄏ\uFFE7モワ"}, *
>> > >
>> > > "traverse"=>"http://localhost:7474/db/data/node/6638/traverse/{returnType}",
>> > > "all_typed_relationships"=>"http://localhost:7474/db/data/node/6638/relationships/all/{-list|&|types}",
>> > > "property"=>"http://localhost:7474/db/data/node/6638/properties/{key}",
>> > > "self"=>"http://localhost:7474/db/data/node/6638",
>> > > "properties"=>"http://localhost:7474/db/data/node/6638/properties",
>> > > "outgoing_typed_relationships"=>"http://localhost:7474/db/data/node/6638/relationships/out/{-list|&|types}",
>> > > "incoming_relationships"=>"http://localhost:7474/db/data/node/6638/relationships/in",
>> > > "extensions"=>{},
>> > > "create_relationship"=>"http://localhost:7474/db/data/node/6638/relationships",
>> > > "paged_traverse"=>"http://localhost:7474/db/data/node/6638/paged/traverse/{returnType}{?pageSize,leaseTime}",
>> > > "all_relationships"=>"http://localhost:7474/db/data/node/6638/relationships/all",
>> > > "incoming_typed_relationships"=>"http://localhost:7474/db/data/node/6638/relationships/in/{-list|&|types}"}],
>> > > "from"=>"/index/node/users_node/uid/12345"}]
>> > >
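To make the comparison easy to repeat, the two requests described above (the single GET and the same lookup wrapped in a POST to /batch) could be built roughly like this. It is a sketch only: the base URL is taken from the responses quoted above, while the helper names and the Accept/Content-Type headers are my assumptions, and nothing is sent until you pass the request to `urllib.request.urlopen` against a running server:

```python
import json
import urllib.request

# Base URL as it appears in the responses quoted above.
BASE = "http://localhost:7474/db/data"


def single_request(index: str, key: str, value: str) -> urllib.request.Request:
    """Build the plain index lookup (hypothetical helper)."""
    return urllib.request.Request(
        f"{BASE}/index/node/{index}/{key}/{value}",
        headers={"Accept": "application/json"},  # assumed header
        method="GET",
    )


def batch_request(index: str, key: str, value: str) -> urllib.request.Request:
    """Wrap the same lookup in a one-job batch (hypothetical helper)."""
    jobs = [{
        "method": "GET",
        "to": f"/index/node/{index}/{key}/{value}",
        "body": {},
        "id": 0,
    }]
    return urllib.request.Request(
        f"{BASE}/batch",
        data=json.dumps(jobs).encode("utf-8"),
        headers={  # assumed headers
            "Accept": "application/json",
            "Content-Type": "application/json",
        },
        method="POST",
    )


single = single_request("users_index", "uid", "12345")
batch = batch_request("users_index", "uid", "12345")
print(single.get_method(), single.full_url)
print(batch.get_method(), batch.full_url, batch.data.decode("utf-8"))
```

Comparing the raw response bytes of the two requests, rather than the parsed output of a client library, keeps the client out of the picture.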
>> > > Do you think this is a bug, or is there anything I can change to make the
>> > > bulk request return the correct Chinese/Japanese characters? I can
>> > > reproduce this every time.
>> > >
>> > > Thanks,
>> > > Nuo
>> > > _______________________________________________
>> > > Neo4j mailing list
>> > > User at lists.neo4j.org
>> > > https://lists.neo4j.org/mailman/listinfo/user