Latin1 encoding

JohnHays
CONTENTS DELETED
The author has deleted this message.

Re: Latin1 encoding

BJ Freeman
UTF8 works fine.

If you use the Data File tools to generate your XML files, this should
all be taken care of for you.
http://docs.ofbiz.org/display/OFBENDUSER/OFBiz%27s+Data+File+Tools

Also, there are some incomplete notes on how to transfer data from
another DB to OFBiz:
http://docs.ofbiz.org/display/OFBIZ/Handling+of+External+data
See #4.
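
For the import failure described in the question quoted below, one thing worth ruling out is a mismatch between the encoding named in the XML declaration and the bytes actually in the file. This is only a guess at the cause, not something confirmed in this thread. A minimal Python sketch of such a check follows; the function names and the sample `<name>` element are hypothetical and not part of OFBiz.

```python
import re

# Matches an XML declaration at the very start of the file and
# captures the value of its encoding attribute.
DECL_RE = re.compile(rb'<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']')

def declared_encoding(data: bytes) -> str:
    """Return the encoding named in the XML declaration.

    Falls back to UTF-8 when no encoding is declared (this sketch
    ignores UTF-16 input).
    """
    if data.startswith(b"\xef\xbb\xbf"):  # skip a UTF-8 byte order mark
        data = data[3:]
    match = DECL_RE.match(data)
    return match.group(1).decode("ascii") if match else "UTF-8"

def decodes_as(data: bytes, encoding: str) -> bool:
    """Return True if the bytes are valid in the given encoding."""
    try:
        data.decode(encoding)
        return True
    except (UnicodeDecodeError, LookupError):
        return False

# Hypothetical sample record with an accented character.
text = '<?xml version="1.0" encoding="UTF-8"?><name>José</name>'
good = text.encode("utf-8")    # bytes agree with the declaration
bad = text.encode("latin-1")   # bytes are Latin1 despite the declaration

for label, data in (("good", good), ("bad", bad)):
    enc = declared_encoding(data)
    print(label, enc, decodes_as(data, enc))
```

To check a real export, read it with `open(path, "rb").read()` and pass the bytes to the same two functions. A `False` result means the file needs re-encoding, or its declaration needs correcting, before it is loaded.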


John Hays sent the following on 5/12/2009 1:59 PM:

> I am looking at importing thousands of records from an existing system
> into OFBIZ.  Our traditional storage charset is UTF8.
>
> I note that OFBIZ seems to like to use Latin1 as its encoding charset.
>
> When I load a generated XML file (created using webtools against a tab
> delimited file), from the command line, it chokes on any accented
> characters in names, etc. (Seems to work better through the form interface)
>
> If I move the tables to UTF8, do I break OFBIZ?
>
>
> John D. Hays
> Director of Information Technology
>
>
>
> www.mavericklabel.com
> 120 West Dayton Street
> Edmonds, WA 98020-4180
>
>

--
BJ Freeman
http://www.businessesnetwork.com/automation
http://bjfreeman.elance.com
http://www.linkedin.com/profile?viewProfile=&key=1237480&locale=en_US&trk=tab_pro
Systems Integrator.


Re: Latin1 encoding

David E Jones-3
In reply to this post by JohnHays

Could you be more specific? What makes you think OFBiz likes Latin1 as  
an encoding character set?

As a random guess of the direction you're going: which database are  
you using, is it MySQL?

-David



Re: Latin1 encoding

JohnHays
CONTENTS DELETED
The author has deleted this message.

Re: Latin1 encoding

David E Jones-3
On Tue, 2009-05-12 at 18:35 -0700, John D. Hays wrote:
> David E Jones wrote:
> >
> > Could you be more specific? What makes you think OFBiz likes Latin1 as
> > an encoding character set?
> >
> It's in the entity definitions in the SVN code.

I'm not sure what this means... do you mean specifically the
entityengine.xml file?

> > As a random guess of the direction you're going: which database are
> > you using, is it MySQL?
>
> Yes, MySQL.

This could be the issue. MySQL doesn't handle UTF-8 (or any multi-byte
character set) very well. There are (or used to be) some JDBC driver
issues, but the big problem is that MySQL column sizes are in bytes and
NOT in characters, and a single UTF-8 character can take up to 3 bytes.
In other words, if you put a 100-character UTF-8 string into the
database it can require up to 300 bytes, and if it is a size 255 column
then BOOM! "String too long" error message.
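
The arithmetic above can be checked with a short Python sketch. It only measures encoded length and does not talk to MySQL; the helper names and sample strings are made up for illustration, and the 255-byte limit is the column size from the example above.

```python
def utf8_byte_length(text: str) -> int:
    """Number of bytes the string occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))

def fits_in_bytes(text: str, limit: int) -> bool:
    """True if the UTF-8 encoding of the string fits in `limit` bytes."""
    return utf8_byte_length(text) <= limit

# Character count versus byte count for a few sample strings.
for sample in ("Jose", "José", "日本語"):
    print(sample, len(sample), utf8_byte_length(sample))

# Worst case from the example: 100 characters at 3 bytes each.
worst_case = "語" * 100
print(len(worst_case), utf8_byte_length(worst_case))
print(fits_in_bytes(worst_case, 255))
```

Accented Latin letters such as "é" encode to 2 bytes, so a column of Western European names grows far less than the 3x worst case, while CJK text reaches it.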

That is why in the entityengine.xml file the default datasource for
MySQL has the char set as a non multi-byte character set.

If you need to handle internationalized text, OFBiz will do it well, but
MySQL won't. I'd recommend you use Postgres or something else instead.

-David




Re: Latin1 encoding

JohnHays
CONTENTS DELETED
The author has deleted this message.