Strip \uXXXX From a String and Replace It With the Correct Unicode Character
About a month ago, while reading DBpedia data into a database, I discovered literal ‘\uXXXX’ escape sequences (a backslash, a ‘u’ and four hex digits) appearing in my strings where the actual Unicode characters should have been. The strings were to be compared with other strings that contained the proper Unicode characters, so I had to replace each ‘\uXXXX’ with the character it stands for. I couldn’t find an existing class to do this, but found enough information to understand what needed to be done.
The function below is what I came up with.
import java.util.ArrayList;

/**
 * Strips /uXXXX from a string and replaces it with the correct unicode character
 * (for example: '\u1E09')
 *
 * @param slashed string containing '/uXXXX' to be replaced with their Unicode characters
 * @return Unicode string with '/uXXXX' converted into Unicode.
 * @author Michael Robinson mike@pagesofinterest.net
 */
public String unslashUnicode(String slashed) {
    ArrayList<String> pieces = new ArrayList<String>();
    // Loop while there is a /uXXXX left in the string.
    // Every escape must be followed by exactly four hex digits;
    // a truncated or non-hex escape makes substring/parseInt throw.
    while (slashed.contains("\\u")) {
        int start = slashed.indexOf("\\u");
        pieces.add(slashed.substring(0, start)); // add the bit before the /uXXXX
        char c = (char) Integer.parseInt(slashed.substring(start + 2, start + 6), 16);
        slashed = slashed.substring(start + 6); // keep only what follows the escape
        pieces.add(c + ""); // add the unicode character
    }
    String temp = "";
    for (String s : pieces) {
        temp = temp + s; // put humpty dumpty back together again
    }
    // Whatever is left in slashed contains no more escapes
    return temp + slashed;
}
Note that my strings only ever contained escapes written as ‘\uXXXX’, never as ‘\UXXXX’. The above method will therefore need some modification if it is to be used with capital-‘U’ escapes.
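For anyone who does need the capital form, here is a rough sketch of one way the modification might look. It is not the function above but a regex-based variant, and it rests on an assumption: that the capital form carries eight hex digits, as in the N-Triples format DBpedia dumps use. The class name UnicodeUnescaper and the method name unescape are made up for this example. Malformed or out-of-range escapes are left untouched rather than throwing.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UnicodeUnescaper {

    // Backslash + lowercase u + 4 hex digits, or
    // backslash + capital U + 8 hex digits (assumed N-Triples style).
    private static final Pattern ESCAPE =
            Pattern.compile("\\\\u([0-9A-Fa-f]{4})|\\\\U([0-9A-Fa-f]{8})");

    /** Replaces well-formed escapes with the characters they denote. */
    static String unescape(String input) {
        Matcher m = ESCAPE.matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Copy the text between the previous escape and this one.
            out.append(input, last, m.start());
            String hex = (m.group(1) != null) ? m.group(1) : m.group(2);
            // Parse as long: eight hex digits can overflow an int.
            long value = Long.parseLong(hex, 16);
            if (value <= Character.MAX_CODE_POINT) {
                // Handles characters outside the BMP as surrogate pairs.
                out.appendCodePoint((int) value);
            } else {
                // Not a valid code point: keep the escape as it was.
                out.append(m.group());
            }
            last = m.end();
        }
        // Copy whatever follows the final escape.
        out.append(input, last, input.length());
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(unescape("Caf\\u00E9 and \\U0001F600"));
    }
}
```

Because the pattern only matches complete escapes, a stray backslash-u followed by too few digits passes through unchanged, which may or may not be what you want for your data.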