Strip \uXXXX From String and Replace it With the Correct Unicode Character
About a month ago, when reading DBpedia data into a database, I discovered ‘\uXXXX’ appearing in my strings where pretty unicode characters should be. The strings were to be compared to … other strings, which would have the proper unicode characters, so I had to replace the ‘\uXXXX’ in my strings. I couldn’t find a class to do this, but found enough information to understand what needed to be done.
The function below is what I came up with.
import java.util.ArrayList;

/**
 * Replaces each backslash-u escape (a backslash, 'u' and four hex digits)
 * in a string with the correct unicode character (for example: '\u1E09').
 *
 * @param slashed string containing backslash-u escapes to be replaced with their Unicode characters
 * @return Unicode string with the escapes converted into Unicode.
 * @author Michael Robinson mike@pagesofinterest.net
 */
public String unslashUnicode(String slashed){
    ArrayList<String> pieces = new ArrayList<String>();
    while(slashed.contains("\\u")){//while there is an escape left in the string
        int start = slashed.indexOf("\\u");
        pieces.add(slashed.substring(0, start));//add the bit before the escape
        //the four hex digits after the 'u' are the character's code
        char c = (char) Integer.parseInt(slashed.substring(start + 2, start + 6), 16);
        slashed = slashed.substring(start + 6);//keep only what follows the escape
        pieces.add(c + "");//add the unicode
    }
    String temp = "";
    for(String s : pieces){
        temp = temp + s;//put humpty dumpty back together again
    }
    return temp + slashed;
}
Note that my strings only ever contained unicode slashed as ‘\uXXXX’, never as ‘\UXXXX’. The above function, therefore, will need some modification if it is to be used with capital ‘U’ slashed unicode characters.
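For anyone who does need the capital ‘U’ form, here is a rough sketch of one way the modification might look. It is not the function above but a regex-based variant, and it rests on an assumption the post doesn't confirm: that a capital ‘U’ is followed by eight hex digits (as in N-Triples-style escapes), while a lowercase ‘u’ is followed by four. The class name `UnicodeUnescaper` and the method name `unescape` are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnicodeUnescaper {
    // A backslash, then either 'u' + 4 hex digits or 'U' + 8 hex digits.
    // The 8-digit capital-U form is an assumption, not something the post states.
    private static final Pattern ESCAPE =
            Pattern.compile("\\\\(?:u([0-9A-Fa-f]{4})|U([0-9A-Fa-f]{8}))");

    public static String unescape(String input) {
        Matcher m = ESCAPE.matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0; // end of the previous match
        while (m.find()) {
            out.append(input, last, m.start()); // text before the escape
            String hex = (m.group(1) != null) ? m.group(1) : m.group(2);
            long value = Long.parseLong(hex, 16); // long: 8 hex digits can overflow int
            if (value <= Character.MAX_CODE_POINT) {
                out.appendCodePoint((int) value);
            } else {
                out.append(m.group()); // not a valid code point, leave it untouched
            }
            last = m.end();
        }
        out.append(input, last, input.length()); // whatever follows the last escape
        return out.toString();
    }

    public static void main(String[] args) {
        // "\\u00E9" in Java source is a backslash, 'u', '0', '0', 'E', '9' at runtime
        System.out.println(unescape("Caf\\u00E9"));
    }
}
```

Unlike the original function, this version leaves malformed input (say, a backslash-u followed by fewer than four hex digits) in place rather than throwing an exception. Whether that is the right behaviour depends on how much you trust your data.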