Željko Filipin's blog.

15 March 2006

UTF-8 and Ruby

by Željko Filipin

I use Watir to test web applications.

In the application under test, after a user is created I need to get her or his database id. I use it later to check that the script is on the right page. For example, the URL of the view user page contains the user's database id, so I have to know the id to check the URL.

A developer made an XML page that lists all users. Only the username and the database id are displayed.

The script goes there and grabs the database id.

require "watir"
require "rexml/document"
username = "Aragon"
ie = Watir::IE.start("{application_under_test}/users.xml")
root =
db_id = root.elements["item[@username='#{username}']"].attributes["id"]
puts db_id

And it works just fine.
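
The lookup step can be tried without a browser. The following is only a minimal sketch, not the script I actually run: the XML sample is hypothetical (the `users` root element, the second user and the id values are assumptions; only `item`, `username` and `id` come from the XPath above), and `find_db_id` is a helper name made up for the illustration.

```ruby
require "rexml/document"

# Hypothetical sample of the users XML page. The root element name, the
# second user and the id values are assumptions made for this sketch.
USERS_XML = <<XML
<users>
  <item username="Aragon" id="42"/>
  <item username="Legolas" id="43"/>
</users>
XML

# Returns the database id (as a string) for the given username,
# or nil when no item with that username exists.
def find_db_id(xml, username)
  root = REXML::Document.new(xml).root
  item = root.elements["item[@username='#{username}']"]
  item && item.attributes["id"]
end

puts find_db_id(USERS_XML, "Aragon")  # prints 42
```

Returning nil for an unknown username keeps the sketch from raising when the user is missing from the page.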

Then we imported data from the production database, and my script returned this error:

C:\ruby\lib\ruby\1.8/rexml/parsers/treeparser.rb:85:in `parse': #
Last 80 unconsumed characters:

There is a user with the username Aragón, and my script has a problem with it.
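
One plausible reading of the failure (an assumption, I have not confirmed it) is that the page text reaches the parser in the Windows ANSI code page while the parser expects UTF-8. The sketch below only illustrates that mismatch. It uses the String encoding methods of Ruby 1.9 and later, so it does not run on Ruby 1.8, and `readable_as_utf8?` is a hypothetical helper name.

```ruby
# Returns true when the bytes of text, once stored in source_encoding,
# still form valid UTF-8 if they are read as UTF-8 without conversion.
def readable_as_utf8?(text, source_encoding)
  bytes = text.encode(source_encoding)
  bytes.force_encoding("UTF-8").valid_encoding?
end

puts readable_as_utf8?("Aragon", "Windows-1252")       # prints true
puts readable_as_utf8?("Arag\u00F3n", "Windows-1252")  # prints false
```

A plain ASCII name has the same bytes in both encodings, which would explain why the script worked until a name with an accented letter showed up.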

The solution? Install Ruby 1.8.3 or 1.8.4 and put this at the beginning of your script:

require "win32ole"
WIN32OLE.codepage = WIN32OLE::CP_UTF8
tags: code - ruby