Wednesday, 15 February 2012

NSW Property price heatmap

I've been considering buying a property of late, but without a strong grasp of the greater Sydney area's geography I found I couldn't really judge whether a price was appropriate for a given suburb. Seeing as how much I love data, I threw together Property Hot Spots. It's backed by CouchDB and, aside from the initial tile generation, is essentially static. It could do with a design cleanup and perhaps a canvas-based heatmap that isn't so ugly, but it served its purpose so I figured I'd post it here in case others find it useful. The newest data displayed here is from Jan 2012; the oldest is from a few years back. I might add date of sale somehow in a future change if I can find an uncluttered way to display it. :)

Friday, 10 February 2012

How to read Japanese books above your Kanji level

After moving back to Australia, my kanji has gone seriously downhill. Some friends from Japan recently brought over the 3 books of Haruki Murakami's 1Q84 for me to study with, and I quickly found I was utterly useless on my own without a dictionary.

It's a chicken-and-egg problem that I presume many other students of Japanese also have. Books with simple kanji are often targeted at younger audiences and can be painfully boring for adults. Books targeted at adults assume a kanji level much higher than the one I currently have. The end result is that you get bored either by the content you're reading or by the constant turning to your dictionary.

So my new Japanese study pipeline involves:

  1. Scan a chapter at 300 DPI greyscale.
  2. Run OCR over it.
  3. Correct any errors. (I am seeing about a 95-98% accuracy rate so this is quick.)
  4. Save the text as UTF-8 HTML.
  5. Use Rikai-kun to get super-quick dictionary lookups when you need them.
This process is not as involved as it sounds. It's probably about 30-60 seconds per page on average, amortized across all the conversion tasks. I would have spent MUCH longer than this fumbling with my dictionary if I'd tried to read it the traditional way.
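
For anyone wanting to script step 4 of the pipeline above, here is a minimal sketch in Python. It is not the exact tool I used; the function names (text_to_html, save_chapter) and the one-paragraph-per-line layout are assumptions made for illustration. It takes the corrected OCR text, escapes it, and wraps it in a small HTML page that declares UTF-8 so the browser renders the kanji correctly.

```python
import html
from pathlib import Path


def text_to_html(text: str, title: str = "Chapter") -> str:
    """Wrap plain text in a minimal UTF-8 HTML page.

    Hypothetical helper: each non-empty line of the input becomes one <p>.
    """
    # strip() also removes the full-width space (U+3000) that often
    # indents Japanese paragraphs.
    paragraphs = [line.strip() for line in text.splitlines() if line.strip()]
    # Escape <, > and & so stray OCR characters cannot break the markup.
    body = "\n".join(f"<p>{html.escape(p)}</p>" for p in paragraphs)
    return (
        "<!DOCTYPE html>\n"
        '<html lang="ja">\n'
        "<head>\n"
        '<meta charset="utf-8">\n'
        f"<title>{html.escape(title)}</title>\n"
        "</head>\n"
        "<body>\n"
        f"{body}\n"
        "</body>\n"
        "</html>\n"
    )


def save_chapter(text: str, out_path: str, title: str = "Chapter") -> None:
    """Write the generated page to disk, explicitly encoded as UTF-8."""
    Path(out_path).write_text(text_to_html(text, title), encoding="utf-8")
```

Calling save_chapter(corrected_text, "chapter01.html") and opening the resulting file in the browser is then enough for step 5. Declaring the charset in the page and passing encoding="utf-8" when writing are the two places where the encoding most easily goes wrong.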

Also, as a bonus, I am now building up a personal digital copy I can carry with me much more easily than the 3 hardcover books on the shelf!