Sunday, 27 June 2010

KanjiVG + HTML5 Canvas = Free stroke animations for all!

I spent half the day yesterday at 梅.py (a Python hackathon event in Osaka hosted by Accense) writing a 100% HTML5 kanji stroke renderer that uses the excellent open-source KanjiVG dataset. The result is not what I'd call production-ready, but it still works quite well:

http://nippongrammar.appspot.com/

The KanjiVG XML data is parsed and loaded into a Google App Engine datastore, and a JSON interface is provided for querying individual kanji. The queried kanji's stroke data is returned along with a group of up to 63 other kanji to save on server round trips (the grouping is static).
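
For anyone curious about the parsing step, here's a minimal sketch of pulling the stroke data out of KanjiVG with Python. Whichever KanjiVG distribution you use, each stroke is an SVG <path> element whose d attribute holds that stroke's path data; the file name and function below are purely illustrative, not the code actually running on the backend.

import xml.etree.ElementTree as ET

def stroke_paths(svg_file):
    # Every stroke in a KanjiVG file is a <path> element whose "d"
    # attribute holds that stroke's SVG path data.
    tree = ET.parse(svg_file)
    strokes = []
    for elem in tree.iter():
        # Tags carry an XML namespace prefix, so match on the local name only.
        if elem.tag.endswith("path") and "d" in elem.attrib:
            strokes.append(elem.attrib["d"])
    return strokes

# e.g. the per-character file for 何 (U+4F55) from the KanjiVG data
print(stroke_paths("04f55.svg"))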

My next step is to move the data to static JSON files that can be hosted anywhere and write client-side code to download the appropriate kanji data files directly.
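
As a rough sketch of that plan: bucket the kanji into groups of 64 (matching the current grouping size) and write each bucket out as a plain JSON file that any static host can serve. Grouping by code point and the file-naming scheme here are my own illustration, not necessarily what the site will end up using.

import json

GROUP_SIZE = 64

def write_static_files(kanji_strokes, out_dir="data"):
    # kanji_strokes maps a kanji character to its list of stroke path strings,
    # e.g. the output of the parsing sketch above.
    groups = {}
    for kanji, strokes in kanji_strokes.items():
        group_id = ord(kanji) // GROUP_SIZE   # static grouping by code point
        groups.setdefault(group_id, {})[kanji] = strokes

    for group_id, entries in groups.items():
        with open("%s/kanji-%04x.json" % (out_dir, group_id), "w") as f:
            json.dump(entries, f)

# The client then only needs ord(kanji) // 64 to work out which file to fetch.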

Monday, 19 April 2010

Flash Quirks

IE Reserved Words for Method Names (and lack of debug message support?)

You know Internet Explorer is flaky when naming a method "play" or "stop" causes IE6, 7 and 8 to throw JavaScript errors.

A note for anyone with similar problems: this code cannot be used in IE.
ExternalInterface.addCallback("play", play);
ExternalInterface.addCallback("stop", stop);

In my case, it gave this cryptic error: "オブジェクトでサポートされていないプロパティまたはメソッドです。" ("Object doesn't support this property or method.") on line 48 of some unknown file. It took a little while to figure out what was going on!

Having done almost all of my debugging in Firefox, Chrome and Safari to date, I also didn't realise that IE won't correctly report uncaught Flash exceptions. NPAPI browsers display them as popups, but IE gives a similarly confusing error to the one above.

The Windows 7 64-bit SoundChannel Bug

Yet another sneaky little Flash 10.x bug that won't rear its ugly head until you start using a 64-bit Win7 machine. The flash.media.Sound class creates a flash.media.SoundChannel instance every time you play a sound. An audio player app I made for work creates a new SoundChannel every time the user seeks in the audio file. Internally, Flash has a limit of 32 concurrent sounds it will mix.

On Win7, though, the mixing channel is not freed when the sound stops; it seems to persist for the lifespan of the SoundChannel instance. This, coupled with an infrequent garbage-collection sweep interval, makes it possible for all available mixing slots to be in use even though no sound is playing. I can't seem to find a reference to this particular bug anywhere else on the web. Anyone else seen this one? Any workarounds?

Thursday, 15 April 2010

A comparison of IPC methods for Mac OS X

For a project I'm working on, I needed to transfer six 600x300 @ 30 fps video feeds from one process to potentially six others. That's around 31 MB/sec. It had to be fast, as other processing, compression and decompression were also happening simultaneously.
A survey of various IPC techniques was in order, and I was surprised I couldn't find any similar comparisons on the web. I spent about a day running through the options and benchmarking my results in a not-so-rigorous-but-still-quite-useful way. I figured I'd post my results here in the hope it saves someone a bit of time.
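
To give a flavour of the kind of measurement involved, here's a toy version of one data point: push a fixed amount of data through a Unix-domain socket pair between two processes and time it. This is a Python sketch rather than anything from the project itself, and the chunk and total sizes are arbitrary, so the absolute numbers mean little; it's the relative comparison across IPC mechanisms that matters.

import os
import socket
import time

CHUNK = 1024 * 1024        # 1 MB per write, roughly a frame's worth of data
TOTAL = 256 * CHUNK        # push 256 MB through the channel in total

def consume(sock):
    received = 0
    while received < TOTAL:
        data = sock.recv(CHUNK)
        if not data:
            break
        received += len(data)

parent_sock, child_sock = socket.socketpair()
pid = os.fork()
if pid == 0:                        # child process: read everything, then exit
    parent_sock.close()
    consume(child_sock)
    os._exit(0)

child_sock.close()
payload = b"\x00" * CHUNK
start = time.time()
for _ in range(TOTAL // CHUNK):
    parent_sock.sendall(payload)
parent_sock.close()
os.waitpid(pid, 0)
elapsed = time.time() - start
print("%.1f MB/sec" % (TOTAL / elapsed / (1024 * 1024)))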

Monday, 29 March 2010

Removing Mosaics (Part 1)

I had an interesting chat with a friend last week in which I mentioned how I thought video mosaics provide very little security and shouldn't be relied upon to protect the identity of anyone or anything. It was the now-famous untwirling case, which led to the capture of a paedophile in Thailand, that originally got me thinking about this. Anyway, after an hour of chatting I was convinced I could decode at least a subset of mosaic use-cases, so I set off on my quest to learn OpenCV. Long story short: don't use a mosaic to protect someone's privacy.

Friday, 12 February 2010

川湯温泉

I spent last weekend at a curious place I'd heard stories about since I first came to Japan: Kawayu Onsen, an area of Wakayama Prefecture where warm volcanic rock runs beneath the riverbed in such a way that you can literally dig your own onsen (hot bath). Being the middle of winter, it was pretty damn cold outside, so our digging efforts were soon abandoned in favour of the pre-dug pool 200 metres downriver from where we were staying, but I can vouch for the fact that there were indeed puffs of gas rising from the shallow, rocky river directly outside our hotel window.

The hotel we stayed at (木の国温泉), a traditional ryokan, was quite old, but the food and the price were nothing short of incredible. Freshwater fish, nabe, sashimi, gratin, miso soup ... a banquet fit for kings. For anyone looking for an escape from busy Osaka life and something a bit different, this place is definitely worth considering.


Wednesday, 3 February 2010

The state of sound in Win32

To flesh out my video-conferencing-related posts, I guess it's only fair I have one on sound as well. Windows sound systems, to be precise. As I've recently learned, there are *5* different systems that deserve a mention.

Windows Multimedia Extensions (WMME)

This has existed since the dawn of time (Win3.1) and provides a nice fail-safe API for sound output. It generally only works for one device at a time unless it's emulated (the case on WinXP and up). The latency you get when using this API makes it completely impractical for video conferencing.

DirectSound

This used to be part of DirectX and is now part of the platform API (DirectShow?). It apparently provides latency down to around 50 ms. On Vista I have seen speaker-to-mic latencies ranging from 60 ms to 220 ms.

WDM Kernel Streaming (WDMKS)

This API provided the closest access to the "bare metal" possible until Windows Vista. WDM KS gives much better latency than WMME and DirectSound, but it generally means opening a device for exclusive access, and you can't steal access to a device from another running application. I don't believe the average user understands that a paused YouTube video is still holding their sound device, so this API is written off for practical reasons.

ASIO

Steinberg created this API to provide low-latency audio access for professional applications. It solves the latency issues but requires ASIO drivers, which only high-end cards seem to have. There is a project called ASIO4ALL, but it achieves ASIO support on regular devices through the WDM KS API, making it no better than accessing WDM KS directly.

WASAPI

This was added in Vista and seems to be the best sound API that Microsoft have come up with. WASAPI internally uses 32-bit floating-point samples at a fixed sample rate (determined by the device drivers, typically 48 kHz). Mixing is done by the OS and tracked per application. In the general case, under Vista and Windows 7, this is the only API that directly accesses the hardware. This might not be the case with legacy drivers, but it seems to be true of all the systems I've tested audio code on. The other sound APIs are emulated on top of WASAPI.

(I need to check whether this is the case for WDM KS; it may bypass WASAPI rather than run on top of it.)
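
As an aside, PortAudio wraps all five of these host APIs on Windows (depending on how it was built), and its Python binding, PyAudio, makes for a quick way to see what a given machine actually exposes. This is just an inspection tool I find handy, not something the notes above depend on, and the latencies it reports are PortAudio's default estimates rather than measured speaker-to-mic figures.

import pyaudio

pa = pyaudio.PyAudio()

# List the host APIs PortAudio was built with (MME, DirectSound, WASAPI, ...).
for i in range(pa.get_host_api_count()):
    api = pa.get_host_api_info_by_index(i)
    print("%s (%d devices)" % (api["name"], api["deviceCount"]))

# List output devices with their default sample rate and low-latency estimate.
for i in range(pa.get_device_count()):
    dev = pa.get_device_info_by_index(i)
    if dev["maxOutputChannels"] > 0:
        print("  %-40s %6.0f Hz   %5.1f ms" % (
            dev["name"], dev["defaultSampleRate"],
            dev["defaultLowOutputLatency"] * 1000))

pa.terminate()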

Tuesday, 26 January 2010

x264's new bells and whistles

When it comes to H.264 encoding, most people know of x264.

Dark Shikari and the x264 team have consistently produced solid, fast and feature-complete code - in many cases better than commercial offerings. This alone is nothing short of incredible considering the budget, geographical distribution and sheer complexity of the encoder itself.

Even though it's already one of the best encoders out there, used by the likes of Facebook, YouTube and commercial cable companies, the team shows no sign of slowing down. Last week saw a couple of patches committed to the main x264 git repository that have really got my head spinning with (low-latency) possibilities.