About Me

My photo

Application Developer who loves writing beautiful code and enjoys learning something new everyday.

Thursday, January 31, 2013

Getting Australian POIs from Sensis API

The App

Over the last X-mas/New Year break, I wrote a simple Web app to search for Australian Points-Of-Interest (POIs), such as toilets, petrol stations, rest areas, picnic grounds, hospitals, etc. Other than petrol stations (considered as "business" listings), the rest are mostly non-business entities.

Here is the URL of the app:

http://aus-pois.appspot.com

This web app works on desktop PCs and tablets with the following browsers:- Firefox, Chrome, Safari or the Android default browser. There is also an Android application Aus Pois that renders the JSON feed from this web app.

The Algorithm

This app gets the POI data from the Sensis API (SAPI). This is the same API that powers the Australian Yellow Pages website. Here's the URL to get the nearest POIs within 2 kms of the location, with Category Id 1007397:

http://api.sensis.com.au/ob-20110511/prod/search?location=-37.58317709893027,145.21661812353523&radius=2&sortBy=DISTANCE_ALL_PAGES&sensitiveCategories=true&categoryId=1007397&includePois=true&key=(SAPI key)

As stated above, petrol stations are considered normal business listings rather than POIs in SAPI, but they can be retrieved together as the real POIs simply by specifying the category ID for "Service Station" in the URL.

The Technology

The Web front-end is done with Google Maps (plus the geometry library), HTML5 (for geolocation and storage) and lots of jQuery goodness.

The server-side logic is implemented on top of Gaelyk, a really cool Groovy framework that allows for almost seamless deployment of the app to Google App Engine.

Monday, July 2, 2012

Moving large data between Oracle database and CSV

I'm currently working on a project that involves importing and exporting large volumes of data between spreadsheet-friendly comma-separated-value (CSV) files and an Oracle database. Although import/export of data can be done using a GUI tool like SQLDeveloper, it is not practical for large volumes or when such jobs need to be scripted. This blog describes how this can be done using BASH scripts and the command-line tools sqlplus and sqlldr.

Scenario

Suppose I run an online shop and maintain a list of purchase transactions with fields such as item_id, customer_id, delivery_status, etc. On my Oracle database, I have PURCHASE_TRANSACTIONS, ITEMS and CUSTOMERS tables with the appropriate joins.

Exporting from Oracle DB to CSV

To export data from the Oracle database table (normally with additional joins and filtering criteria), create a SQL file with:

  • a query to write out the header row of the CSV
  • a query to get the data and concatenate them together in a CSV format

Something like this to export a list of items pending delivery:

set head off;
set feed off;
set trimspool on;
set linesize 32767;
set pagesize 32767;
set echo off;
set termout off;

spool &1;
select 'ITEM_ID,ITEM_NAME,PRICE,CUSTOMER_ID,CUSTOMER_NAME,CUSTOMER_ADDRESS' from dual;
select pt.item_id
|| ',"' || itm.name
|| '",' || itm.price
|| ',' || pt.customer_id
|| ',"' || c.name
|| '","' || c.address
|| '"'
from PURCHASE_TRANSACTIONS pt
join CUSTOMERS c on pt.customer_id = c.id
join ITEMS itm on pt.item_id = itm.id
where pt.delivery_status = 'PENDING'
order by pt.customer_id;

spool off;
exit;

Note the double-quotes around textual data. To execute this, save it to a file named export-pending.sql and write a simple BASH script like so:

sqlplus -S MY_DB_USER/MY_DB_PASSWORD@MY_DB_SERVER:1521/DB_SID @export-pending.sql pending-delivery-report.csv

Running this script will write out the file pending-delivery-report.csv with contents like:

ITEM_ID,ITEM_NAME,PRICE,CUSTOMER_ID,CUSTOMER_NAME,CUSTOMER_ADDRESS
432,"Super Mouse Pad",23.5,11,"John Smith","45 Some Street, Sydney, NSW 2000"
877,"USB Fan",14,11,"John Smith","45 Some Street, Sydney, NSW 2000"
344,"NeverDie UPS",125.3,67,"Alex Jones","101 Scary Road, Melbourne, VIC 3000"
...

This export functionality is typically used to generate spreadsheet friendly reports.

Importing from CSV to Oracle DB

Suppose I have a CSV file items-data.csv containing a new list of items that I want to ADD to the ITEMS table, in the following format:

ID,NAME,PRICE
1001,'Lightbulb',1.95
1002,'Car battery',98
1003,'Super cool camera',450.50
...

To load this into the database, create a control file load-items.ctl with the following:

load data
append
into table ITEMS
fields terminated by ',' optionally enclosed by '\''
(
    ID,
    NAME,
    PRICE
)

Note that in order to REPLACE the entire table's contents with the CSV data, simply change the append instruction (second line) to replace. Finally, create a simple BASH script with the following to run sqlldr:

sqlldr MY_DB_USER/MY_DB_PASSWORD@MY_DB_SERVER:1521/DB_SID control=load-items.ctl data=items-data.csv direct=true rows=1000 skip=1 errors=1 log=load-items.log bad=load-items.bad discard=load-items.discard

After executing this script, check the items-data.log for the progress of the data load. The load-items.bad and load-items.discard files contain the data with errors.

Sunday, July 1, 2012

Mandriva 2011 : the painful way

Last week, I started setting up a new home desktop computer. As all my other home machines were running Mandriva, I decided to install the latest Mandriva 2011 on this new one. Should only take a couple of days to get everything up and running, or so I thought. This blog describes the funpain of actually getting everything to work.

Networking

The new system came with an Asus P8H61 motherboard that had a Realtek RTL8111/8168B Ethernet controller. Sounded like very ordinary hardware that would definitely just work with Mandriva, if not for the fact that the kernel picked the wrong driver! It wrongly loaded the r8169 kernel module, instead of the required r8168. It took me 3 DAYS of messing about :- checking all my cables, reconfiguring my router, updated my kernel version, trying various network settings, etc, before I realized it was all because of this stupid kernel bug.

To fix it, I first had to get the right r8168 driver from the Realtek website. The file I got was r8168-8.031.00.tar.gz2. After unpacking everything and making sure I had the kernel source and headers, I ran:

$ cd r8168-8.031.00
$ ./autorun.sh

which gave me the following error:

WARNING: Symbol version dump /usr/src/devel/2.6.39.4-5.1-desktop/Module.symvers
           is missing; modules will have no dependencies and modversions.

Why happened here? Why would the kernel source not have this file that was required for building kernel modules? I had no choice but to rebuild the whole kernel just to get this file, that I then copied to where the Realtek driver expected to find it:

$ cd /usr/src/linux-2.6.39.4-5
$ make config
$ make
$ cp Module.symvers /usr/src/devel/2.6.39.4-5.1-desktop

After that, the Realtek script worked as expected. It unloaded r8169 and swapped in r8168. Finally, I had network connectivity. To make sure my changes stuck, I modified /etc/modprobe.conf to say:

alias pci:v000010ECd00008168sv00001043sd00008432bc02sc00i00 r8168
alias eth0 r8168

and then added r8169 to the module blacklist files :

$ echo 'blacklist r8169' >> /etc/modprobe.d/blacklist-brcm80211.conf
$ echo 'blacklist r8169' >> /etc/modprobe.d/blacklist-compat
$ echo 'blacklist r8169' >> /etc/modprobe.d/blacklist-mdv

Graphics card

Next challenge:- getting hardware direct rendering working. This system had a low-end (for work, not gaming) 1GB Gigabyte GT210 graphics card running on nVidia GeForce 210.

Understandably, Mandriva had installed the free Nouveau driver by default. What I didn't expect was that installing the nVidia driver didn't actually cause it to replace the nouveau driver. Instead, running nvidia-xconfig simply broke X (thankfully it also backed up the original xorg.conf so I could revert back). Wasted several hours of my life before figuring out I first had to uninstall nouveau, and then re-installed the nVidia rpms below:

dkms-nvidia-current-295.40-2-mdv2011.0.i586
x11-driver-video-nvidia-current-295.40-2-mdv2011.0.i586

Finally, I ran XFdrake one more time and it prompted me to use the nVidia driver.

Monday, December 5, 2011

My highlights of YOW! 2011 Melbourne

I attended the YOW! 2011 Conference in Melbourne on 1-2 December 2011.

The talks


There were many great talks, and some not-so-great. For me, two talks in particular stood out as special. If I only got to listen to these two and nothing else, it would've been worth it.

Product Engineering :- Mike Lee


I first saw Mike Lee's talk on InfoQ several months ago and was really inspired by the passion of that guy, and was so happy to discover he was making an appearance at YOW.

Mike Lee

He didn't impart any novel technical knowledge to us, nor did he tell us about the latest agile technique. In fact, everything he said in YOW was common-sense in hindsight, and virtually identical to the talks on InfoQ. And yet, it was pure pleasure to watch this man share his vision of what constituted a good product. I just did not get bored listening to this same talk for the 3rd time! Maybe it was his costume.

Two things that stood out from the talk:
  • The hook :- the thing that makes a product special
  • Appsterdam :- New IT hub for freedom loving developers


Temporally Quaquaversal Virtual Nanomachine Programming In Multiple Topologically Connected Quantum-Relativistic Parallel Timespaces...Made Easy! :- Damian Conway


Speaking of the hook, the title of this talk did the job. Ever since I watched Fred Simon's JavaPosse lightning talk on Positronic Variables a few months ago, I was curious about the Damian Conway's original work. As luck would have it, Damian came to tell us all about positronic variables.



Damian's Perl (and Rod logic) code was not very familiar to me, but his tongue-in-cheek introduction to quantum physics and the multi-verse was pure entertainment. Personally, I was not convinced that an Einstein-Rosenberg bridge could be harnessed as described for running computer programs due to excessive energy requirements. A computer program using positronic variables would probably be unstable and generate incoherent results due to the unpredictable superposition of multiple variations of positronic values from an indeterminate number of future multi-verses.

Other notable talks


Jim Webber gave his usual talk on RESTful systems. I'd listened to his similarly themed talks before, but it was still entertaining to watch Jim Webber's performance. The one lesson that stuck in my mind was: Use an ATOM feed to publish events so that different clients could independently maintain / reconstruct their own internal states.



Another interesting talk was Dan Ingalls telling us about Lively Kernel. Dan built a simple online load monitoring service simply using drag-on-drop from the Lively Kernel interface... cool.

The games


As usual, there were lots of fun and games between talks. Sensis developed a online (mobile) quiz game, CodeJuicer, that morphed into to be a fun hacking game instead. Too bad I did not have a laptop with me to hack on it all day, though I did manage to hack it to zero-time from home at the end of Day 1. Thoughtworks put up a quiz game and a movie title matching game, plus frozen nitrogen ice cream... all very popular. Atlassian let us shoot Nerf guns... hard to beat that.

The food


First day's food was not too good, but got better on the second day. The Beef Stroganoff was delicious.



The best parts were the free (real) coffee paid for by Aconex, and the desserts, especially the mango mousse.



Looking forward to YOW! 2012.

Monday, December 6, 2010

Personal recollections of the YOW Melbourne Conference

I attended the YOW!2010 Australia Software Developer Conference in Melbourne last week, and here is some recollection of my favourite talks. Please note this is all from memory, so I might some parts wrong.

Main themes


From the sessions I attended, some of the main recurring technical themes in this conference were:
  • Consistency should be traded for Availability in typical distributed systems
  • Eventual consistency over ACID
  • Always expect network failures, i.e. systems must be able to tolerate partitioning
  • NoSQL


Most informative talks


The Rich Get Richer: Rails 3 :- Obie Fernadez


Obie gave a enlightening summary of the events leading to and the people involved in the merging of the Merb and Rails codebases, resulting in the newly released Rails 3.0. The main drivers behind the new release were decoupling, modularity and prevention of code bloat. Many components of Rails 2 had been abstracted out and managed as third-party plug-in projects. He described Rack, Bundler, Arel and the upgrade process from Rails 2.x to 3.0. I look forward to the day when I get the chance to put all these into practice.

Integrated Tests Are A Scam :- J. B. Rainsberger


Having written myriads of tests over the years, it was easy to lose sight of the different roles of tests in different parts of the system. JB gave an excellent refresher on two fundamentals of unit tests : interaction testing and contract testing.


Interaction tests were the "normal" unit tests that I've been writing, where the external dependencies were replaced with stubs/mocks . In the example, he described interaction tests for a client assessing an external supplier. These tests were concerned with the pattern in which the system-under-test (SUT) called out to the external system (mocked out), and how the SUT responded to various return values from the external system (stubbed).


Contract tests referred to tests that verified that an external system satisfied a certain API and expected behaviour, i.e. the contract. In a strongly-typed language like Java, this contract could be abstracted out from the actual implementation of the external system into a public interface, that our code could depend on. In fact, our code would only depend on this public interface, with no knowledge of the implementation. In a strongly-typed language, any implementation of this interface would then satisfy the minimum requirements of the contract, therefore, the real implementation could be replaced with a stub. As long as the stub passed the contract tests, it would in theory make no difference to our code. Hence, we could then write integrated tests for our SUT that used the stub instead of the real implementation, which would make such tests extremely fast.

Personally, I was not too convinced that interaction and contract tests could fully replace the need for integrated tests against real external systems. There would always be unexpected surprises lurking in system configuration, and some behaviour could not be adequately captured by the abstracted interface. Furthermore, in dynamically typed languages, it would be much harder to enforce that the stub fully implemented the public interface.

Designing and Implementing RESTful Application Protocols :- Ian Robinson



Ian walked us through an example from his new book, REST in Practice, to illustrate the interactions of a client program ordering coffee beans from a coffee supplier via a RESTful interface.

The client first asked the server for a list of allowed operations. After that, it requested a quote of available product prices. It then placed an order by resubmitting all the pricing data it had previously received back to the server. In this way, the client could maintain conversation state over a stateless protocol. A checksum of the data was used to prevent price tampering by the client. Finally, the server immediately returned a HTTP 202 response, before kicking off the long-running order fulfillment process asynchronously.

Ian explained several basic concepts such as the use of media types, link relations and XForms in driving the process flow.

I found Ian's talk to be clear and concise, therefore I'm putting his book on my wish-list.

Most interesting talks



Exploring NoSQL :- Erik Meijer


This talk started a bit a slowly with an (mostly unecessary) explanation of how an object graph was represented in memory using pointers. This type of representation was embodied by noSQL, whereas SQL embodied a fully normalized relational view of data.

Things became interesting when Erik started describing Category Theory. Erik reasoned that because the direction of parent-child relationships was reversed between SQL and noSQL, they were duals of each other according to Category Theory. In other words, noSQL was actually "coSQL", i.e. for any statement that was true of SQL, the dual-statement (opposite meaning) would hold true for noSQL. Some examples of this duality were:




SQLnoSQL
The identity is embodied in the row, i.e. primary key (extensional)Identity of an object depended on other objects in the environment (intensional)
Closed-world view of dataOpen-world view of data
Synchronous operationsAsynchronous operations


A very interesting concept indeed. Furthermore, this talk had the most obscure slide ever...



Computational Information Design :- Ben Fry


This talk was worth attending just for the visual candy itself. Ben talked about data visualization using his Processing programming language (IMO really a DSL over Java 2D/3D). Some of the examples shown were really amazing, such as:
  • Mapping the program flow in old cartridge-based video games
  • Tracking changes to six editions of Charles Darwin's work
  • Comparing the human genome with other mammals



Ben then introduced Processing.js that allowed code written in Processing to be run on any HTML5 compatible browser. People could simply go to sketch.processing.org to write and run Processing code all within a Web browser. What was even more exciting was a port of the Processing runtime to Android, which meant I could run Processing programs on my smartphone. Super cool!


Workshop


For me, the best part of the conference was the iPhone workshop. Having done some Android development, I was eager to see how things worked on the iPhone side. In this half-day workshop, I learned some basic syntax of Objective-C, basic usage of xcode IDE, some iPhone frameworks, and finally put together a simple iPhone application. My verdict of Objective-C :- dynamic typing and late binding were nice, but the extremely verbose syntax really sucked. I guess I'll stick with Android for now.