Tuesday 9 September 2014

Receipt scanner

Ever since I started living on my own more than a year ago, I've wanted to keep track of my income and expenditure digitally. I used to keep all my grocery receipts so that I could enter them into a computer. But typing all the relevant data from every single receipt by hand is waaaaay too much work. I did start an Excel document in the beginning, adding only the totals, but even this was too much work: I had to open Excel and modify the document every single time I went to a store, which is very impractical. But as an Artificial Intelligence student, I knew I could do better. There had to be a way to scan a receipt and put the data directly into some kind of graph.

Don't stare directly at the noise - this image was taken with my recently re-repurchased Sony Ericsson Satio, purely for convenience.

Now I've finally found a way to easily do the scanning part with my phone. It requires three applications: Tasker, Google Goggles and Dropsync (assuming you already have a Dropbox account). I had noticed that Google Goggles can scan pieces of text and lets you copy the text to the device's clipboard. That's where the fun started. In case you're unfamiliar with Tasker, it basically allows you to program your phone to perform tasks whenever a user-defined state or event (a profile) occurs. These profiles and tasks are fully customisable, enabling you to do essentially anything with your phone. From changing your phone's brightness when you're connected to your home Wi-Fi network, to reading the day's calendar events out loud when you wake up in the morning - creativity is the limiting factor, not the application itself.

I set up Tasker to show a persistent notification for as long as the Google Goggles application is running in the foreground. If you press the notification, Tasker performs a task in which the contents of the clipboard are pasted into a new file, "receipts.xml" (I chose .xml instead of .txt as the latter didn't show newlines correctly). The last part of my 'scanner' makes sure the scanned receipt files are sent to my computers. For this I used Dropsync, which automatically detects changes to specified files and synchronizes them with Dropbox (Dropsync basically does on a phone what the desktop Dropbox client does on a computer, whereas the Dropbox app itself only gives online access without synchronizing). What I get is a scanner: I open Google Goggles to take a picture of a receipt, copy the text to the clipboard and press the notification, which is only present at that moment. Magic then makes the digital versions of the receipts appear in a Dropbox folder across all my devices!

Oh, how much I love Tasker...

This is only the start, however. It's the start I needed for collecting data, which is important enough, as physical receipts decay in very little time. But the most important part is always data manipulation. As people like to say, we live in an era where Big Data is a thing. There is much more data than we can use as of now, not because we don't want to, but because we don't have the resources to manipulate all of it. For me and my Little Data, I will soon start trying to program .xml data manipulation in C, if this is at all possible. Unfortunately, C is the only programming language I know, and I cannot afford to learn a new language with my studies being rather demanding. My goal for the receipts program is to manipulate the raw text to sort the receipts by their exact date and time, and to put the products and their prices in a statistically useful list. Here's where my current course, Statistics in R, comes in handy (yup, R is the second programming language I'll master, after I master C)!
(No I don't know why they use single letters for their languages, except that programmers are usually extremely f*cking lazy which might be the cause here)
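
To give a rough idea of what that C step could look like, here is a minimal sketch, not a finished program. Everything about the receipt layout is an assumption: it supposes that each product line ends in a price with two decimals (such as "Whole milk 1L 0,89"), that there are no trailing spaces, and that the date and time appear as "DD-MM-YYYY HH:MM". The names Item, Stamp, parse_item, parse_stamp and compare_stamps are made up for this illustration, and real scanned text will be a lot messier.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

#define NAME_MAX_LEN 64

/* One product line of a receipt (hypothetical layout). */
typedef struct {
    char name[NAME_MAX_LEN];
    long cents;              /* price in cents, to avoid floating-point rounding */
} Item;

/* Date and time of a receipt (assumed format: DD-MM-YYYY HH:MM). */
typedef struct {
    int year, month, day, hour, minute;
} Stamp;

/* Parse a line such as "Whole milk 1L 0,89" or "Bread 1.25".
   Returns 1 on success, 0 if the line does not end in a price. */
int parse_item(const char *line, Item *out)
{
    assert(line != NULL && out != NULL);

    /* The price is assumed to be the last space-separated token. */
    const char *sep = strrchr(line, ' ');
    if (sep == NULL || !isdigit((unsigned char)sep[1]))
        return 0;

    char *end;
    long whole = strtol(sep + 1, &end, 10);
    if (*end != '.' && *end != ',')
        return 0;
    if (!isdigit((unsigned char)end[1]) || !isdigit((unsigned char)end[2]))
        return 0;
    if (end[3] != '\0' && end[3] != '\n' && end[3] != '\r')
        return 0;

    /* Everything before the price is the product name. */
    size_t len = (size_t)(sep - line);
    while (len > 0 && line[len - 1] == ' ')
        len--;
    if (len == 0)
        return 0;
    if (len >= NAME_MAX_LEN)
        len = NAME_MAX_LEN - 1;   /* truncate overly long names */
    memcpy(out->name, line, len);
    out->name[len] = '\0';

    out->cents = whole * 100 + (end[1] - '0') * 10 + (end[2] - '0');
    return 1;
}

/* Parse "DD-MM-YYYY HH:MM". Returns 1 on success, 0 otherwise. */
int parse_stamp(const char *text, Stamp *out)
{
    assert(text != NULL && out != NULL);
    return sscanf(text, "%d-%d-%d %d:%d", &out->day, &out->month,
                  &out->year, &out->hour, &out->minute) == 5;
}

/* Comparison function for qsort: oldest receipt first. */
int compare_stamps(const void *a, const void *b)
{
    const Stamp *x = a, *y = b;
    if (x->year != y->year)   return x->year - y->year;
    if (x->month != y->month) return x->month - y->month;
    if (x->day != y->day)     return x->day - y->day;
    if (x->hour != y->hour)   return x->hour - y->hour;
    return x->minute - y->minute;
}
```

With these pieces, a program could read receipts.xml line by line, call parse_item on each line to collect products and prices, and pass an array of Stamp values to qsort together with compare_stamps to put the receipts in chronological order. Storing prices as whole cents keeps the totals exact, which is handy before handing the list to something like R.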
