Dev Log: Spend Online

One of my projects at the moment is creating a small web app that interfaces to the Spend iPhone Application by Adamcode. The app's a 59p bargain which simply lets you assign expenses to budgets while on the move, and so keep an eye on how fast it's flowing out. I'm using it to try and improve the amount of money I can pay into my holiday and savings funds.

The app itself is nicely designed; the key thing is that it's simple, usable and uncluttered. In which vein, a picture paints a thousand words, so here's the interface:

To assign an expense to a budget, you simply tap the budget and add an amount and description. The main interface then shows the amount you've spent or have left; budgets within range are green, those overspent are red. Budgets can refresh at various frequencies and optionally roll over.

This is great; it gives a great "at a glance" view, but not much more. It doesn't give me historical data, trends, charts, or summaries of how well I generally adhere to my budgets.

And, frankly, I really don't want it to. It'd clutter up the app, and would most likely be hard pressed to display enough controls or data to be useful within the confines of the iPhone interface. What Spend does, it does well.

What it doesn't do, I can do on my desktop.

The iPhone's data export abilities are incredibly limited, but Spend lets you drop CSV data to an email and send that to yourself. The core aim is that you can then copy that into a .csv file and load that into a spreadsheet. It's a good solution to the iPhone's limitations, but trying to manage sequential updates this way is bound to get trying; it's slow and too manual, and stitching together sequential and overlapping data and deletions is awkward.

So, what's a web software engineer to do? Well, write a website obviously. When the only tool you have is a hammer...

Seriously, though, the web solution's a good one. Sites can easily accept data input by email, and PHP's great for the data management. Add SVG charts and we have all we need.

So, the plan (version 1) is:

- Mail data to a robot address
- Create user accounts for any new source email addresses.
- Notify users by email.
- Stick data in a database
- Analyse data in various formats
- Spend less money
- Profit!

Since it's almost as easy to write a web app for everyone as for one person (if you code securely and sanely), I figured I'd plan for wide usage. It's an interesting challenge and also a chance to do something useful.

The format of the incoming mail though provides some challenges:

- Only expenses are exported, not the budgets they're accrued against.
- There's a slight data bug with regards to commas in text not being properly escaped in CSV.
- Dates are exported according (as far as I can tell) to the individual iPhone's locale settings, and there's no way of telling whether 6/7/2008 is an American iPhone sending data from June 7th, or a UK one sending it from July 6th. Assume the wrong one and your data's a mess.

There are various approaches that can be used to deal with these:
- Manual budget entry into the web app (budgets change slowly, expenses are several-per-day)
- Potentially merge any fields displaying too many commas
- Code for UK dates first and worry later
- Hope for import periods of at least 2 weeks, which would guarantee at least one date that'd be invalid if days and months were reversed.
- Do other heuristics on import data to check order.

It's worth noting that I'm coding this test-driven, so I'll have strong regression tests, a modular design, and the ability to improve and optimise as I go. However, I'd still rather not working on code that guesses what it's doing.

So, to keep it simple, I'm handling UK dates for now, but I also got in touch with the developer of the app to raise the actual bugs, suggest Budget export and let him know my plans. Adam was polite and responsive, and took the suggestion on board even before I started coding.

This week I've got some time off work and have started coding. The first part needed was the input mail parser, which:

- Reads in the whole email as text.
- Identifies the sender.
- Identifies the time range covered.
- Collects up and parses the CSV rows

It doesn't try and store this data, or catch the incoming emails; instead it provides the parsing function that another module can use to do so. That encapsulation makes testing easier and also works well with the brief periods of time I have available to code.

It also lets me test my assumptions as I code, because every class and function I code can be run almost immediately.

Some of these early assumptions included:

- Users will send emails of various expenditures at various times.
- These emails will overlap in the time they cover.
- Users may correct or delete items they've already sent.

This means I need to be able to uniquely identify transactions that may change name, amount and budget. The logical solution is therefore to use the timestamp (per user) as the UID - after all, you can't create two transactions at the same moment.

Well, early development points out that you actually can. Timestamps are only granular to the minute, and if you go to three shops and then update Spend after lunch, the very slick interface means you can enter at least two of them within the same minute.

Oops. No UID. And since the email export's main purpose is human usability, and I'm only one user of many, it doesn't seem very sensible (or polite) to ask Adam to recode the export (although a generic "spreadsheet optimised" output may have its uses, I'd rather not need it if I can avoid it). That said, the iPhone's update mechanism does mean that users can generally get the latest version of the app pretty easily.

But does it matter if we don't have an exact, causal UID? Possibly not. If we get two synchronous transactions we can increment one by a second. The update mechanism means we get all transactions for a minute or none of them, so we can handle time scoping and updates even if we shuffle which means which - so long as we're not trying to attach unique data such as comments. For that, we'll need a guaranteed UID.

The mechanism for handling incoming emails is then as follows:

- Identify the time range a new import covers
- Identify the sender of a new import
- We need to store all transactions which don't already exist for this time/account pair
- Delete any between this start and end date that don't exist in this import
- Update any already in the DB

So far, that all works. The main problem so far is the lack of specificity in dates; for that, export metadata may be needed, or a transition to YYYY/MM/DD format, which is unambiguous.
Posted by parsingphase, 2008-10-22 19:59

Anonymous user



Contact Richard