Mirko Caserta

An Introduction to Time Representation, Serialization and Management in Software

Most issues in software development arise from poor, inconsistent knowledge of the domain at hand. A topic apparently as simple as time representation, serialization and management can easily cause a number of problems for both the neophyte and the experienced programmer.

In this post, we’ll see that there’s no need to be a Time Lord to grasp the few, very simple concepts needed to stay out of time management hell.

Representation

A question as simple as “What time is it?” relies on a number of contextual subtleties that are obvious to the human brain, but are absolute nonsense to a computer.

For instance, if you asked me the above question right now, I might say: “It’s 3:39” and, if you were a colleague in my office, that’d be enough information to infer that it’s 3:39pm CEST. That’s because you would already be in possession of some important bits of contextual information, such as:

  • it’s an afternoon because we’ve already had lunch
  • we’re in Rome, therefore our timezone is Central European Time (CET) or Central European Summer Time (CEST)
  • we switched to daylight saving time a few weeks ago, so the current timezone must be Central European Summer Time

3:39 is a convenient representation of time only as long as we’re in possession of the contextual bits. In order to represent time in a universal way, you should have an idea of what UTC and timezones are.

Now, suppose I have to schedule a Skype chat with a fellow software developer in the US. I could write him an email and say something along the lines of “see you on 2/3”. In Italy, that would be the second day of March but, to a US person, that would be the third day of February. As you can see, our chat is never going to happen.
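To make the ambiguity concrete, here is a minimal Python sketch that parses the same string under the two conventions. The year 2013 is an assumption added only so the string is a complete date.

```python
from datetime import datetime

raw = "2/3/2013"  # the year is an assumption added for illustration

# Day first, as an Italian reader would expect.
italian_reading = datetime.strptime(raw, "%d/%m/%Y").date()
# Month first, as a US reader would expect.
us_reading = datetime.strptime(raw, "%m/%d/%Y").date()

print(italian_reading.isoformat())  # 2013-03-02
print(us_reading.isoformat())       # 2013-02-03
```

Same characters, two dates almost a month apart.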

These are only a few examples of the kind of issues that might arise when representing date and time information. Luckily enough, there is a solution to the representation conundrums, namely the ISO 8601 standard.

Just to give you an example, in ISO 8601, 1994-11-05T08:15:30-05:00 corresponds to November 5, 1994, 8:15:30 am, US Eastern Standard Time. 1994-11-05T13:15:30Z corresponds to the same instant (the Z stands for UTC). Same instant, different representations.
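As a quick check, the two strings above can be parsed and compared. This is a sketch using only the Python standard library; it assumes Python 3.7 or later, where the `%z` directive accepts both `-05:00` and `Z`.

```python
from datetime import datetime, timezone

# %z accepts offsets with a colon and the literal "Z" on Python 3.7+.
ISO_FORMAT = "%Y-%m-%dT%H:%M:%S%z"

eastern = datetime.strptime("1994-11-05T08:15:30-05:00", ISO_FORMAT)
utc = datetime.strptime("1994-11-05T13:15:30Z", ISO_FORMAT)

# Timezone-aware datetimes compare by instant, not by wall-clock digits.
same_instant = eastern == utc
eastern_as_utc = eastern.astimezone(timezone.utc).isoformat()

print(same_instant)    # True
print(eastern_as_utc)  # 1994-11-05T13:15:30+00:00
```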

The ISO 8601 standard also has the nice side effect of providing natural sorting in systems that use lexicographical order (such as filesystems) because information is organized from most to least significant, i.e. year, month, day, hour, minute, second, fraction of second.
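The sorting property is easy to verify. The sketch below uses a few made-up sample timestamps, all in the same format and all in UTC, and compares plain string sorting with sorting by the parsed instant.

```python
from datetime import datetime

# Sample timestamps (illustrative values), same format, all in UTC.
stamps = [
    "2013-04-17T09:42:07Z",
    "1994-11-05T13:15:30Z",
    "2013-04-02T23:00:00Z",
    "2012-12-31T23:59:59Z",
]

def to_instant(text):
    """Parse an ISO 8601 string into a timezone-aware datetime."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S%z")

lexicographic = sorted(stamps)                  # plain string comparison
chronological = sorted(stamps, key=to_instant)  # comparison by instant

print(lexicographic == chronological)  # True
```

Note that this only holds when every string uses the same format and the same offset; strings with mixed offsets still need to be parsed before sorting.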

Even if you’re only dealing with local times in your software, you should know that, unless you also display the time zone, you can never be sure of the time. I cannot remember how many times a developer has asked me to fix the time on the server, only to discover that his software was printing time in UTC.

At display time, it is okay to deal with a partial representation of time because the user experience requires it. Just make sure, when debugging, to print out the whole set of information, including the time zone; otherwise you can never be sure that what you’re looking at is what you think it is.
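As an illustration of the difference, here is a small Python sketch that formats the current instant in a partial, display-friendly way and in a complete, debug-friendly way. The variable names are just for this example.

```python
from datetime import datetime, timezone

now = datetime.now(timezone.utc)  # timezone-aware, not a naive local time

for_display = now.strftime("%H:%M")  # partial: fine for a user interface
for_debug = now.isoformat()          # complete: date, time and UTC offset

print(for_display)
print(for_debug)
```

The debug string can be parsed back into exactly the same instant, which is not true of the display string.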

Although a given moment in time is immutable, there is an arbitrary number of ways to express it. And we’ve not even talked about the Julian or Indian calendars or stuff like expressing durations!

Let me summarize a few key points to bring home so far:

Serialization

Speaking of software, serialization is a process where you take an entity and spell it out in such a way that it can be later entirely rebuilt, exactly like the original, by using the spelt out (serialized) information.

In the binary world of computers, time is usually serialized and stored by using the Unix time convention. As I’m writing this, my Unix time is 1366191727 UTC. That is: 1366191727 seconds have passed since January 1st, 1970 at 00:00 UTC. Isn’t that a pretty clever, consistent and compact way of representing a plethora of information, such as April 17 2013 @ 11:42:07am CEST?
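The conversion can be checked with a few lines of Python. To keep the sketch independent of the system’s time zone database, CEST is modeled here as a fixed UTC+2 offset, which is an assumption made for this example only.

```python
from datetime import datetime, timedelta, timezone

unix_time = 1366191727

# Assumption: CEST modeled as a fixed +02:00 offset for this example.
CEST = timezone(timedelta(hours=2), "CEST")

as_utc = datetime.fromtimestamp(unix_time, tz=timezone.utc)
as_cest = as_utc.astimezone(CEST)

print(as_utc.isoformat())   # 2013-04-17T09:42:07+00:00
print(as_cest.isoformat())  # 2013-04-17T11:42:07+02:00

# Serializing back yields the original number, whatever the zone.
round_trip = int(as_cest.timestamp())
```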

Unix time is just another arbitrary representation of a given moment in time, although not a very human-readable one. But you can take that number, write it on a piece of paper, stick it to a carrier pigeon, and your recipient would be able to decipher your vital message by simply turning to the Internet and visiting a site such as unixtimestamp.com.

Just as you can write that number on a piece of paper and later bring the full instant back to life, you can store it in a file or in a row of your favorite RDBMS. That said, you might want to talk to your RDBMS through a proper driver and hand it a plain date instance; the driver will then take care of the conversion to the database’s underlying serialization format for native time instances.

By storing time in a native format, you get the nice time formatting, sorting and querying features of your RDBMS for free, so you might want to think twice before storing plain Unix timestamps.

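Unix time is defined as seconds elapsed since the epoch in UTC, so just make sure the timestamps you store were really computed that way and not from a local wall-clock time, or you might get confused later at deserialization time.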

ISO 8601 is also a serialization favorite. In fact, it is used in the XML Schema standard. Most XML frameworks can natively convert xs:date, xs:time and xs:dateTime to your programming language’s native format and vice versa. Just be careful when dealing with partial representations: for instance, if you omit the time zone, make sure you agree beforehand on a default one with your communicating party (usually UTC, or your local time zone if you’re both in the same one).
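Here is one way the “agreed default” could look at deserialization time. This is a sketch, not a prescription: the helper name `parse_with_default` is hypothetical, and UTC as the agreed default is an assumption.

```python
from datetime import datetime, timezone

# Assumption: both parties agreed that a missing zone means UTC.
DEFAULT_ZONE = timezone.utc

def parse_with_default(text):
    """Parse an ISO 8601 string, applying the agreed zone if none is given."""
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=DEFAULT_ZONE)
    return parsed

without_zone = parse_with_default("2013-04-17T09:42:07")
with_zone = parse_with_default("2013-04-17T11:42:07+02:00")

print(without_zone.isoformat())   # 2013-04-17T09:42:07+00:00
print(without_zone == with_zone)  # True
```

Making the default explicit in one place means nobody has to guess what a zone-less string was supposed to mean.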

Management

First of all, if you think you can write your own time management library, or even a little routine that adds or subtracts arbitrary values from the time of day, please allow me to show you the source code of the java.util.Date and java.util.GregorianCalendar classes from JDK 7, weighing in at 1331 and 3179 lines of code respectively.

Okay, these are probably not the best examples of software routines that deal with time, I agree. That’s why Java libraries like Joda Time were written (so you don’t have to)! In fact, Joda Time became so popular that it inspired JSR-310, which is now part of JDK 8 as the java.time package.

Use of popular, well designed and implemented time frameworks will save your life. Seriously. Take your time to get familiar with the API of your choosing. If you are into Scala, I can recommend nscala-time. And if you are into other programming languages, honestly, I have no idea, but ask your local guru: she can help.
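To give a feel for the edge cases a library handles for you, here is a small sketch using Python’s standard datetime module (chosen only for illustration; the same idea applies to Joda Time or nscala-time).

```python
from datetime import date, timedelta

one_day = timedelta(days=1)

# The day after February 28 depends on whether the year is a leap year.
leap_year_next = date(2012, 2, 28) + one_day
common_year_next = date(2013, 2, 28) + one_day

# "Thirty days later" is not the same as "same day next month".
thirty_days_later = date(2013, 1, 31) + timedelta(days=30)

print(leap_year_next)     # 2012-02-29
print(common_year_next)   # 2013-03-01
print(thirty_days_later)  # 2013-03-02
```

A hand-written routine has to get every one of these cases right; the library already does.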

Further Resources

Here are a few useful links I’ve accumulated over time:

OS X Launchpad Clean Up

I quite like OS X Launchpad. The only problem is it gets quite messy after a while. The way I fix it is by spraying napalm over its cache and restarting the dock:

Clean up Launchpad
$ rm ~/Library/Application\ Support/Dock/*.db && killall -KILL Dock

I use this command so often that I have an alias for it:

Clean up Launchpad shell alias
$ alias culp='rm ~/Library/Application\ Support/Dock/*.db && killall -KILL Dock'

The alias command should work both in bash and zsh. My mnemonic for culp is Clean Up Launch Pad.

SBT Company Wide Settings Example

I know it looks like I haven’t been posting much lately. Anyway, I’m studying Scala programming and, among other things, I came upon the awesome Simple Build Tool (SBT for short).

SBT looks like a mighty and untamable beast at first glance. But, after a closer look and a patient walkthrough of the online docs and the practical examples, I must say it is an awesome tool, well worth the learning effort.

One of the first issues I had to face in SBT was implementing what I was previously doing with a Maven Corporate POM. In other words, I needed a mechanism to control the master build settings for all of a company’s artifacts.

I ended up publishing an SBT Company Wide Settings Example project on GitHub. There is an extensive readme that explains how I am doing it.

As usual, feedback is welcome.

Hello Scala World!

I’ve just published a Hello Scala World! project on GitHub that you can use to quickly set up a Scala hacking environment. This is particularly useful if you run IntelliJ IDEA with the Scala plugin, since simply opening the pom as a project will get you all the comforts you would expect from a modern IDE.

You also get support for running specs2 specifications.

Video Tutorial

Since last September I have been part of an internal group, at the company I work for, that deals with software architecture. One of our tasks is to identify tools and technologies on which to standardize the software production process, and to help spread them.

One of the ways we spread innovation is through our video tutorials. Now, we may be neither the Oliver Stones of video tutorials nor the Alan Turings of computer science but, as far as we know, no other company in Italy publishes free videos explaining technologies and tools that are actually used in the daily work of producing professional software. Personally, I think the initiative deserves, if nothing else, more publicity on social media.

If you watch one of our videos and find it useful, tell your friends and colleagues about it, by word of mouth, on social networks, and so on.

For convenience, here is a list of the video tutorials published so far:

Enjoy.

Clustering Issues?

The number one culprit for a non-working cluster is a misconfigured /etc/hosts file. This is because of how some software implementations announce their availability on the network. Typically, if you have a resolver configured so that the domain name of your local node points to the loopback address, you have a problem.

Some Linux distributions (Ubuntu, I’m looking at you!) ship with an /etc/hosts that looks like this (supposing the hostname is cat and the domain name is foo.bar):

/etc/hosts
127.0.0.1    localhost
127.0.1.1    cat.foo.bar cat

For some reason, a few software implementations use the resolver to infer the address of the node they’re running on, then start sending out messages such as “Hey, I’m available. You can find me at 127.0.1.1”. That is the problem, because you want the node to advertise its availability with a real IP address (usually the one that is bound to the main network interface).

The solution is to get rid of the 127.0.1.1 line. You’re welcome.
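If you want to check whether a node is affected, the following Python sketch resolves the local hostname and tells you whether the result is a loopback address. It assumes the clustering software does the equivalent of resolving its own hostname, which may not match every implementation; the function names are hypothetical.

```python
import ipaddress
import socket

def advertised_address():
    """Resolve the local hostname the way a naive node might.

    Returns None if the hostname cannot be resolved at all.
    """
    try:
        return socket.gethostbyname(socket.gethostname())
    except OSError:
        return None

def is_loopback(address):
    """True for any address in 127.0.0.0/8, including 127.0.1.1."""
    return ipaddress.ip_address(address).is_loopback

address = advertised_address()
if address is None:
    print("hostname does not resolve")
elif is_loopback(address):
    print(f"problem: hostname resolves to loopback address {address}")
else:
    print(f"ok: hostname resolves to {address}")
```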

Spring Crypto Utils 1.3.0

I’ve released a new version of Spring Crypto Utils. The project now has a gorgeous, shiny, sleek, new website on the springcryptoutils.com domain. The website is made possible thanks to the kind folks at GitHub and their GitHub Pages hosting facility.

Please read the notes in the changelog if you’re upgrading.

I must say moving to GitHub was a smart move, since contributing to the project is now much easier thanks to git pull requests. I would like to thank Chad Johnston and Martin Bosak for their contributions: respectively, the certificate element and the provider attribute.

I hope users will find the documentation easier to read. Here is an example regarding digital signatures with runtime selection of multiple keys.

Carl Sagan, Skepticism and Software Engineering

Last night I was reading this article, which quotes Carl Sagan, probably the best science communicator who ever lived on planet Earth.

Sagan says, in words far more effective than mine, that there is a delicate balance between skepticism and open-mindedness when approaching hypotheses and ideas. Reading his reasoning, I realized that this matter of the right balance applies to many fields of human knowledge, including Software Engineering.