Saturday, September 29, 2012

Unix timestamp vs Java timestamp

A recent incident involving me and a colleague of mine prompted this little post. We were setting a property in a Hive script and trying to retrieve some data by comparing against the value we set. In other words, this is what we did.
  1. We set a property named max_ts in the Hive context from a Java class, using the Timestamp.getTime() method to get the time.
  2. In the hive script, we were doing something like:
    1. SELECT my_columns FROM my_table WHERE unix_timestamp(a_string) > ${hiveconf:max_ts}
 But this did not seem to be working. Even when we had records with a timestamp greater than max_ts, the query was not returning them. Then we added some debug logs and realized what was going wrong.
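The mismatch is easy to reproduce in plain Java (a minimal sketch; the variable names are illustrative, not the ones from our actual script):

```java
import java.sql.Timestamp;

public class MismatchDemo {
    public static void main(String[] args) {
        // What we handed to hiveconf: Timestamp.getTime(), i.e. MILLISECONDS
        Timestamp now = new Timestamp(System.currentTimeMillis());
        long maxTs = now.getTime();

        // What unix_timestamp() produces on the Hive side: SECONDS.
        // Even a record stamped at the very same instant yields a value
        // 1000 times smaller.
        long recordTs = maxTs / 1000L;

        System.out.println(recordTs > maxTs); // prints "false" -- seconds can never exceed milliseconds here
    }
}
```

So every record failed the `> max_ts` filter, no matter how recent it was.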

A Unix timestamp gives the number of seconds that have passed since the epoch.
A Java timestamp gives the number of milliseconds since the epoch.

Although both use the same epoch (midnight of January 1st, 1970, UTC), they provide two different values for the same instant: the Java value is 1000 times larger. That's why the query was not behaving as intended. Although I had read about this difference before, it took an incident like this for it to really sink in.
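The fix is simply to convert milliseconds to seconds before setting the property. A small sketch of that conversion (the surrounding Hive plumbing is omitted; max_ts is the property name from the script above):

```java
import java.sql.Timestamp;
import java.util.concurrent.TimeUnit;

public class EpochFix {
    public static void main(String[] args) {
        Timestamp ts = new Timestamp(System.currentTimeMillis());

        // Timestamp.getTime() is milliseconds since the epoch; divide by
        // 1000 (via TimeUnit, to make the intent obvious) to get the
        // Unix-style seconds that Hive's unix_timestamp() produces.
        long maxTsSeconds = TimeUnit.MILLISECONDS.toSeconds(ts.getTime());

        // This is the value to set as max_ts in the Hive context.
        System.out.println(maxTsSeconds);
    }
}
```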
  