## Paul Johnson ## May 24, 2016 ## About subtracting dates and times ## Importing date time values from Excel. ## The excel file to accompany this can be found here ## http://pj.freefaculty.org/scraps/date-20160524.xlsx ## along with this file, which is named ## http://pj.freefaculty.org/scraps/date_time_from_excel.R library(openxlsx) ## I have read these manuals many times and did not understand ## until I worked on the openxlsx example here. I don't know if ## that means openxlsx helped or just that I tried often enough ## to understand. ## Dates and Times ## If data is only dates, but not times are associated, then an R ## "Date" object is called for. ## Concentrate on the Date problem first. ## Today's date z <- Sys.Date() z ## What will the date be 6 days from now? z + 6 ## What will date be 112 days from now z + 112 z2 <- z + 112 ## Subtracting dates is meaningful z2 - z ## If the data includes only dates, no times, then ## it is not actually meaningful to look for difference in minutes. ## But you can if you want: difftime(z2, z, units = "mins") difftime(z2, z, units = "secs") ## See ?difftime ## Now, about importing Date objects from Excel. ## I made a test example Excel sheet that has a couple ## of date columns, and then one string column that ## looks like a date, then 2 that have the Excel time ## in them. The openxlsx import can try to convert dates ## for us, and I have that first as example here, but ## I did not truly understand what was going on until ## I ran the second example, where I did not let openxlsx ## convert date/time things automatically. ## As you test this, remember to run str() frequently to ## see what the data has. ## About data in put in Excel. When setting Format Cell in Excel, I ## notice that there is one option for "date" and one for "time". ## Appears to me the time format is just for for hours:min:sec and Excel has ## date formats that might be only days (04/01/2016) but can also times ## have times added in (04/01/2016 2:00 PM) or (04/01/2016 14:00) ## So the Excel date element is possibly more difficult to be sure ## about. ## If the R import process thinks the input is just at date, ## with no time information, it will try to use the Date class. ## However, if it might have times associated, then a more complicated ## data storage is needed. ## If there are times as well, then a more complicated data ## Storage is needed. There are 2 options, POSIXct and POSIXlt. ## POSIXct: seconds since 1970 (UTC time zone) as number. ## POSIXlt: a named list of vectors, ## 'sec" ## 'min" ## 'hour' ## "mday" (day of the month 1-31) ## 'year' ## 'wday" 0-6, day sunday - sat ## 'yday' day of the year ## 'isdst' daylight time saving flag ## ## ‘"POSIXct"’ is more convenient for including in data frames, and ## ‘"POSIXlt"’ is closer to human-readable forms. ## ## One final detail. If date is saved as a character string, as something ## like "2016-05-22" or "05/22/2016" there is a way to convert that to ## an R Date object. Since column x3 in the Excel sheet is a string, ## that is demonstrated below. Look for "strptime" dat1 <- read.xlsx("date-20160524.xlsx", detectDates = TRUE) summary(dat1) ## Excel sheet already has columns 1 and 2 dates. ## Days difference betwen col1 and col2 dat1$x1 - dat1$x2 difftime(dat1$x1, dat1$x2, units = "mins") ## Note x3 is just a text variable. This converts it ## a structure POSIXlt dat1$x3.2 <- strptime(dat1$x3, format = "%Y/%m/%d") ## Note because there were no minutes or hours, the hour:min:sec info is all 0'2 ## The x3.2 is a more powerful structure, it is POSIXlt, which ## means it can store hours:minutes:seconds as well. ## Try this, you will get an error because x1 has no hour ## information in x1 ## dat1$x1$hour ## On the other hand, has more flavor. Because it is POSIXlt, ## it has "hour" "min" "sec" and the others. Unfortunately in this ## example, they are all 0 because the input time was only dates, ## dat1$x3.2$hour dat1$x3.2$min ## Note here we use R's "as.Date" function to convert that bigger ## data structure (with useless "hour" and "minute" elements to a ## date: dat1$x3.3 <- as.Date(dat1$x3.2) ## See? Its class becomes Date str(dat1) ## I was going through the experiment there, I think I ## understand, lets make sure. Import without detecting dates. dat2 <- read.xlsx("date-20160524.xlsx", detectDates = FALSE) summary(dat2) ## The columns that were dates in Excel are now just numbers. ## We can recover the dates like this: dat2$x1time <- convertToDate(dat2$x1) dat2$x2time <- convertToDate(dat2$x2) dat2$x4date <- convertToDate(dat2$x4) dat2$x4datetime <- convertToDateTime(dat2$x4) dat2$x5date <- convertToDate(dat2$x5) dat2$x5datetime <- convertToDateTime(dat2$x5) dat2$x1time - dat2$x2time dat2$x5datetime2 <- strptime(dat2$x5datetime, format = "%Y-%m-%d %H:%M:%OS") ## Previous gives a POSIXlt object str(dat2) dat2$x5datetime2$hour dat2$x5datetime2$min dat2$x5datetime2$sec dat2$x5datetime2 - dat2$x4datetime ## I hope this cleared up some mysteries and did not create (m)any new ## ones. ## Confused enough. Now am reading ## ?POSIXlt ## ?Date ## ?Ops.Date