datetime library

  • Time is linear
  • It progresses in a straight line from the Big Bang to now and into the future
  • Official datetime library documentation: https://docs.python.org/3.5/library/datetime.html
  • Reasoning about time is important in data analysis

  • Analyzing financial timeseries data
  • Looking at commuter transit passenger flows by time of day
  • Understanding web traffic by time of day
  • Examining seasonality in department store purchases
  • The datetime library

  • understands the relationship between different points of time
  • understands how to do operations on time
  • Example:

  • Which is greater? "10/24/2017" or "11/24/2016"
  • 
    
    In [1]:
    d1 = "10/24/2017"
    d2 = "11/24/2016"
    max(d1,d2)
    
    
    
    
    Out[1]:
    '11/24/2016'
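
    The result above follows from how Python compares strings: character by character, with no notion of months or years. A minimal sketch that locates the deciding character (the variable name first_difference is only illustrative):

```python
d1 = "10/24/2017"
d2 = "11/24/2016"

# Strings are compared character by character, left to right.
# Both start with "1"; the second characters are "0" and "1",
# and "0" < "1", so d2 is treated as the larger string.
first_difference = next(
    i for i, (a, b) in enumerate(zip(d1, d2)) if a != b
)
print(first_difference, d1[first_difference], d2[first_difference])  # 1 0 1
print(max(d1, d2))  # 11/24/2016
```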

  • How much time has passed?
  • 
    
    In [2]:
    d1 - d2
    
    
    
    
    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-2-8e72eafee703> in <module>()
    ----> 1 d1 - d2
    
    TypeError: unsupported operand type(s) for -: 'str' and 'str'

    Obviously that's not going to work.

    We can't do date operations on strings

    Let's see what happens with datetime

    
    
    In [1]:
    import datetime
    d1 = datetime.date(2016,11,24)
    d2 = datetime.date(2017,10,24)
    max(d1,d2)
    
    
    
    
    Out[1]:
    datetime.date(2017, 10, 24)
    
    
    
    In [4]:
    print(d2 - d1)
    
    
    
    
    334 days, 0:00:00
    

  • datetime objects understand time
  • The datetime library contains several useful types

  • date: stores the date (month,day,year)
  • time: stores the time (hours,minutes,seconds)
  • datetime: stores the date as well as the time (month,day,year,hours,minutes,seconds)
  • timedelta: duration between two datetime or date objects
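
    The four types listed above can be constructed directly. The sketch below uses arbitrary example values; note that the constructors take their arguments in year, month, day order.

```python
import datetime

d = datetime.date(2017, 8, 24)                   # year, month, day
t = datetime.time(18, 9, 48)                     # hour, minute, second
dt = datetime.datetime(2017, 8, 24, 18, 9, 48)   # date and time together
delta = datetime.timedelta(days=1, hours=2)      # a duration

print(d, t, dt, delta)

# A datetime can be split into its date and time parts, and rebuilt from them.
print(dt.date() == d, dt.time() == t)            # True True
print(datetime.datetime.combine(d, t) == dt)     # True
```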
  • datetime.date

    
    
    In [2]:
    import datetime
    century_start = datetime.date(2000,1,1)
    today = datetime.date.today()
    print(century_start,today)
    print("We are",today-century_start,"days into this century")
    print(type(century_start))
    print(type(today))
    
    
    
    
    2000-01-01 2017-08-25
    We are 6446 days, 0:00:00 days into this century
    <class 'datetime.date'>
    <class 'datetime.date'>
    

    For a cleaner output

    
    
    In [6]:
    print("We are",(today-century_start).days,"days into this century")
    
    
    
    
    We are 6446 days into this century
    

    datetime.datetime

    
    
    In [7]:
    century_start = datetime.datetime(2000,1,1,0,0,0)
    time_now = datetime.datetime.now()
    print(century_start,time_now)
    print("we are",time_now - century_start,"days, hour, minutes and seconds into this century")
    
    
    
    
    2000-01-01 00:00:00 2017-08-24 18:09:48.245052
    we are 6445 days, 18:09:48.245052 days, hour, minutes and seconds into this century
    

    datetime objects can check validity

  • A ValueError exception is raised if the arguments do not form a valid date or time (for example, February 29 in a non-leap year)
  • 
    
    In [8]:
    some_date=datetime.date(2015,2,29)
    #some_date =datetime.date(2016,2,29)
    #some_time=datetime.datetime(2015,2,28,23,60,0)
    
    
    
    
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-8-16e372a33db1> in <module>()
    ----> 1 some_date=datetime.date(2015,2,29)
          2 #some_date =datetime.date(2016,2,29)
          3 #some_time=datetime.datetime(2015,2,28,23,60,0)
    
    ValueError: day is out of range for month
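
    Because the constructor raises ValueError, it can double as a validity check. The helper below is a hypothetical sketch (is_valid_date is not part of the datetime library):

```python
import datetime

def is_valid_date(year, month, day):
    """Return True if year/month/day form a real calendar date."""
    try:
        datetime.date(year, month, day)
    except ValueError:
        return False
    return True

print(is_valid_date(2015, 2, 29))  # False: 2015 is not a leap year
print(is_valid_date(2016, 2, 29))  # True: 2016 is a leap year
```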

    datetime.timedelta

    Used to store the duration between two points in time

    
    
    In [3]:
    #Note: 2050-01-01 lies in the future, so every difference printed below is negative
    century_start = datetime.datetime(2050,1,1,0,0,0)
    time_now = datetime.datetime.now()
    time_since_century_start = time_now - century_start
    print("days since century start",time_since_century_start.days)
    print("seconds since century start",time_since_century_start.total_seconds())
    print("minutes since century start",time_since_century_start.total_seconds()/60)
    print("hours since century start",time_since_century_start.total_seconds()/60/60)
    
    
    
    
    days since century start -11817
    seconds since century start -1020947887.555188
    minutes since century start -17015798.1259198
    hours since century start -283596.63543199666
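
    A timedelta stores its value as days, seconds and microseconds, and only the days field may be negative. That is why the days value of a negative duration can look one larger in magnitude than total_seconds() suggests. A small sketch with fixed example dates:

```python
import datetime

start = datetime.datetime(2000, 1, 1)
end = datetime.datetime(2000, 1, 2, 6, 30)

forward = end - start     # positive duration: 1 day, 6:30:00
backward = start - end    # the same duration, negated

print(forward.days, forward.seconds, forward.total_seconds())
# 1 23400 109800.0
print(backward.days, backward.seconds, backward.total_seconds())
# -2 63000 -109800.0
```

    When a single signed number is needed, total_seconds() is usually the safer choice.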
    

    datetime.time

    
    
    In [10]:
    date_and_time_now = datetime.datetime.now()
    time_now = date_and_time_now.time()
    print(time_now)
    
    
    
    
    19:44:37.142884
    

    You can do arithmetic operations on datetime objects

  • You can use timedelta objects to calculate new dates or times from a given date
  • 
    
    In [11]:
    today=datetime.date.today()
    five_days_later=today+datetime.timedelta(days=5)
    print(five_days_later)
    
    
    
    
    2017-08-29
    
    
    
    In [12]:
    now=datetime.datetime.today()
    five_minutes_and_five_seconds_later = now + datetime.timedelta(minutes=5,seconds=5)
    print(five_minutes_and_five_seconds_later)
    
    
    
    
    2017-08-24 19:50:09.630242
    
    
    
    In [13]:
    now=datetime.datetime.today()
    five_minutes_and_five_seconds_earlier = now+datetime.timedelta(minutes=-5,seconds=-5)
    print(five_minutes_and_five_seconds_earlier)
    
    
    
    
    2017-08-24 21:39:23.763762
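
    Timedelta arithmetic also handles month boundaries automatically. A short sketch, using a fixed start date so the result is reproducible (the names start, one_week and schedule are illustrative):

```python
import datetime

start = datetime.date(2017, 8, 24)
one_week = datetime.timedelta(weeks=1)

# Multiplying a timedelta by an integer scales the duration,
# and adding it to a date rolls over into the next month as needed.
schedule = [start + i * one_week for i in range(3)]
print([str(day) for day in schedule])
# ['2017-08-24', '2017-08-31', '2017-09-07']
```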
    

  • But you can't add a timedelta to a time object. If you try, you'll get a TypeError exception
  • 
    
    In [14]:
    time_now=datetime.datetime.now().time() #Returns the time component (drops the day)
    print(time_now)
    thirty_seconds=datetime.timedelta(seconds=30)
    time_later=time_now+thirty_seconds
    #Bug or feature?
    
    
    
    
    22:02:21.552801
    
    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-14-2dea49afbdf4> in <module>()
          2 print(time_now)
          3 thirty_seconds=datetime.timedelta(seconds=30)
    ----> 4 time_later=time_now+thirty_seconds
          5 #Bug or feature?
    
    TypeError: unsupported operand type(s) for +: 'datetime.time' and 'datetime.timedelta'
    
    
    In [17]:
    #But this is Python
    #And we can always get around something by writing a new function!
    #Let's write a small function to get around this problem
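    def add_to_time(time_object,time_delta):
        #Attach the time to an arbitrary placeholder date so that timedelta arithmetic works
        #Note: microseconds are dropped, and results past midnight wrap around
        temp_datetime_object = datetime.datetime(500,1,1,time_object.hour,time_object.minute,time_object.second)
        print(temp_datetime_object) #Show the temporary datetime for illustration
        return (temp_datetime_object+time_delta).time()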
    
    
    
    In [18]:
    #And test it
    time_now=datetime.datetime.now().time()
    thirty_seconds=datetime.timedelta(seconds=30)
    print(time_now,add_to_time(time_now,thirty_seconds))
    
    
    
    
    0500-01-01 22:37:07
    22:37:07.239431 22:37:37
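
    One possible variation on the workaround above uses datetime.datetime.combine, which keeps the microseconds. This is a sketch, not the only way to do it; add_to_time_combine and the placeholder date are hypothetical choices.

```python
import datetime

def add_to_time_combine(time_object, time_delta):
    """Add a timedelta to a time by attaching it to a placeholder date."""
    placeholder = datetime.date(2000, 1, 1)  # arbitrary date, discarded below
    combined = datetime.datetime.combine(placeholder, time_object)
    return (combined + time_delta).time()

t = datetime.time(22, 37, 7, 239431)
print(add_to_time_combine(t, datetime.timedelta(seconds=30)))  # 22:37:37.239431
print(add_to_time_combine(t, datetime.timedelta(hours=2)))     # 00:37:07.239431
```

    The second call shows that a result past midnight silently wraps around, because the date part is thrown away.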
    

    datetime and strings

    More often than not, the program will need to get the date or time from a string:

  • From a website (bus/train timings)
  • From a file (date or datetime associated with a stock price)
  • From the user (from the input statement)

    Python needs to parse the string so that it correctly creates a date or time object

    datetime.strptime

  • datetime.strptime(): parses a string and creates a datetime object (call .date() or .time() on the result if you need only that part)
  • The programmer needs to tell the function what format the string is using
  • See http://pubs.opengroup.org/onlinepubs/009695399/functions/strptime.html for how to specify the format
  • 
    
    In [30]:
    date='01-Apr-03'
    date_object=datetime.datetime.strptime(date,'%d-%b-%y')
    print(date_object)
    
    
    
    
    2003-04-01 00:00:00
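
    With strptime, the two date strings from the opening example can be compared correctly. A minimal sketch, assuming the strings use month/day/four-digit-year order:

```python
import datetime

fmt = "%m/%d/%Y"  # month/day/four-digit year
d1 = datetime.datetime.strptime("10/24/2017", fmt).date()
d2 = datetime.datetime.strptime("11/24/2016", fmt).date()

print(max(d1, d2))     # 2017-10-24
print((d1 - d2).days)  # 334

# A string that does not match the format raises ValueError.
try:
    datetime.datetime.strptime("2017-10-24", fmt)
except ValueError as error:
    print("could not parse:", error)
```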
    
    
    
    In [26]:
    #Unfortunately, there is no strptime equivalent for timedelta
    #So we have to be creative!
    bus_travel_time='2:15:30'
    hours,minutes,seconds=bus_travel_time.split(':')
    x=datetime.timedelta(hours=int(hours),minutes=int(minutes),seconds=int(seconds))
    print(x)
    
    
    
    
    2:15:30
    
    
    
    In [27]:
    #Or write a function that will do this for a particular format
    def get_timedelta(time_string):
        #Expects a string of the form 'H:MM:SS'
        hours,minutes,seconds = time_string.split(':')
        return datetime.timedelta(hours=int(hours),minutes=int(minutes),seconds=int(seconds))
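
    A parsed duration becomes useful once it is added to a datetime. The sketch below assumes an 'H:MM:SS' travel time and a made-up departure time; parse_duration is a hypothetical helper equivalent to the function above.

```python
import datetime

def parse_duration(time_string):
    """Parse an 'H:MM:SS' string into a timedelta."""
    hours, minutes, seconds = (int(part) for part in time_string.split(":"))
    return datetime.timedelta(hours=hours, minutes=minutes, seconds=seconds)

departure = datetime.datetime(2017, 8, 24, 9, 0, 0)  # illustrative value
arrival = departure + parse_duration("2:15:30")
print(arrival)  # 2017-08-24 11:15:30
```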
    

    datetime.strftime

  • The strftime function is the inverse of strptime: it converts a datetime object to a string with the specified format
  • 
    
    In [28]:
    now = datetime.datetime.now()
    string_now = now.strftime('%m/%d/%y %H:%M:%S')
    print(now,string_now)
    print(str(now)) #Or you can use the default conversion
    
    
    
    
    2017-08-24 23:03:35.197581 08/24/17 23:03:35
    2017-08-24 23:03:35.197581
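
    Since strftime and strptime are inverses, a value formatted with one can be read back with the other, provided the same format string is used and it captures every field that matters. A small round-trip sketch with a fixed example value:

```python
import datetime

fmt = "%m/%d/%y %H:%M:%S"
moment = datetime.datetime(2017, 8, 24, 23, 3, 35)

text = moment.strftime(fmt)
print(text)                                              # 08/24/17 23:03:35
print(datetime.datetime.strptime(text, fmt) == moment)   # True
print(moment.isoformat())                                # 2017-08-24T23:03:35
```

    This format has no microsecond field, so a value taken from datetime.datetime.now() would lose its fractional seconds in the round trip.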
    
    
    