Basic settings and parameters

The BTgymEnv() class comes preconfigured for quick setup. Basically, one needs to provide at least the data file keyword argument to set it up.

BTgym relies on the Backtrader framework for the actual environment rendering. The environment can be customised either by setting a basic set of parameters inherited from the Backtrader computational engine, or by passing a complete engine subclass to the environment. This example covers the basic settings; the latter option gives complete control over the backtesting logic, making the environment as flexible as Backtrader itself.

Besides, there is another set of vital options related to the reinforcement learning setup: observation and action space parameters and episode settings.

One can eyeball the internal environment parameters by looking at the nested MyEnvironment.params dictionary, which consists of these subdictionaries:

  • params['dataset'],
  • params['engine'],
  • params['strategy'],
  • params['render'].

Look at the source files for exact parameter descriptions, since a complete doc reference is yet to come.

Here all parameters are left at their default values:

In [ ]:
from btgym import BTgymEnv

# Handy function:
def under_the_hood(env):
    """Shows environment internals."""
    for attr in ['dataset','strategy','engine','renderer','network_address']:
        print('\nEnv.{}: {}'.format(attr, getattr(env, attr)))

    for params_name, params_dict in env.params.items():
        print('\nParameters [{}]: '.format(params_name))
        for key, value in params_dict.items():
            print('{} : {}'.format(key,value))

In [ ]:
# Simplest trading environment,
# using a year-long dataset of one-minute bars for the EUR/USD currency pair:

MyEnvironment = BTgymEnv(filename='./data/DAT_ASCII_EURUSD_M1_2016.csv')

# Print environment configuration:
under_the_hood(MyEnvironment)

# Clean up:
MyEnvironment.close()

More control:

One can tweak the environment setup by passing a set of kwargs:

In [ ]:
from gym import spaces

MyEnvironment = BTgymEnv(filename='../examples/data/DAT_ASCII_EURUSD_M1_2016.csv',
                         
                     # Dataset and single random episode related parameters:
                         
                         # We start trading on Mondays, Tuesdays and Wednesdays:
                         start_weekdays=[0, 1, 2],
                         # Want total episode duration to be no more than 1 day 23h 55min:
                         episode_duration={'days': 1, 'hours': 23, 'minutes': 55},
                         # Want to start every episode at the beginning of the day:
                         start_00=True,
                         
                     # Broker and trade related:
                         
                         # Set initial capital:
                         start_cash=100,
                         # Set broker commission as 0.2% of operation value:
                         broker_commission=0.002,
                         # We use fixed stake of size 10:
                         fixed_stake=10,
                         # We want to stop the episode if 30% of initial capital is lost:
                         drawdown_call=30,
                         
                     # RL environment related parameters:
                         
                         # Set observation shape. By convention, the first dimension
                         # is the time embedding dimensionality;
                         # this basically means we get a sequence of the last 30
                         # [o,h,l,c] candles as our one-step environment observation:
                         
                         state_shape=dict(raw=spaces.Box(low=0,high=1,shape=(30,4))),
                                          
                         # BTgym uses a multi-modal observation space, which is basically a dictionary
                         # consisting of simple gym spaces (Box, Discrete, etc.).
                         # For the built-in `raw_state` setting, high and low are dummy values, because
                         # the environment will infer them from the statistics of the entire dataset.
                         
                     # Other parameters:
                         
                         # Network port to use; note that using multiple environments at once requires explicitly
                         # setting different ports to avoid conflicts. If your Jupyter kernel suddenly dies
                         # when running a new environment, that may be because of a port conflict,
                         # or because a 'previous' environment instance (client-side) is still running.
                         # Don't panic, just clean up and restart the kernel,
                         # or use env.close() to shut down all the services.
                         port=5555,
                         # Data-server port to use; the same as above applies:
                         #data_port=4600,
                         # Be chatty: setting this to 1 makes the environment report what's going on;
                         # 2 is for debugging and dumps out a lot of data:
                         verbose=1,)

# Eyeball configuration:
under_the_hood(MyEnvironment)

# Clean up:
MyEnvironment.close()
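
To get a feel for what the episode and broker settings above imply, here is a minimal arithmetic sketch. It assumes one bar per minute (as in the M1 dataset) and reads drawdown_call as a percentage of start_cash, as the comments above describe. The names bar_length, max_episode_steps and stop_value are illustrative only and are not BTgym parameters; the actual number of steps BTgym produces per episode may be lower.

```python
from datetime import timedelta

# Values copied from the configuration above.
episode_duration = {'days': 1, 'hours': 23, 'minutes': 55}
start_cash = 100
drawdown_call = 30  # percent of initial capital

# Assumption: one bar per minute, as in the M1 dataset used here.
bar_length = timedelta(minutes=1)

# Rough upper bound on the number of bars (steps) in a single episode.
max_episode_steps = timedelta(**episode_duration) // bar_length

# Portfolio value below which the episode would be stopped.
stop_value = start_cash - start_cash * drawdown_call / 100

print('Max steps per episode: {}'.format(max_episode_steps))
print('Episode stops below portfolio value: {}'.format(stop_value))
```

With these settings an episode spans at most 2875 one-minute bars, and it would be cut short once the portfolio value falls below 70.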

Registering environment:

The OpenAI way of making an environment is to register it with a specific set of parameters under a unique name and instantiate it by calling the make() method. This helps with standardization and correct evaluation of results uploaded to the Gym board.

That's how you do it (same parameters as above, apart from the ports):

In [ ]:
import gym
from gym import spaces

# Set single dictionary of parameters:

env_params = dict(filename='../examples/data/DAT_ASCII_EURUSD_M1_2016.csv',
                  start_weekdays=[0, 1, 2],
                  episode_duration={'days': 1, 'hours': 23, 'minutes': 55},
                  start_00=True,
                  start_cash=100,
                  broker_commission=0.002,
                  fixed_stake=10,
                  drawdown_call=30,
                  state_shape=dict(raw=spaces.Box(low=0,high=1,shape=(30,4))),
                  port=5002,
                  data_port=4800,
                  verbose=1,)

# Register with a unique name (watch out for OpenAI naming conventions):

gym.envs.register(id='backtrader-v46',
                  entry_point='btgym:BTgymEnv',
                  kwargs=env_params)

# Make environment:
                  
MyEnvironment = gym.make('backtrader-v46')

# Clean up
MyEnvironment.close()
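
The naming convention mentioned above boils down to an id of the form '<name>-v<version>'. The sketch below checks an id against a simplified pattern before registering. The regular expression is an assumption that approximates the convention rather than the exact expression used inside Gym, and parse_env_id is a hypothetical helper, not part of Gym or BTgym.

```python
import re

# Assumed, simplified form of the Gym id convention: '<name>-v<version>'.
ENV_ID_PATTERN = re.compile(r'^([\w:.-]+)-v(\d+)$')


def parse_env_id(env_id):
    """Return (name, version) if env_id looks valid, otherwise None."""
    match = ENV_ID_PATTERN.match(env_id)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


print(parse_env_id('backtrader-v46'))
print(parse_env_id('backtrader'))
```

An id without the '-v<number>' suffix, such as 'backtrader', does not fit this pattern, so the helper returns None for it.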

Running agent:

Just to give a sense of the environment's operation flow, our agent will be a mindless random picker; it performs no actual training. Run it for several episodes to see how fast all the money gets lost.

  • we'll plot state observations at every 500th and the final step, plus the episode summary and rendering;
  • set verbose=0 to turn off excessive messaging.

In [ ]:
import itertools
import random

# Will need those
# to display rendered images inline:
import IPython.display as Display
import PIL.Image as Image


# Some utility functions:

def to_string(dictionary):
    """Convert dictionary to block of text."""
    text = ''
    for k, v in dictionary.items():
        if isinstance(v, float):
            v = '{:.4f}'.format(v)
        text += '{}: {}\n'.format(k, v)
    return text

def show_rendered_image(rgb_array):
    """
    Convert numpy array to RGB image using Pillow and
    show it inline using IPykernel.
    This method doesn't require matplotlib to be loaded.
    """
    Display.display(Image.fromarray(rgb_array))

# Number of episodes to run:
num_episodes = 2

# Render state every N steps:
state_render = 500

Pay attention to the log output: when called for the first time, env.reset() will start the server and request an episode; the server then samples episode data, checks it for consistency, starts backtesting and returns the initial state observation.


In [ ]:
# Run it:
for episode in range(num_episodes):
    
    # Calling reset() before every episode.

    init_state = MyEnvironment.reset()
    
    print('\nEPISODE [{}]:'.format(episode + 1))
    
    # Render and show first step:
    show_rendered_image(MyEnvironment.render('human'))
    
    # Repeat until episode end:
    for _ in itertools.count(): 
        
        # Choose random action:
        rnd_action = MyEnvironment.action_space.sample()
        
        # Make a step in the environment:
        obs, reward, done, info = MyEnvironment.step(rnd_action)
        
        # Show state every 500th step
        # and when episode is finished:
        if info[-1]['step'] % state_render == 0 or done:
            show_rendered_image(MyEnvironment.render('human'))
                
        if done:
            break
            
    # Print episode statistics (quite modest for now, since we didn't add any observers etc.)
    print('SUMMARY:\n{}\nINFO [last observation]:\n{}'.
        format(to_string(MyEnvironment.get_stat()), to_string(info[-1])))
    # Render and show episode statistics:
    print('BACKTRADER SUMMARY PLOT:')
    show_rendered_image(MyEnvironment.render('episode'))

# Clean up:
MyEnvironment.close()
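
The control flow of the loop above can be tried out without a data file or a running server. The sketch below uses ToyEnv, a hypothetical stand-in that only mimics the reset()/step() interface and the list-of-dicts info structure; it is not part of BTgym, and its four-action set is an arbitrary placeholder. Instead of rendering images, run_episode records the step numbers at which the loop would render.

```python
import random


class ToyEnv:
    """Hypothetical stand-in for a Gym-style environment (not part of BTgym)."""

    def __init__(self, episode_length=1200):
        self.episode_length = episode_length
        self.step_count = 0

    def reset(self):
        """Start a new episode and return a dummy initial observation."""
        self.step_count = 0
        return 0.0

    def step(self, action):
        """Advance one step; the action is ignored in this toy version."""
        self.step_count += 1
        obs = random.random()
        reward = 0.0
        done = self.step_count >= self.episode_length
        # Mimic the list-of-dicts `info` used in the loop above.
        info = [{'step': self.step_count}]
        return obs, reward, done, info


def run_episode(env, render_every=500):
    """Run one episode with random actions; return the steps at which we would render."""
    env.reset()
    rendered_at = []
    while True:
        # Placeholder action set, chosen arbitrarily for this sketch.
        action = random.choice([0, 1, 2, 3])
        obs, reward, done, info = env.step(action)
        step = info[-1]['step']
        # Same condition as above: every N-th step and the final step.
        if step % render_every == 0 or done:
            rendered_at.append(step)
        if done:
            break
    return rendered_at


print(run_episode(ToyEnv(episode_length=1200)))
```

For a 1200-step episode this reports steps 500, 1000 and 1200, matching the "every 500th and final step" rule.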
