BTgymEnv()
class comes preconfigured for a quick start. Basically, one needs to provide at least the data file keyword argument to set it up.
BTgym relies on the Backtrader framework for actual environment rendering. Environment customisation can be done either by setting the basic set of parameters inherited from the Backtrader computational engine, or by passing a complete engine subclass to the environment. This example covers the basic setting, while the latter option gives complete control over backtesting logic and makes the environment as flexible as Backtrader itself.
Besides, there is another set of vital options related to the reinforcement learning setting: observation and action space parameters and episode setup.
One can eyeball internal environment parameters by looking at the nested MyEnvironment.params
dictionary, which consists of these subdictionaries:
params['dataset']
, params['engine']
, params['strategy']
, params['render']
.
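As a rough illustration of how such nested parameter dictionaries behave (this is a toy sketch, not BTgym's actual implementation — the `apply_kwargs` helper and the default values are made up for the example), constructor keyword arguments can be routed into whichever subdictionary declares them:

```python
# Toy sketch of kwargs-to-nested-params routing; subdictionary names
# mirror env.params, but defaults and merging logic are illustrative only.
defaults = {
    'dataset': {'filename': None, 'start_weekdays': [0, 1, 2, 3, 4]},
    'strategy': {'start_cash': 10.0, 'drawdown_call': 10},
    'render': {'render_modes': ['human']},
}

def apply_kwargs(params, **kwargs):
    """Copy params, then drop every kwarg into the subdict that declares it."""
    result = {name: dict(sub) for name, sub in params.items()}
    for key, value in kwargs.items():
        for sub in result.values():
            if key in sub:
                sub[key] = value
                break
        else:
            raise KeyError('Unknown parameter: {}'.format(key))
    return result

params = apply_kwargs(defaults, filename='data.csv', start_cash=100)
print(params['dataset']['filename'])     # data.csv
print(params['strategy']['start_cash'])  # 100
```

This mirrors what you will see below: every keyword passed to BTgymEnv ends up under one of the params subdictionaries.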
In [ ]:
from btgym import BTgymEnv
# Handy function:
def under_the_hood(env):
    """Shows environment internals."""
    for attr in ['dataset', 'strategy', 'engine', 'renderer', 'network_address']:
        print('\nEnv.{}: {}'.format(attr, getattr(env, attr)))

    for params_name, params_dict in env.params.items():
        print('\nParameters [{}]: '.format(params_name))
        for key, value in params_dict.items():
            print('{} : {}'.format(key, value))
In [ ]:
# Simplest trading environment,
# using a year-long dataset of one-minute bars for the EUR/USD currency pair:
MyEnvironment = BTgymEnv(filename='./data/DAT_ASCII_EURUSD_M1_2016.csv')
# Print environment configuration:
under_the_hood(MyEnvironment)
# Clean up:
MyEnvironment.close()
In [ ]:
from gym import spaces
MyEnvironment = BTgymEnv(
    filename='../examples/data/DAT_ASCII_EURUSD_M1_2016.csv',

    # Dataset and single random episode related parameters:

    # We start trading on Mondays, Tuesdays and Wednesdays:
    start_weekdays=[0, 1, 2],

    # Want total episode duration to be no more than 1 day 23h 55min:
    episode_duration={'days': 1, 'hours': 23, 'minutes': 55},

    # Want to start every episode at the beginning of the day:
    start_00=True,

    # Broker and trade related:

    # Set initial capital:
    start_cash=100,

    # Set broker commission as 0.2% of operation value:
    broker_commission=0.002,

    # We use a fixed stake of size 10:
    fixed_stake=10,

    # We want to stop the episode if 30% of initial capital is lost
    # (i.e. equity drops below 70):
    drawdown_call=30,

    # RL environment related parameters:

    # Set observation shape. By convention, the first dimension
    # is the time embedding dimensionality;
    # that basically means we get a sequence of the 30 last
    # [o, h, l, c] candles as our one-step environment observation:
    state_shape=dict(raw=spaces.Box(low=0, high=1, shape=(30, 4))),

    # BTgym uses a multi-modal observation space, which is basically a dictionary
    # of simple gym spaces (Box, Discrete, etc.).
    # For the built-in `raw_state`, setting high and low is a dummy, because
    # the environment will infer the values from the entire dataset statistics.

    # Other parameters:

    # Network port to use; note that running multiple environments at once requires
    # explicitly setting different ports to avoid messing them up. If your jupyter
    # kernel suddenly dies when running a new environment, that may be because of
    # a port conflict, or a 'previous' environment instance (client-side) still running.
    # Don't panic, just clean up and restart the kernel,
    # or use env.close() to shut down all the services.
    port=5555,

    # Data-server port to use; same as above applies:
    # data_port=4600,

    # Be chatty: setting this to 1 makes the environment report what's going on;
    # 2 is for debugging and dumps out a lot of data:
    verbose=1,
)
# Eyeball configuration:
under_the_hood(MyEnvironment)
# Clean up:
MyEnvironment.close()
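To make the (30, 4) state shape concrete, here is a small numpy sketch of the time-embedding convention described above (an illustration only, not BTgym's internals; the price series is synthetic): the observation at step t is simply the last 30 [o, h, l, c] rows of the data.

```python
import numpy as np

# Fake OHLC series: 100 bars, 4 columns [open, high, low, close].
# In BTgym these would come from the CSV dataset; here they're synthetic.
rng = np.random.default_rng(0)
close = 1.10 + 0.001 * rng.standard_normal(100).cumsum()
ohlc = np.stack([close, close + 0.0005, close - 0.0005, close], axis=1)

time_dim = 30  # time-embedding dimensionality

def observe(series, step):
    """Return the last `time_dim` bars up to and including `step`."""
    return series[step - time_dim + 1: step + 1]

state = observe(ohlc, step=50)
print(state.shape)  # (30, 4)
```

Each environment step advances the window by one bar, so consecutive observations overlap in 29 rows.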
The OpenAI way of making an environment is to register it with a specific set of parameters under some unique name and instantiate it by calling the make()
method. This helps with standardization and correct evaluation of results uploaded to the Gym board.
In [ ]:
import gym
from gym import spaces
# Set single dictionary of parameters:
env_params = dict(
    filename='../examples/data/DAT_ASCII_EURUSD_M1_2016.csv',
    start_weekdays=[0, 1, 2],
    episode_duration={'days': 1, 'hours': 23, 'minutes': 55},
    start_00=True,
    start_cash=100,
    broker_commission=0.002,
    fixed_stake=10,
    drawdown_call=30,
    state_shape=dict(raw=spaces.Box(low=0, high=1, shape=(30, 4))),
    port=5002,
    data_port=4800,
    verbose=1,
)

# Register with a unique name (watch out for OpenAI naming conventions):
gym.envs.register(
    id='backtrader-v46',
    entry_point='btgym:BTgymEnv',
    kwargs=env_params,
)
# Make environment:
MyEnvironment = gym.make('backtrader-v46')
# Clean up
MyEnvironment.close()
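As a back-of-the-envelope check on the broker-related numbers above (a sketch, not Backtrader's actual accounting): with broker_commission=0.002 and fixed_stake=10, one operation at a price around 1.10 costs about 10 * 1.10 * 0.002 ≈ 0.022 in commission, and with start_cash=100 and drawdown_call=30 an episode stops once equity falls below 70. The `commission` and `drawdown_stop` helpers below are made up for illustration:

```python
start_cash = 100.0
broker_commission = 0.002   # 0.2% of operation value
fixed_stake = 10
drawdown_call = 30          # stop if 30% of initial capital is lost

def commission(price):
    """Commission charged for one fixed-stake operation at the given price."""
    return fixed_stake * price * broker_commission

def drawdown_stop(equity):
    """True once equity has lost `drawdown_call` percent of start_cash."""
    return equity <= start_cash * (1 - drawdown_call / 100.0)

print(round(commission(1.10), 4))  # 0.022
print(drawdown_stop(75.0))         # False
print(drawdown_stop(69.5))         # True
```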
Just to give a sense of the environment operation flow, our agent will be a mindless random picker; it performs no actual training. Run it for several episodes to see how fast all the money gets lost.
In [ ]:
import itertools
import random
# Will need those
# to display rendered images inline:
import IPython.display as Display
import PIL.Image as Image
# Some utility functions:
def to_string(dictionary):
    """Convert dictionary to a block of text."""
    text = ''
    for k, v in dictionary.items():
        if type(v) in [float]:
            v = '{:.4f}'.format(v)
        text += '{}: {}\n'.format(k, v)
    return text

def show_rendered_image(rgb_array):
    """
    Convert numpy array to RGB image using PIL and
    show it inline using IPykernel.
    This method doesn't require matplotlib to be loaded.
    """
    Display.display(Image.fromarray(rgb_array))
# Number of episodes to run:
num_episodes = 2

# Render state every:
state_render = 500
Pay attention to the log output: when called for the first time, env.reset()
will start the server and ask for an episode; the server then samples episode data, checks it for consistency, starts backtesting and returns the initial state observation.
In [ ]:
# Run it:
for episode in range(num_episodes):
    # Call reset() before every episode:
    init_state = MyEnvironment.reset()
    print('\nEPISODE [{}]:'.format(episode + 1))

    # Render and show first step:
    show_rendered_image(MyEnvironment.render('human'))

    # Repeat until episode end:
    for _ in itertools.count():
        # Choose random action:
        rnd_action = MyEnvironment.action_space.sample()

        # Make a step in the environment:
        obs, reward, done, info = MyEnvironment.step(rnd_action)

        # Show state every 500th step
        # and when the episode is finished:
        if info[-1]['step'] % state_render == 0 or done:
            show_rendered_image(MyEnvironment.render('human'))

        if done:
            break

    # Print episode statistics (quite modest for now, since we didn't add any observers etc.):
    print('SUMMARY:\n{}\nINFO [last observation]:\n{}'.
          format(to_string(MyEnvironment.get_stat()), to_string(info[-1])))

    # Render and show episode statistics:
    print('BACKTRADER SUMMARY PLOT:')
    show_rendered_image(MyEnvironment.render('episode'))

# Clean up:
MyEnvironment.close()
In [ ]:
MyEnvironment.close()