Cerebral Cortex contains a library of algorithms that are useful for processing data and converting it into features or biomarkers. This page demonstrates a simple GPS clustering algorithm. For more details about the algorithms that are available, please see our documentation. These algorithms are constantly being developed and improved through our own work and the work of other researchers.
In [ ]:
%reload_ext autoreload
from util.dependencies import *
from settings import USER_ID
CC = Kernel("/home/md2k/cc_conf/")
This example uses a data generator instead of real recordings, which protects the privacy of real participants and lets anyone explore the system without institutional review board approval. The generator call below is commented out for this demonstration to avoid creating too much data at once.
In [ ]:
# gen_location_datastream(CC, user_id=USER_ID, stream_name="GPS--org.md2k.phonesensor--PHONE")
In [ ]:
gps_stream = CC.get_stream("GPS--org.md2k.phonesensor--PHONE")
gps_stream.show(3)
gps_stream.summary()
Cerebral Cortex makes it easy to apply built-in algorithms to data streams. In this case, gps_clusters is imported from the algorithm library, and compute runs it on gps_stream to generate a set of centroids. This is the general pattern for applying an algorithm to a datastream. It lets researchers apply validated and tested algorithms to their own data without becoming experts in the particular set of transformations involved.
Note: the compute method engages the parallel computation capabilities of Cerebral Cortex. All the data is read from the data storage layer and processed on every computational core available to the system. This lets the computation run as quickly as possible and take advantage of powerful clusters through a relatively simple interface. This capability is critical when working with mobile sensor big data, where a single datastream can exceed hundreds of gigabytes in larger studies.
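The split, apply, combine idea behind this note can be sketched in plain Python. This is only an illustration of the pattern and is not how Cerebral Cortex implements compute: the helper names (`compute_by_group`, `mean_location`), the `user`, `latitude`, and `longitude` keys, and the sample rows are all hypothetical. A thread pool is used to keep the sketch self-contained; real CPU-bound workloads would need a process pool or a cluster framework to get true parallelism.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def mean_location(rows):
    """Toy per-group algorithm (hypothetical): average latitude and longitude."""
    n = len(rows)
    return (
        sum(r["latitude"] for r in rows) / n,
        sum(r["longitude"] for r in rows) / n,
    )


def compute_by_group(rows, func, key="user", max_workers=None):
    """Split rows by `key`, apply `func` to each group concurrently, combine results."""
    # Split: bucket the rows by the grouping key.
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)

    # Apply: submit one task per group to the worker pool.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {k: pool.submit(func, g) for k, g in groups.items()}
        # Combine: collect one result per group key.
        return {k: f.result() for k, f in futures.items()}


# Synthetic rows, made up for this sketch.
rows = [
    {"user": "a", "latitude": 35.0, "longitude": -90.0},
    {"user": "a", "latitude": 35.2, "longitude": -90.2},
    {"user": "b", "latitude": 40.0, "longitude": -75.0},
]
results = compute_by_group(rows, mean_location)
```

Because each group is processed independently, the same function can be handed to more workers as the data grows, which is the property the note relies on.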
In [ ]:
from cerebralcortex.algorithms import gps_clusters
centroids = gps_stream.compute(gps_clusters)
centroids.show(truncate=False)
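To give a feel for what a GPS clustering step produces, here is a minimal, self-contained sketch of density-based clustering (DBSCAN) over great-circle distances, followed by a centroid per cluster. It is an assumption-laden illustration, not the gps_clusters implementation: the function names, the `eps_m` and `min_samples` parameters, their default values, and the sample coordinates are all hypothetical.

```python
import math
from collections import deque

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


def haversine_m(p, q):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def dbscan(points, eps_m=100.0, min_samples=3):
    """Return one label per point: 0, 1, ... for clusters and -1 for noise."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbors(i):
        # Brute-force O(n) scan; includes the point itself.
        return [j for j in range(len(points))
                if haversine_m(points[i], points[j]) <= eps_m]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(seeds)
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue  # already assigned, do not expand again
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_samples:
                queue.extend(reachable)  # j is a core point, keep growing
    return labels


def cluster_centroids(points, labels):
    """Average lat/lon per cluster; a fair approximation for small clusters
    away from the poles and the antimeridian."""
    sums = {}
    for (lat, lon), label in zip(points, labels):
        if label == -1:
            continue  # noise points do not contribute to any centroid
        s = sums.setdefault(label, [0.0, 0.0, 0])
        s[0] += lat
        s[1] += lon
        s[2] += 1
    return {label: (s[0] / s[2], s[1] / s[2]) for label, s in sums.items()}


# Synthetic coordinates, made up for this sketch: two tight groups and one outlier.
points = [
    (35.1495, -90.0490), (35.1496, -90.0491), (35.1494, -90.0489),
    (35.1175, -89.9711), (35.1176, -89.9712), (35.1174, -89.9710),
    (36.0000, -91.0000),
]
labels = dbscan(points)
centers = cluster_centroids(points, labels)
```

The brute-force neighbor search is quadratic in the number of points, so this sketch only suits small inputs; large datastreams would need a spatial index and the kind of parallel execution that compute provides.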
In [ ]:
gps_stream.plot_gps_cords(zoom=8)
In [ ]:
centroids.plot_gps_cords(zoom=12)