Geekbench benchmark on Android

Geekbench 4 is an app offering several benchmarks that can be run on Android smartphones. The one used in this notebook is the 'CPU' benchmark, which runs several workloads representative of what smartphones commonly execute (AES, JPEG codec, FFT, and so on). The benchmark runs all of the tests in 'Single-Core' mode as well as in 'Multi-Core' mode in order to compare the single-threaded and multi-threaded performance of the device.

Do note that the benchmark will attempt to upload its results, which include some hardware information.


In [1]:
from conf import LisaLogging
LisaLogging.setup()


2017-05-03 10:54:02,800 INFO    : root         : Using LISA logging configuration:
2017-05-03 10:54:02,801 INFO    : root         :   /home/vagrant/lisa/logging.conf

In [2]:
%pylab inline

import json
import os

# Support to access the remote target
import devlib
from env import TestEnv

# Import support for Android devices
from android import Screen, Workload

# Support for trace events analysis
from trace import Trace

# Suport for FTrace events parsing and visualization
import trappy

import pandas as pd


Populating the interactive namespace from numpy and matplotlib
2017-05-03 10:54:03,346 WARNING : EnergyModel  : Unusual max capacity (1023), overriding capacity_scale

Support Functions

This function helps us run our experiments:


In [3]:
def experiment():
    
    # Configure governor
    target.cpufreq.set_all_governors('sched')
    
    # Get workload
    wload = Workload.getInstance(te, 'Geekbench')
    
    # Run Geekbench workload
    wload.run(te.res_dir, test_name='CPU', collect='ftrace')
        
    # Dump platform descriptor
    te.platform_dump(te.res_dir)

Test environment setup

For more details on this, please check out examples/utils/testenv_example.ipynb.

devlib requires the ANDROID_HOME environment variable to be configured to point to your local installation of the Android SDK. If you do not have this variable configured in the shell used to start the notebook server, you need to run a cell defining where your Android SDK is installed, or specify ANDROID_HOME in your target configuration.
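
For example, such a cell could look like this (the SDK path below is hypothetical; adjust it to match your setup):

import os

# Hypothetical location of the Android SDK; change to match your installation
os.environ['ANDROID_HOME'] = '/home/user/android-sdk-linux'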

If more than one Android device is connected to the host, you must specify the ID of the device you want to target via the "device" key of my_conf. Run adb devices on your host to get the ID.


In [4]:
# Setup target configuration
my_conf = {

    # Target platform and board
    "platform"     : 'android',
    "board"        : 'pixel',
    
    # Device
    "device"       : "0123456789ABCDEF",
    
    # Android home
    "ANDROID_HOME" : "/home/vagrant/lisa/tools/android-sdk-linux/",

    # Folder where all the results will be collected
    "results_dir" : datetime.datetime.now()\
                    .strftime("Geekbench_example_" + '%Y%m%d_%H%M%S'),

    # Define devlib modules to load
    "modules"     : [
        'cpufreq'       # enable CPUFreq support
    ],

    # FTrace events to collect for all the tests configuration which have
    # the "ftrace" flag enabled
    "ftrace"  : {
         "events" : [
            "sched_switch",
            "sched_wakeup",
            "sched_wakeup_new",
            "sched_overutilized",
            "sched_load_avg_cpu",
            "sched_load_avg_task",
            "cpu_capacity",
            "cpu_frequency",
         ],
         "buffsize" : 100 * 1024,
    },

    # Tools required by the experiments
    "tools"   : [ 'trace-cmd', 'taskset'],
}

In [5]:
# Initialize a test environment using:
te = TestEnv(my_conf, wipe=False)
target = te.target


2017-05-03 10:54:03,551 INFO    : TestEnv      : Using base path: /home/vagrant/lisa
2017-05-03 10:54:03,551 INFO    : TestEnv      : Loading custom (inline) target configuration
2017-05-03 10:54:03,552 INFO    : TestEnv      : External tools using:
2017-05-03 10:54:03,553 INFO    : TestEnv      :    ANDROID_HOME: /home/vagrant/lisa/tools/
2017-05-03 10:54:03,554 INFO    : TestEnv      :    CATAPULT_HOME: /home/vagrant/lisa/tools//platform-tools/systrace/catapult
2017-05-03 10:54:03,555 INFO    : TestEnv      : Devlib modules to load: ['bl', 'cpufreq']
2017-05-03 10:54:03,556 INFO    : TestEnv      : Connecting Android target [DEFAULT]
2017-05-03 10:54:03,557 INFO    : TestEnv      : Connection settings:
2017-05-03 10:54:03,558 INFO    : TestEnv      :    None
2017-05-03 10:54:03,761 INFO    : android      : ls command is set to ls -1
2017-05-03 10:54:04,637 INFO    : TestEnv      : Initializing target workdir:
2017-05-03 10:54:04,638 INFO    : TestEnv      :    /data/local/tmp/devlib-target
2017-05-03 10:54:06,555 INFO    : TestEnv      : Topology:
2017-05-03 10:54:06,557 INFO    : TestEnv      :    [[0, 1], [2, 3]]
2017-05-03 10:54:06,748 INFO    : TestEnv      : Loading default EM:
2017-05-03 10:54:06,749 INFO    : TestEnv      :    /home/vagrant/lisa/libs/utils/platforms/pixel.json
2017-05-03 10:54:07,301 INFO    : TestEnv      : Enabled tracepoints:
2017-05-03 10:54:07,302 INFO    : TestEnv      :    sched_switch
2017-05-03 10:54:07,303 INFO    : TestEnv      :    sched_wakeup
2017-05-03 10:54:07,304 INFO    : TestEnv      :    sched_wakeup_new
2017-05-03 10:54:07,305 INFO    : TestEnv      :    sched_overutilized
2017-05-03 10:54:07,306 INFO    : TestEnv      :    sched_load_avg_cpu
2017-05-03 10:54:07,307 INFO    : TestEnv      :    sched_load_avg_task
2017-05-03 10:54:07,308 INFO    : TestEnv      :    cpu_capacity
2017-05-03 10:54:07,309 INFO    : TestEnv      :    cpu_frequency
2017-05-03 10:54:07,310 INFO    : TestEnv      : Set results folder to:
2017-05-03 10:54:07,311 INFO    : TestEnv      :    /home/vagrant/lisa/results/Geekbench_example_20170503_105403
2017-05-03 10:54:07,312 INFO    : TestEnv      : Experiment results available also in:
2017-05-03 10:54:07,313 INFO    : TestEnv      :    /home/vagrant/lisa/results_latest

Workloads execution

The workload is executed using the experiment() helper function defined above, which is configured to run the Geekbench 'CPU' test.


In [6]:
# Initialize Workloads for this test environment
results = experiment()


2017-05-03 10:54:07,990 INFO    : Workload     : Supported workloads available on target:
2017-05-03 10:54:07,992 INFO    : Workload     :   gmaps, youtube, jankbench, geekbench
2017-05-03 10:54:10,644 INFO    : Screen       : Force manual orientation
2017-05-03 10:54:10,645 INFO    : Screen       : Set orientation: PORTRAIT
2017-05-03 10:54:12,496 INFO    : Screen       : Set brightness: 0%
2017-05-03 10:54:14,950 INFO    : Geekbench    : adb -s HT67M0300128 logcat ActivityManager:* System.out:I *:S GEEKBENCH_RESULT:*
2017-05-03 10:54:17,071 INFO    : Geekbench    : FTrace START
2017-05-03 10:58:23,429 INFO    : Geekbench    : FTrace STOP
2017-05-03 10:58:33,944 INFO    : Screen       : Set orientation: AUTO
2017-05-03 10:58:35,526 INFO    : Screen       : Set brightness: AUTO

Results analysis

Geekbench 4 scores are calibrated against a baseline score of 4000, which is the score of an Intel Core i7-6600U. Higher scores are better, with double the score indicating double the performance. You can have a look at the results for several Android phones here: https://browser.primatelabs.com/android-benchmarks
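
As an illustration of how to read these scores, they can be normalized against the baseline; a minimal sketch (the helper name is ours, not part of Geekbench or LISA):

# Geekbench 4 calibrates scores against an Intel Core i7-6600U at 4000
GEEKBENCH_BASELINE = 4000.0

def relative_performance(score):
    """Express a Geekbench score as a multiple of the baseline machine"""
    return score / GEEKBENCH_BASELINE

# e.g. the Single-Core score obtained below (1594) is ~0.4x the baseline
print "{:.2f}x baseline".format(relative_performance(1594))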


In [7]:
class Geekbench(object):
    """
    Geekbench json results parsing class
    """
    def __init__(self, filepath):
        with open(filepath) as fd:
            self.__json = json.loads(fd.read())
        
        self.benchmarks = {}
        for section in self.__json["sections"]:
            self.benchmarks[section["name"]] = section
            for workload in section["workloads"]:
                self.benchmarks[section["name"]][workload["name"]] = workload     
            
    def name(self):
        """Get a human-readable name for the geekbench run
        """
        gov = ""
        build = ""
        for metric in self.__json["metrics"]:
            if metric["name"] == "Governor":
                gov = metric["value"]
            elif metric["name"] == "Build":
                build = metric["value"]

        return "[build]=\"{}\" [governor]=\"{}\"".format(build, gov)
    
    def benchmarks_names(self):
        """Get a list of benchmarks (e.g. Single-Core, Multi-Core) found in the run results        
        """
        return [section["name"] for section in self.__json["sections"]]
    
    def workloads_names(self):
        """Get a list of unique workloads (e.g. EAS, Dijkstra) found in the run results
        """
        return [workload["name"] for workload in self.benchmarks.values()[0]["workloads"]]
    
    def global_scores(self):
        """Get the overall scores of each benchmark
        """
        data = {}
        for benchmark in self.benchmarks_names():
            data[benchmark] = self.benchmarks[benchmark]["score"]
        return data
        
    def detailed_scores(self):
        """Get the detailed workload scores of each benchmark
        """
        benchmark_fields = ["score", "runtime_mean", "rate_string"]
        benches = {}
        benchmarks = self.benchmarks_names()
        workloads = self.workloads_names() 
        
        for benchmark in benchmarks:
            data = {}
            for workload in workloads:
                data[workload] = {}
                for field in benchmark_fields:
                    data[workload][field] = self.benchmarks[benchmark][workload][field]        
            benches[benchmark] = data
            
        return benches
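
As a usage example, a Geekbench instance can be queried as follows (the file name below is hypothetical; the actual result files are discovered in the cells that follow):

# Hypothetical result file name; real files are found with os.listdir below
geekbench = Geekbench(te.res_dir + "/geekbench_results.gb4")
print geekbench.benchmarks_names()  # e.g. ['Single-Core', 'Multi-Core']
print geekbench.global_scores()     # e.g. {'Single-Core': 1594, 'Multi-Core': 4170}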

In [8]:
def display_bench_results(geekbench, detailed=False):
    print "===== Global results ====="
    
    scores = geekbench.global_scores()
    
    # Build dataframe for display
    row = []
    for bench_type, score in scores.iteritems():
        row.append(score)
        
    df = pd.DataFrame(data=row, index=scores.keys(), columns=["Global score"])
    display(df)
    
    if not detailed:
        return
    
    print "===== Detailed results ====="
    
    scores = geekbench.detailed_scores()
    
    for benchmark, results in geekbench.detailed_scores().iteritems():
        print "----- {} benchmark -----".format(benchmark)
        # Build dataframe for display
        data = []
        idx = []
        columns = results.values()[0].keys()
        for workload, fields in results.iteritems():
            data.append(tuple(fields.values()))
            idx.append(workload)
        display (pd.DataFrame(data=data, index=idx, columns=columns))

In [9]:
for f in os.listdir(te.res_dir):
    if f.endswith(".gb4"):
        geekbench = Geekbench(te.res_dir + "/" + f)
        
        print "Analysing geekbench {}".format(geekbench.name())
        display_bench_results(geekbench, True)


Analysing geekbench [build]="sailfishf-userdebug 7.1.1 NMF26P 3525730 dev-keys" [governor]="sched"
===== Global results =====
Global score
Single-Core 1594
Multi-Core 4170
===== Detailed results =====
----- Single-Core benchmark -----
score runtime_mean rate_string
AES 649 0.261358 501.0 MB/sec
HDR 3050 0.394782 11.1 Mpixels/sec
Rigid Body Physics 2147 0.239238 6285.5 FPS
HTML5 Parse 1565 0.173668 7.11 MB/sec
Lua 1220 0.254486 1.25 MB/sec
Camera 2605 0.148324 7.22 images/sec
Histogram Equalization 1425 0.245729 44.6 Mpixels/sec
SQLite 1189 0.612012 33.0 Krows/sec
Face Detection 2106 0.123343 615.2 Ksubwindows/sec
Memory Copy 2691 0.269374 7.46 GB/sec
Memory Latency 917 1.001564 471.8 ns
Canny 2191 0.167685 30.4 Mpixels/sec
PDF Rendering 2137 0.452174 56.8 Mpixels/sec
Gaussian Blur 2313 0.126649 40.5 Mpixels/sec
Speech Recognition 1263 1.126760 10.8 Words/sec
LLVM 1817 1.057730 124.9 functions/sec
Ray Tracing 1818 0.339196 265.5 Kpixels/sec
JPEG 2605 0.248902 21.0 Mpixels/sec
SGEMM 551 0.790329 11.7 Gflops
LZMA 1211 0.495195 1.89 MB/sec
SFFT 1130 0.169216 2.82 Gflops
Memory Bandwidth 2420 0.100460 12.9 GB/sec
N-Body Physics 1445 0.256496 1.08 Mpairs/sec
Dijkstra 2063 0.374775 1.40 MTE/sec
HTML5 DOM 613 0.913885 556.2 KElements/sec
----- Multi-Core benchmark -----
score runtime_mean rate_string
AES 2086 0.336575 1.57 GB/sec
HDR 8435 0.588926 30.6 Mpixels/sec
Rigid Body Physics 5976 0.363937 17496.3 FPS
HTML5 Parse 4747 0.269989 21.6 MB/sec
Lua 3475 0.385767 3.57 MB/sec
Camera 7272 0.203743 20.2 images/sec
Histogram Equalization 3785 0.368308 118.3 Mpixels/sec
SQLite 3287 1.166350 91.1 Krows/sec
Face Detection 5702 0.188087 1.67 Msubwindows/sec
Memory Copy 3974 0.378651 11.0 GB/sec
Memory Latency 1865 0.705881 232.1 ns
Canny 6648 0.265390 92.2 Mpixels/sec
PDF Rendering 5910 0.663689 157.0 Mpixels/sec
Gaussian Blur 6175 0.200005 108.2 Mpixels/sec
Speech Recognition 3755 1.520384 32.1 Words/sec
LLVM 6785 1.141085 466.6 functions/sec
Ray Tracing 5002 0.498715 730.4 Kpixels/sec
JPEG 7192 0.363033 57.9 Mpixels/sec
SGEMM 1697 1.057814 35.9 Gflops
LZMA 3977 0.587279 6.21 MB/sec
SFFT 3443 0.250660 8.58 Gflops
Memory Bandwidth 3375 0.142451 18.0 GB/sec
N-Body Physics 3891 0.398617 2.91 Mpairs/sec
Dijkstra 5168 0.601840 3.50 MTE/sec
HTML5 DOM 2158 1.031061 1.96 MElements/sec

Analysing several runs

It can be interesting to compare Geekbench results obtained with different parameters (kernel, drivers) or even on different devices to gauge the impact of those changes. As Geekbench scores can vary a bit from one run to another, having a set of repeated runs is preferable.
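
One way to collect such a set is sketched below; repeat_experiment() is a hypothetical helper that assumes a fresh TestEnv (and thus a fresh timestamped results directory) is created for each run:

import time

def repeat_experiment(n):
    """Hypothetical helper: run the Geekbench experiment n times,
    each run in its own timestamped results directory"""
    global te, target
    for _ in range(n):
        # Refresh the results folder timestamp before each run
        my_conf["results_dir"] = datetime.datetime.now()\
                                 .strftime("Geekbench_example_" + '%Y%m%d_%H%M%S')
        te = TestEnv(my_conf, wipe=False)
        target = te.target
        experiment()
        time.sleep(2)  # make sure consecutive timestamps differ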

The following section will grab the results of all the Geekbench_example_* directories found in the LISA results directory.


In [10]:
import glob
import sys

def fetch_results():
    results_path = os.path.join(te.LISA_HOME, "results")
    
    results_dirs = [results_path + "/" + d for d in os.listdir(results_path) if d.startswith("Geekbench_example_")]
    
    res = []
    
    for d in results_dirs:
        bench_file = glob.glob("{}/*.gb4".format(d))[0]
        res.append(Geekbench(bench_file))
        
    return res

def compare_runs():
    geekbenches = fetch_results()
    
    # Pick one run to build a baseline template
    benchmarks = geekbenches[0].benchmarks_names()
    workloads = geekbenches[0].workloads_names()
    
    stats  = ["avg", "min", "max"]
    count = len(geekbenches)
    
    print "Parsing {} runs".format(count)

    
    # Initialize stats
    results = {benchmark : 
                        {"min" : sys.maxint, "max" : 0, "avg" : 0} 
               for benchmark in benchmarks}
    
    # Get all the data
    for benchmark in results.iterkeys():
        for bench in geekbenches:
            score = bench.global_scores()[benchmark]
            
            if score > results[benchmark]["max"]:
                results[benchmark]["max"] = score
                
            if score < results[benchmark]["min"]:
                results[benchmark]["min"] = score
            
            results[benchmark]["avg"] += score
        
        results[benchmark]["avg"] /= count
        
    # Convert data to Dataframe
    data = []

    for benchmark in results.iterkeys():
        row = []
        for stat in stats:
            row.append(results[benchmark][stat])
        data.append(tuple(row))
       
    df = pd.DataFrame(data, index=results.keys(), columns=stats)
    
    return df

In [11]:
display(compare_runs())


Parsing 2 runs
avg min max
Single-Core 1602 1594 1610
Multi-Core 4176 4170 4182