The aim here is to find the factors that have a significant impact on the performance of roaring bitmap operations. We focus on a single operation: the union of two roaring bitmaps.
We consider the following (boolean) factors:

- large1, large2: whether the first (resp. second) roaring bitmap is large.
- dense1, dense2: whether the first (resp. second) roaring bitmap is dense.
- copy_on_write: whether copy-on-write is enabled.
- run_containers: whether run_optimize is used before computing the union. This function triggers a check on all the containers to see if a container conversion is required (from bitset/array to run, or from run to bitset/array). Note that run containers may still be used even without this function call.
- amalgamation: whether the amalgamated (single-file) version of the library is used.
- gcc_optimization: true if the sources are compiled with the -O3 option, false if they are compiled with the -O0 option.
- avx_enabled: whether AVX instructions are enabled.

All results of this section are in the file results/preliminary_results.csv. They have been generated with the following command:
export LD_LIBRARY_PATH="`pwd`/build/:$LD_LIBRARY_PATH"
./scripts/preliminary_runner.py -n 1024 results/preliminary_results.csv
It runs 1024 experiments. For each experiment, every factor is randomly set to true or false with equal probability.
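The container conversion triggered by run_optimize can be illustrated with a small sketch: for each 16-bit chunk, the most compact of the three representations is kept. This is a hypothetical pure-Python illustration, not CRoaring's actual code; the byte sizes approximate those of the real C containers.

```python
def count_runs(sorted_values):
    """Number of runs of consecutive integers in a sorted sequence."""
    runs = 0
    prev = None
    for v in sorted_values:
        if prev is None or v != prev + 1:
            runs += 1
        prev = v
    return runs

def best_container(sorted_values):
    """Pick the most compact representation for one 16-bit chunk.

    Approximate sizes in bytes: an array container stores 2 bytes per value,
    a bitset container is a fixed 2**16 bits = 8192 bytes, and a run
    container stores about 4 bytes per run (start, length).
    """
    card = len(sorted_values)
    sizes = {
        "array": 2 * card,
        "bitset": 8192,
        "run": 4 * count_runs(sorted_values),
    }
    return min(sizes, key=sizes.get)

best_container(list(range(1000)))         # one long run -> "run"
best_container(list(range(0, 65536, 2)))  # 32768 scattered values -> "bitset"
best_container([1, 5, 9])                 # a few values -> "array"
```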
In [1]:
library(ggplot2)
suppressWarnings(suppressMessages(library(FrF2))) # FrF2 outputs a bunch of useless messages...
In [2]:
all_results <- read.csv("results/old_preliminary_results.csv")
In [3]:
all_results_aov <- aov(time~(large1+large2+dense1+dense2+copy_on_write+run_containers+amalgamation+gcc_optimization+avx_enabled)^2, data=all_results)
summary(all_results_aov)
In this first AOV, we observe that the sizes of the sets as well as their densities have a huge impact on the performance of the union. The use of the GCC optimization is also very significant.
The other variables, however, do not seem to have any impact.
However, we should check whether the hypotheses required to perform an AOV are fulfilled: let us draw the diagnostic plots.
In [4]:
layout(matrix(c(1,2,3,4),2,2))
plot(all_results_aov)
The plots are terrible. The first two show that the residuals are not uniformly distributed: they are more spread out for large fitted values than for small ones, a sign of non-constant variance. Moreover, they are not scattered around a flat line centred on zero but follow a curved pattern: our model is not well suited.
Finally, the Q-Q plot shows that the residuals do not follow a normal distribution.
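The idea behind the Q-Q plot can be illustrated numerically: sort the residuals and compare them with the corresponding theoretical normal quantiles. Below is a hypothetical pure-Python sketch (not code used in this notebook) that contrasts normal residuals with heavy-tailed ones such as those produced by a misspecified model.

```python
import math
import random
from statistics import NormalDist

rng = random.Random(1)

def qq_deviation(sample):
    """Largest gap between sorted sample values and standard normal quantiles."""
    n = len(sample)
    sample = sorted(sample)
    theo = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    return max(abs(s - t) for s, t in zip(sample, theo))

normal_resid = [rng.gauss(0, 1) for _ in range(1000)]
# Heavy-tailed residuals: normal draws with a random multiplicative scale.
heavy_resid = [rng.gauss(0, 1) * math.exp(rng.gauss(0, 1)) for _ in range(1000)]

qq_deviation(normal_resid)  # small: the Q-Q plot stays close to a line
qq_deviation(heavy_resid)   # large: the tails bend away from the line
```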
In [5]:
print(var(all_results[all_results["large1"] == "True" & all_results["large2"] == "True",]$time))
print(var(all_results[all_results["large1"] == "False" & all_results["large2"] == "False",]$time))
print(var(all_results[all_results["dense1"] == "True" & all_results["dense2"] == "True",]$time))
print(var(all_results[all_results["dense1"] == "False" & all_results["dense2"] == "False",]$time))
This confirms the heteroscedasticity: the variance of the measured times is several orders of magnitude higher for large, non-dense datasets than for non-large, dense ones.
We will therefore apply the AOV separately, fixing these variables to given values.
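An alternative to this subsetting approach, not pursued here, is a variance-stabilising transform: when the noise is multiplicative, taking the logarithm of the times makes the group variances comparable again. A minimal simulation of this effect, with made-up timing scales:

```python
import math
import random
import statistics

rng = random.Random(42)
# Simulated timings with multiplicative noise: the "large" configurations are
# about a thousand times slower than the "small" ones, same relative spread.
small = [1e-5 * math.exp(rng.gauss(0, 0.3)) for _ in range(200)]
large = [1e-2 * math.exp(rng.gauss(0, 0.3)) for _ in range(200)]

ratio_raw = statistics.variance(large) / statistics.variance(small)
ratio_log = (statistics.variance([math.log(t) for t in large])
             / statistics.variance([math.log(t) for t in small]))
# ratio_raw is around 10**6, while ratio_log is close to 1.
```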
In [6]:
MEPlot(all_results_aov, abbrev=4, select=c(1, 2, 3, 4, 5, 6, 7, 8, 9), response="time")
In [7]:
MEPlot(all_results_aov, abbrev=4, select=c(1, 2, 3, 4), response="time")
In [8]:
MEPlot(all_results_aov, abbrev=4, select=c(5, 6, 7, 8, 9), response="time")
In [9]:
IAPlot(all_results_aov, abbrev=4, show.alias=FALSE, select=c(5, 6, 7, 8, 9))
In [10]:
large_dense <- all_results[all_results["large1"]=="True" & all_results["large2"]=="True" & all_results["dense1"]=="True" & all_results["dense2"]=="True",]
large_dense_aov <- aov(time~(copy_on_write+run_containers+amalgamation+gcc_optimization+avx_enabled)^2, data=large_dense)
summary(large_dense_aov)
In [11]:
layout(matrix(c(1,2,3,4),2,2))
plot(large_dense_aov)
The plots look much better here. There is nearly no pattern in the residuals vs. fitted values plot: the points are uniformly distributed around the x axis. The Q-Q plot is also almost a straight line.
The summary shows that, for large and dense datasets, the factors with a significant impact are gcc_optimization and avx_enabled. The impact of the amalgamation factor also seems to interact with gcc_optimization and avx_enabled.
In [12]:
MEPlot(large_dense_aov, abbrev=4, select=c(1, 2, 3, 4, 5), response="time")
In [13]:
IAPlot(large_dense_aov, abbrev=4, show.alias=FALSE, select=c(1, 2, 3, 4, 5))
Strange AVX behaviour
It seems that enabling AVX may lead to worse performance. This happens when GCC optimization and amalgamation are disabled, for large and dense roaring bitmaps.
Sample code:
import os

import numpy

from utils import *

def f(avx):
    init_directory(BUILD_DIR)
    compile_exec(amalgamation=False, gcc_optimization=False, avx_enabled=avx)
    t = run(size1=2**20, universe1=2**21, size2=2**20, universe2=2**21,
            copy_on_write=False, run_containers=True)
    os.chdir('..')
    return t

avx_true = [f(True) for _ in range(5)]
avx_false = [f(False) for _ in range(5)]
print(numpy.mean(avx_true))
print(numpy.mean(avx_false))
Prints:
0.000248622
0.00019045
When GCC optimization and amalgamation are enabled, performance with or without AVX seems similar (which is also disappointing).
In [14]:
large_nondense <- all_results[all_results["large1"]=="True" & all_results["large2"]=="True" & all_results["dense1"]=="False" & all_results["dense2"]=="False",]
large_nondense_aov <- aov(time~(copy_on_write+run_containers+amalgamation+gcc_optimization+avx_enabled)^2, data=large_nondense)
summary(large_nondense_aov)
In [15]:
layout(matrix(c(1,2,3,4),2,2))
plot(large_nondense_aov)
The plots here are better than in the first case, but not perfect. The points are still not uniformly distributed in the residuals vs. fitted values plot, and the Q-Q plot is not a straight line.
The summary shows that, again, the factors gcc_optimization, avx_enabled and amalgamation have a significant impact on performance.
In [16]:
MEPlot(large_nondense_aov, abbrev=4, select=c(1, 2, 3, 4, 5), response="time")
In [17]:
IAPlot(large_nondense_aov, abbrev=4, show.alias=FALSE, select=c(1, 2, 3, 4, 5))
The effect and interaction plots are less surprising here. We see that enabling AVX has a positive impact on performance (but the main performance gain comes from GCC optimizations, as expected).
In [18]:
large_onedense <- all_results[all_results["large1"]=="True" & all_results["large2"]=="True" & (all_results["dense1"]=="True" | all_results["dense2"]=="True") & (all_results["dense1"]=="False" | all_results["dense2"]=="False"),]
large_onedense_aov <- aov(time~(copy_on_write+run_containers+amalgamation+gcc_optimization+avx_enabled)^2, data=large_onedense)
summary(large_onedense_aov)
In [19]:
layout(matrix(c(1,2,3,4),2,2))
plot(large_onedense_aov)
Again, the plots are not great, but they are still better than in the general case.
From the summary, we see that in this case the factor copy_on_write is by far the most impactful (its F value is several orders of magnitude greater than those of the other significant factors). The other important factors are amalgamation, gcc_optimization and avx_enabled. The factor run_containers also has a small impact.
In [20]:
MEPlot(large_onedense_aov, abbrev=4, select=c(1, 2, 3, 4, 5), response="time")
In [21]:
IAPlot(large_onedense_aov, abbrev=4, show.alias=FALSE, select=c(1, 2, 3, 4, 5))
The effect plot confirms that the main performance gain comes from the use of copy-on-write. This was expected: in this case, the non-dense roaring bitmap has its elements spread across many more containers than the dense one. A large number of containers are not modified by the union, so there is no need to copy them.
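The copy-on-write idea can be sketched in a few lines. This is a hypothetical illustration, not CRoaring's actual implementation: containers present in only one of the two bitmaps need no modification, so the result can share them by reference instead of copying.

```python
def union(bm1, bm2):
    """Union of two toy roaring bitmaps, modelled as dicts mapping the
    16 high bits of each value to an immutable container of low bits."""
    result = {}
    for key in bm1.keys() | bm2.keys():
        c1, c2 = bm1.get(key), bm2.get(key)
        if c1 is not None and c2 is not None:
            result[key] = c1 | c2  # both present: a new container is allocated
        else:
            # Only one bitmap has this container: share it, no copy needed.
            result[key] = c1 if c1 is not None else c2
    return result

sparse = {k: frozenset({1, 2}) for k in range(100)}  # spread over many containers
dense = {0: frozenset(range(1000))}                  # everything in one container
merged = union(sparse, dense)
# 99 of the 100 containers of `sparse` are shared, not copied:
shared = sum(merged[k] is sparse[k] for k in sparse)
```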
In [22]:
nonlarge <- all_results[all_results["large1"]=="False" & all_results["large2"]=="False",]
nonlarge_aov <- aov(time~(copy_on_write+run_containers+amalgamation+gcc_optimization+avx_enabled)^2, data=nonlarge)
summary(nonlarge_aov)
In [23]:
layout(matrix(c(1,2,3,4),2,2))
plot(nonlarge_aov)
The plots are good except for two outliers: the points are uniformly distributed around the x axis in the first two plots, and the Q-Q plot is a straight line.
The summary shows that the factors gcc_optimization and avx_enabled have a significant impact on the performances.
In [24]:
MEPlot(nonlarge_aov, abbrev=4, select=c(1, 2, 3, 4, 5), response="time")
Only amalgamation and GCC optimization have a noticeable impact on performance (the other factors have a low F value, according to the above summary). Strangely, amalgamation seems to have a negative impact here.
A first AOV on all the factors showed that our data does not fulfil the hypotheses: the variance of the time is not constant.
To overcome this, we fixed the “problematic” variables, i.e. the sizes and the densities of the roaring bitmaps.
Unsurprisingly, the factor gcc_optimization always has a very significant positive impact on performance.
For large roaring bitmaps, the factor avx_enabled is also significant, but not always positively: using AVX decreases performance for dense datasets when no GCC optimization is used. No explanation has been found yet. The factor amalgamation is important if at least one of the roaring bitmaps is non-dense, and the factor copy_on_write is very important if exactly one of them is dense.
For small roaring bitmaps, the only important factors are gcc_optimization and amalgamation, but the latter has a negative impact.