Code for OFSS: calibration and resulting performance


The Flow Sampling Sketch, or FSS, is a skampling method, that is a hybrid between sampling and sketching. The sampling component is simply a front-end i.i.d. Flow Sampling (FS), controlled by a flow selection probability pf, which thins out the incoming flows. The sampled flows are then inserted into a simple sketch, namely a counter array.  Our method OFSS is a just a FSS for which the optimal pf value has been selected, given the input load alpha (average number of flows per counter). The key fact here is that overloading the sketch results in information destruction, so pf must be chosen to keep it at the point of maximum information retention. The code described here allows the optimal pf to be calculated for any given input load alpha, and also gives the associated amount of information, and minimal variance, of any estimator using the OFSS method.

The Code

The tarred and gzip-ed Matlab code is here. For users who have downloaded a previous release and wish to obtain the new functions, it is strongly recommended that the old versions be deleted and the entire set downloaded again, to ensure compatibility. Note that the routines are self contained and do not require any special Matlab toolboxes.

For Unix/Mac users, after downloading first perform:

to decompress the file, followed by to unwrap the functions. A directory called "OFSS" will be created inside containing 4 `file.m' matlab files, each with a single Matlab function.

Copyright, conditions of use

The code is made freely available for non-commercial use only, specifically for research and teaching purposes, provided that the copyright headers at the head of each file are not removed, and suitable citation is made in published work using the routine(s). No responsibility is taken for any errors in the code. The headers in the files contain copyright dates and authors on an individual basis. In addition there is a copyright on the collection as made available here, dated 9 September 2014.

Theoretical Notes


Let the number of counters in the counter array be A, and the number of flows to insert into it in the current measurement interval be Nf.  The input load alpha, which is the average number of flows per counter, is  alpha = Nf/A.  Note that the code below only requires the value of alpha, not the underlying Nf or A.

To use OFSS, we need to calculate the optimal load pf*(alpha). This will be a function of the true flow size distribution theta. To bootstrap, a rough theta, or a model theta, can be used to determine an initial pf*(alpha) to begin measurement. 

It turns out that pf*(alpha) takes a simple form:  pf = min(1, alpha*/alpha),  a function of a critical alpha value alpha*.  The bootstrapping refered to above boils down to the selection of a value of alpha*.  A crude but often very reasonable universal bootstrapping value is alpha*=1.   This could be used to obtain an initial theta estimate, which could then be used as input into  OFSS_alphastar.m  below.

By definition in performance constrained measurement, the input load will be much higher than available resources. For example alpha=100 is an oft-quoted number, though 1000 or more is perhaps more reasonable now.   We will then have pf*= alpha*/alpha so that the counter array `sees' an effective load of alpha* instead of alpha.

Alpha* also depends on the size, k, of the flows whose size estimate you (most want) optimised, so a k* must be selected, and then alpha*_k* will be calculated. In practice k*=1 is a good choice, the resulting alpha*_1 will result in close-to-optimal estimates for many other k values also.
 
The code assumes that the flows have a largest possible size W.  In practice one should truncate empirical distributions (histograms) at a level W such that almost all, or all, of the histogram bins k<=W are non-empty.

Notes for use


All files contain a single function of the same name and begin with a detailed comment block. Please refer to these blocks for details of input and output parameters and examples of use.

The calling tree is

OFSS_alphastar.m
     - TGdensity.m
     - CRLB_FSS.m
         - counterdensity.m

OFSS_alphastar.m

CRLB_FSS.m TGdensity.m counterdensity.m

Example of pf*(alpha):   an example of  pf*(alpha) = min(1, alpha*/alpha)   for alpha* calculated for k*=1,2,10,100,  for a uniform theta with W=100.   For k*=1 pf* is very close to 1. 

pf*(alpha) for a uniform distribution with W=50 
 



Last modified: Febuary 7 15:33:41 2020