Code for OFSS: calibration and resulting performance
The Flow Sampling Sketch, or FSS, is a
skampling
method, that is a hybrid between sampling and sketching. The sampling
component is simply a front-end i.i.d. Flow Sampling (FS), controlled
by a flow selection probability pf, which thins out the incoming flows.
The sampled flows are then inserted into a simple sketch, namely a counter array.
Our method OFSS is a just a FSS for which the
optimal pf value has been selected, given the input load alpha (average
number of flows per counter). The key fact here is that overloading the
sketch results in information
destruction, so pf must be chosen to keep
it at the point of maximum information retention.
The code described here
allows the optimal pf to
be calculated for any given input load alpha,
and also gives the associated amount of information, and minimal
variance, of any estimator using the OFSS method.
The Code
The tarred and gzip-ed Matlab code is
here.
For users who have downloaded a previous release and wish to obtain the
new functions, it is strongly recommended that the old versions be deleted
and the entire set downloaded again, to ensure compatibility.
Note that the routines are
self contained and do not require any special Matlab toolboxes.
For Unix/Mac users, after downloading first perform:
to decompress the file, followed by
to unwrap the functions. A directory called "OFSS" will be created
inside containing 4 `file.m' matlab files, each with a single
Matlab function.
Copyright, conditions of use
The code is made freely available for non-commercial use only, specifically
for research and teaching purposes, provided that the copyright headers
at the head of each file are not removed, and suitable citation is made
in published work using the routine(s). No responsibility is taken for
any errors in the code. The headers in the files contain copyright dates
and authors on an individual basis. In addition there is a copyright on
the collection as made available here, dated 9 September 2014.
Theoretical Notes
Let the number of counters in the counter array be A, and the number of
flows to insert into it in the current measurement interval be
Nf. The input load alpha, which is the average number of flows
per counter, is alpha = Nf/A. Note that the code below only
requires the value of alpha, not the underlying Nf or A.
To use OFSS, we need to calculate the optimal load pf*(alpha). This
will be a function of the true flow size distribution theta. To
bootstrap, a rough theta, or a model theta, can be used to determine an
initial pf*(alpha) to begin measurement.
It turns out that pf*(alpha) takes a simple form: pf = min(1,
alpha*/alpha), a function of a critical alpha value alpha*.
The bootstrapping refered to above boils down to the selection of a
value of alpha*. A crude but often very reasonable universal
bootstrapping value is alpha*=1. This could be used to
obtain an initial theta estimate, which could then be used as input
into OFSS_alphastar.m below.
By definition in performance constrained measurement, the input load
will be much higher than available resources. For example alpha=100 is
an oft-quoted number, though 1000 or more is perhaps more reasonable
now. We will then have pf*= alpha*/alpha so that the
counter array `sees' an effective load of alpha* instead of alpha.
Alpha* also depends on the size, k, of the flows whose size estimate
you (most want) optimised, so a k* must be selected, and then alpha*_k*
will be calculated. In practice k*=1 is a good choice, the resulting
alpha*_1 will result in close-to-optimal estimates for many other k values also.
The code assumes that the flows have a largest possible size W.
In practice one should truncate empirical distributions (histograms) at
a level W such that almost all, or all, of the histogram bins k<=W are non-empty.
Notes for use
All files contain a single function of the same name and begin with a
detailed comment block. Please refer to these blocks for details of
input and output parameters and examples of use.
The calling tree is
OFSS_alphastar.m
- TGdensity.m
- CRLB_FSS.m
- counterdensity.m
OFSS_alphastar.m
- Calculates the critical load value value alpha*(k*,theta) defining pf*(alpha) and hence OFSS = FSS(pf*)
CRLB_FSS.m
- Calculates the per-counter Fisher Information Matrix J of FSS,
and the associated per-counter Covariance Matrix (the Cramer Rao Lower
Bound), given the chosen pf, alpha and theta.
- This function is called by OFSS_alphastar but can also be used as
a standalone function when the desired value of pf is already known.
- Using pf* as input give the J and CRLB of OFSS.
TGdensity.m
- Simple function to calculate a Truncated Geometric distribution of support W, if a model theta is needed.
counterdensity.m
- Calculates the density c(j) of the number of packets j in a given counter, given theta (model based or empirical) and alpha.
Example of pf*(alpha): an
example of pf*(alpha) = min(1, alpha*/alpha) for
alpha* calculated for k*=1,2,10,100, for a uniform theta with
W=100. For k*=1 pf* is very close to 1.
Last modified: Febuary 7 15:33:41 2020