Calculate least-cost sample sizes for 2-stage surveys for demonstrating disease freedom,
where cluster sizes are unknown.
This analysis calculates the number of clusters and the number of units within each cluster
to be tested to provide a specified system sensitivity (probability of detecting disease)
for the given unit and cluster-level design prevalences and test sensitivity,
where actual cluster sizes are unknown. Test
specificity is assumed to be 100% (or follow-up testing of any positive will be undertaken
to confirm or exclude disease).
Sample sizes are optimised to minimise overall cost for given cluster and unit-level
testing costs. A maximum sample size per cluster must be specified, along with either
the number of clusters in the population or a maximum number of clusters to be tested.
Numbers of units to test in each cluster are calculated assuming binomial sampling
(sample size is small relative to cluster size), while numbers of clusters to test are
calculated using the hypergeometric approximation (sampling without replacement) if the
number of clusters in the population is specified, or assuming binomial sampling if not.
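The two sampling models described above can be illustrated with a minimal sketch. It assumes the standard binomial formula for cluster-level sensitivity and the commonly used hypergeometric approximation for sampling without replacement; the function and parameter names are hypothetical and are not taken from the tool itself, which may differ in detail.

```python
import math

def cluster_sensitivity(n_units, unit_prev, test_se):
    """Binomial model: probability that at least one of n_units sampled
    from an infected cluster tests positive."""
    return 1.0 - (1.0 - test_se * unit_prev) ** n_units

def clusters_needed_binomial(target_sse, seh, cluster_prev):
    """Clusters to test when the number of clusters in the population is
    not known (binomial sampling assumed at cluster level)."""
    return math.ceil(math.log(1.0 - target_sse) / math.log(1.0 - seh * cluster_prev))

def clusters_needed_hypergeometric(target_sse, seh, n_clusters, n_infected):
    """Clusters to test when the population has n_clusters clusters, of which
    n_infected are assumed infected (hypergeometric approximation).
    A result larger than n_clusters means the target cannot be reached."""
    n = (n_clusters / seh) * (1.0 - (1.0 - target_sse) ** (1.0 / n_infected))
    return math.ceil(n)
```

Because sampling without replacement makes each additional cluster more informative, the hypergeometric approximation generally asks for fewer clusters than the binomial formula for the same inputs.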
Design prevalence (specified level of disease to be detected) must be specified
at both unit and cluster levels. Design prevalence can be specified as either:
- a proportion of the population infected; or
- a specific (integer) number of clusters infected (for cluster-level prevalence only and
only if the number of clusters in the population is specified).
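As an illustration of the two ways of specifying cluster-level design prevalence, the sketch below converts either form into a proportion and, where possible, a number of infected clusters. It assumes that a value of 1 or more is read as a count and a value between 0 and 1 as a proportion; this convention and the function name are assumptions made for the example only.

```python
def resolve_cluster_prevalence(cluster_prev, n_clusters=None):
    """Return (proportion, number_infected) for a cluster-level design prevalence.

    number_infected is None when the number of clusters in the population
    is not specified."""
    if cluster_prev >= 1:
        # Interpreted as an integer number of infected clusters.
        if n_clusters is None:
            raise ValueError("a count requires the number of clusters in the population")
        if cluster_prev != int(cluster_prev):
            raise ValueError("a count of infected clusters must be an integer")
        n_infected = int(cluster_prev)
        return n_infected / n_clusters, n_infected
    if not 0 < cluster_prev < 1:
        raise ValueError("a proportion must lie strictly between 0 and 1")
    if n_clusters is None:
        return cluster_prev, None
    # At least one cluster is assumed infected at the design prevalence.
    return cluster_prev, max(1, round(cluster_prev * n_clusters))
```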
Inputs required include:
- unit-level design prevalence (as a proportion only);
- cluster-level design prevalence as either a proportion or an integer number of clusters;
- the estimated test sensitivity;
- the relative (or actual) cost of testing at both cluster and unit levels;
- the target system sensitivity (SSe) which is the probability of detecting
disease if it is present at the specified design prevalences;
- the maximum sample size to be tested per cluster; and
- the number of clusters in the population OR the maximum number of clusters
to be tested.
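Using the inputs listed above, the least-cost search can be sketched as a loop over the candidate numbers of units per cluster. This is a minimal illustration and not the tool's own implementation: it assumes total cost is the number of clusters multiplied by (cluster cost + units per cluster × unit cost), takes cluster-level design prevalence as a proportion only, and uses hypothetical function and parameter names.

```python
import math

def least_cost_design(unit_prev, cluster_prev, test_se, cost_unit, cost_cluster,
                      target_sse, max_units, n_clusters=None, max_clusters=None):
    """Return (best_option, all_feasible_options); best_option is None if the
    target system sensitivity cannot be reached within the limits."""
    if n_clusters is None and max_clusters is None:
        raise ValueError("specify n_clusters or max_clusters")
    limit = n_clusters if n_clusters is not None else max_clusters
    options = []
    for n_units in range(1, max_units + 1):
        # Cluster-level sensitivity under binomial sampling of units.
        seh = 1.0 - (1.0 - test_se * unit_prev) ** n_units
        if n_clusters is not None:
            # Hypergeometric approximation; at least one infected cluster assumed.
            d = max(1, round(cluster_prev * n_clusters))
            n = math.ceil((n_clusters / seh) * (1.0 - (1.0 - target_sse) ** (1.0 / d)))
            sse = 1.0 - (1.0 - seh * min(n, n_clusters) / n_clusters) ** d
        else:
            # Binomial sampling of clusters.
            n = math.ceil(math.log(1.0 - target_sse) / math.log(1.0 - seh * cluster_prev))
            sse = 1.0 - (1.0 - seh * cluster_prev) ** n
        if n > limit:
            continue  # not achievable with this number of units per cluster
        cost = n * (cost_cluster + n_units * cost_unit)
        options.append({"clusters": n, "units_per_cluster": n_units,
                        "seh": seh, "sse": sse, "cost": cost})
    best = min(options, key=lambda o: o["cost"]) if options else None
    return best, options

# Example call with illustrative values only.
best, options = least_cost_design(unit_prev=0.1, cluster_prev=0.02, test_se=0.9,
                                  cost_unit=1, cost_cluster=20, target_sse=0.95,
                                  max_units=30, n_clusters=500)
```

Each entry in the returned list corresponds to one candidate number of units per cluster, so the list can be tabulated or plotted to compare the cost of the alternatives.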
Outputs from the analysis include:
- A summary of the total numbers of clusters and units to be sampled, the target number
of units to test per cluster, the estimated cluster-level sensitivity (SeH) per cluster
and the achieved SSe;
- A summary of numbers of clusters to be tested and corresponding numbers of
units to test in each cluster, the estimated SeH and the relative cost for each option; and
- An Excel spreadsheet and graph of the summary results.
If the desired system sensitivity cannot be achieved by testing the specified maximum
number of units in all clusters (or in the specified maximum number of clusters),
a message will be
returned, along with a summary of the achieved mean SeH and SSe if the maximum numbers
of units and clusters were