Journal of Aquaculture & Marine Biology
eISSN: 2378-3184

Research Article Volume 4 Issue 5

Bayesian Estimation of the Number of Individuals in a Sample with a Known Weight

Samu Mäntyniemi,1 Atso Romakkaniemi,2 Elja Arjas3

1Department of Mathematics and Statistics, University of Helsinki, Finland
2Natural Resources Institute Finland
3Department of Mathematics and Statistics, University of Helsinki, Finland

Correspondence: Samu Mäntyniemi, Department of Environmental Sciences, University of Helsinki, Finland

Received: October 27, 2016 | Published: November 15, 2016

Citation: DOI: 10.15406/jamb.2016.04.00095


Abstract

We introduce a Bayesian probability model for making inferences about the unknown number of individuals in a sample, based on the known sample weight and on information provided by subsamples with known weights and corresponding counts. As is inherent in the Bayesian approach, the model allows the incorporation of prior information that is often available about the sample size and other uncertain parameter values. As a result, the model provides an estimate of the number of individuals in the sample in the form of a posterior probability distribution that includes both the prior information and the interpretation of the observed data. Such a result cannot be obtained using the frequentist approach. The model presented here can be applied to a wide range of similar problems. Here our main focus is stock assessment, where the task is the conversion of the catch weight into the number of individuals in the catch. The model is easy to use owing to the availability of general-purpose MCMC simulation software, and it can be used either in a standalone fashion or embedded into more complex probability models.

Keywords: Catch samples; Markov chain Monte Carlo (MCMC); Sample size; Weight distribution

Introduction

Counting the number of individuals in a large sample can be very laborious or impractical. Instead of counting all individuals exactly, the size of a large sample can be estimated by weighing it and then using information about the mean weight of individuals obtained from smaller samples. Estimation approaches based on this general idea have a number of natural applications in aquatic sciences. For instance, the number of fish in a commercial catch is often estimated in this way in order to obtain data suitable for typical stock assessment methods [1,2]. Further, the approach is apparently commonly used to estimate the number of fish raised in fish hatcheries for subsequent stocking [3]. Besides estimating the number of fish, the approach has been applied, for example, to estimating the number of eggs in fish gonads [4]. However, estimates obtained this way will always involve an element of uncertainty unless all individuals in the sampled population are of exactly equal weight and measurement error is negligible. For a systematic analysis, this uncertainty should be taken into account in all further considerations that use these estimates. In fisheries science such uncertainty has often been neglected, or frequentist methods have been applied. Unfortunately, frequentist methods cannot provide measures of uncertainty about parameters, even though the results of a frequentist analysis are often phrased in a way that invites such a misinterpretation.

The Bayesian approach to statistical inference provides a flexible framework for working with multiple levels of uncertainty, and is therefore becoming increasingly popular in fisheries science. In the Bayesian approach, uncertainty is described by assigning probability distributions to quantities whose values are uncertain. Thus, in particular, uncertain values of the sample size are presented in terms of corresponding probability distributions. This paper presents the development of a Bayesian probability model which can be used to derive probabilistic estimates of the size of a sample of a known weight. The model structure is introduced by considering samples of fish, but the same logic applies to any kind of comparable items like invertebrates, plants, stones etc.

Sampling

A sample containing N fish can be obtained from a fish population either by simple random sampling or by some form of selective sampling. In a typical case N is large, but the theory presented here holds in principle for any positive integer value. On the other hand, the population from which the sample is drawn is assumed to be infinitely large. For example, fish living in a particular rearing pond are seen as a sample from the potentially infinitely large population of similar fish that could be produced in the given pond and under conditions similar to present ones.

The sample of size N is then divided without selection into k + 1 subsamples, of which one is typically large compared to the others. We denote the number of fish in the large subsample by $n^*$ and the numbers of fish in the smaller subsamples by $n_j$, $j = 1, \ldots, k$, so that

$$ n^* + \sum_{j=1}^{k} n_j = N. $$

The weight ($s^*$) of the large subsample and the weights ($s_j$) of the smaller subsamples are assumed to have been measured accurately enough to be treated as known, as are the numbers of fish in the smaller subsamples, which are obtained by counting. As a consequence, only $n^*$ remains unknown and needs to be estimated. If sampling from the fish population is made without any form of size selection, samples of sizes $n^*, n_1, \ldots, n_k$ can be obtained as independent samples directly from the population, and in any order.
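The basic idea, dividing the weight of the large subsample by the mean individual weight observed in the counted subsamples, can be sketched as a point estimate. This is only the naive calculation, which carries no statement of uncertainty. The function name is illustrative, and the figures are those of the rearing-pond example given later in the paper.

```python
def naive_count(large_weight, subsample_weights, subsample_counts):
    """Point estimate of n*: large-subsample weight / mean individual weight."""
    mean_weight = sum(subsample_weights) / sum(subsample_counts)
    return large_weight / mean_weight


# Figures from the rearing-pond example (weights in grams).
sub_weights = [4200.0, 4300.0, 4500.0]
sub_counts = [108, 101, 115]

n_star_hat = naive_count(451360.0, sub_weights, sub_counts)
N_hat = n_star_hat + sum(sub_counts)  # add back the counted fish
print(round(n_star_hat), round(N_hat))
```

The result is a single number; the probability model developed in the following sections replaces it with a full distribution over the possible values of N.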

Probability Model

We begin by making the assumption that the weights ($w_i$; $i = 1, \ldots, N$) of individual fish are exchangeable for all values of N. This assumption means in particular that the joint predictive distribution of the weights, describing beliefs about the weights of the fish in the sample, is always the same regardless of which particular N fish are sampled and how they are ordered within the sample. According to the celebrated representation theorem of de Finetti, the assumption of exchangeability allows us to write the joint predictive distribution (density) in the form

$$ p(w_1, w_2, \ldots, w_N) = \int_f \left[ \prod_{i=1}^{N} f(w_i) \right] dQ(f) \tag{1} $$

where f is an unknown density function and Q(f) denotes a probability measure over all distribution functions. This can be interpreted as if we had N independent fish weights drawn from an unknown weight distribution f, which in turn is assigned a prior probability distribution Q. The operational interpretation of Q(f) is then “what we believe the empirical weight distribution would look like for a large sample” [5]. Our second assumption is a convention that we adopt to simplify the analysis: we restrict the set of possible weight distributions to parametric distribution families F for which the following holds: if the individual weights $w_i$ independently follow a fixed distribution f belonging to F, then arbitrary finite sums $\sum w_i$ also follow a distribution belonging to the same family. The Gamma and Normal families are known to satisfy this condition, and from now on we assume that F is one of these two families. Within the chosen family (Gamma or Normal), prior uncertainty about the weight distribution can be expressed by assigning a prior distribution to the two parameters of the family.
Here we use the mean $\mu$ and the standard deviation $\sigma$, but other parameterizations can be used as well. Then the joint predictive distribution of the individual weights can be written as

$$ p(w_1, w_2, \ldots, w_N) = \int_\mu \int_\sigma \left[ \prod_{i=1}^{N} f(w_i \mid \mu, \sigma) \right] p(\mu, \sigma) \, d\sigma \, d\mu \tag{2} $$

where $f(w_i \mid \mu, \sigma)$ is the Gamma or Normal density, and $p(\mu, \sigma)$ is the joint prior distribution of its parameters. Because the individual weights are conditionally independent given $\mu$ and $\sigma$, the conditional expected value and conditional standard deviation of the sample weight are $E(s_j \mid n_j, \mu, \sigma^2) = n_j \mu$ and $SD(s_j \mid n_j, \mu, \sigma^2) = \sqrt{n_j}\,\sigma$. This implies that the joint predictive distribution of the sample weights, given the sample sizes $n_1, \ldots, n_k$, can be written in the form

$$ p(s_1, \ldots, s_k, s^* \mid n_1, \ldots, n_k) = \int_\mu \int_\sigma \int_{n^*} \left[ \prod_{j=1}^{k} f(s_j \mid n_j \mu, \sqrt{n_j}\,\sigma) \right] f(s^* \mid n^* \mu, \sqrt{n^*}\,\sigma) \, p(\mu, \sigma, n^*) \, dn^* \, d\sigma \, d\mu \tag{3} $$
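The two moment identities used above, $E(s_j) = n_j \mu$ and $SD(s_j) = \sqrt{n_j}\,\sigma$, can be checked by simulation. The sketch below assumes Gamma-distributed individual weights and uses hypothetical values for n, $\mu$ and $\sigma$; the helper names are illustrative and not part of the original model code.

```python
import math
import random
import statistics


def gamma_shape_scale(mean, sd):
    """Convert a (mean, SD) specification to Gamma shape and scale."""
    return (mean / sd) ** 2, sd ** 2 / mean


def simulate_subsample_weights(n, mu, sigma, replicates, rng):
    """Total weight of n fish, repeated `replicates` times."""
    shape, scale = gamma_shape_scale(mu, sigma)
    return [
        sum(rng.gammavariate(shape, scale) for _ in range(n))
        for _ in range(replicates)
    ]


rng = random.Random(1)
n, mu, sigma = 100, 40.0, 10.0  # hypothetical subsample size, mean and SD (g)
totals = simulate_subsample_weights(n, mu, sigma, 5000, rng)

sim_mean = statistics.mean(totals)
sim_sd = statistics.stdev(totals)
print(sim_mean, n * mu)               # simulated vs. theoretical mean
print(sim_sd, math.sqrt(n) * sigma)   # simulated vs. theoretical SD
```

Because a sum of independent Gamma variables with a common scale is again Gamma distributed, the simulated totals also illustrate the closure property required of the family F.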

This model can also be specified by the following sequence of definitions:

$$ \begin{aligned} s_j \mid \mu, \sigma^2, n_j &\sim D(n_j \mu, \sqrt{n_j}\,\sigma), \quad j = 1, \ldots, k, \\ s^* \mid \mu, \sigma^2, n^* &\sim D(n^* \mu, \sqrt{n^*}\,\sigma), \\ n^* &\sim D(\cdot, \cdot), \\ \mu &\sim D(\cdot, \cdot), \\ \sigma &\sim D(\cdot, \cdot), \end{aligned} \tag{4} $$

where D(mean; standard deviation) in each case denotes a suitable probability distribution. The Normal distribution can be used for the sample weights if all $n_j$ are large, because the distribution of a sum of independent random variables approaches the Normal distribution as $n_j$ increases, regardless of the shape of the distribution of individual weights. It should be noted, however, that the Normal distribution also allows negative weights, which is not realistic. The prior distributions for $\mu$, $\sigma^2$ and $n^*$ can have any shape, as long as it is recognized that these quantities can take only positive values. If the sample weights ($s_1, \ldots, s_k, s^*$) are observed with non-negligible error, the model can be extended to account for measurement error by treating the true sample weights as unknown and by adding an extra layer to the model specification:

$$ \begin{aligned} m_j \mid s_j, v_j &\sim D(s_j, v_j), \quad j = 1, \ldots, k, \\ m^* \mid s^*, v^* &\sim D(s^*, v^*), \end{aligned} \tag{5} $$

where the observed weight measurements ($m_1, \ldots, m_k, m^*$) and the corresponding standard deviations ($v_1, \ldots, v_k, v^*$) are assumed to be known. The form of the measurement error distribution can in principle be chosen in any way that seems appropriate in the given context. Information about the shape of the distribution and about its standard deviation could come from expert judgement and/or from an independent study of the measurement error. Despite the simple model structure, the posterior distribution is analytically intractable, and approximation methods are needed for a numerical evaluation of the probabilities of interest. Our approach is to use Markov chain Monte Carlo (MCMC) simulation [6] to draw a large number of samples from the posterior distribution, and to use the corresponding sample averages as summaries of the posterior distribution. This task can be accomplished easily with general-purpose MCMC software such as WinBUGS [7].
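To illustrate how MCMC draws are turned into posterior summaries, the following sketch runs a random-walk Metropolis sampler for $n^*$ alone. It is a deliberately reduced version of the model, not the WinBUGS implementation used in the paper. It assumes that $\mu$ and $\sigma$ are known, uses the Normal form of $s^* \mid n^*$, places a flat prior on the positive integers, and ignores measurement error. All numerical values and function names are hypothetical.

```python
import math
import random
import statistics


def log_likelihood(n_star, s_star, mu, sigma):
    """Log density (up to a constant) of s* ~ Normal(n* mu, sqrt(n*) sigma)."""
    var = n_star * sigma ** 2
    return -0.5 * (math.log(var) + (s_star - n_star * mu) ** 2 / var)


def metropolis_n_star(s_star, mu, sigma, n_init, n_iter, max_step, rng):
    """Random-walk Metropolis over positive integers with a flat prior."""
    current = n_init
    current_ll = log_likelihood(current, s_star, mu, sigma)
    draws = []
    for _ in range(n_iter):
        # Symmetric integer proposal: step size 1..max_step, random sign.
        proposal = current + rng.randint(1, max_step) * rng.choice((-1, 1))
        if proposal >= 1:  # proposals outside the support are rejected
            proposal_ll = log_likelihood(proposal, s_star, mu, sigma)
            accept_prob = math.exp(min(0.0, proposal_ll - current_ll))
            if rng.random() < accept_prob:
                current, current_ll = proposal, proposal_ll
        draws.append(current)
    return draws


rng = random.Random(42)
s_star, mu, sigma = 400000.0, 40.0, 10.0  # hypothetical values (grams)
draws = metropolis_n_star(s_star, mu, sigma, round(s_star / mu), 20000, 50, rng)
kept = draws[2000:]  # discard burn-in

post_mean = statistics.mean(kept)
post_sd = statistics.stdev(kept)
print(post_mean, post_sd)
```

Sample averages of the retained draws (mean, standard deviation, or the proportion of draws exceeding a threshold) serve as summaries of the posterior distribution, as described above.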

Example: Number of fish in a rearing pond

Suppose that all N fish were captured from the rearing pond and moved to a tank. We begin by assuming that the weights $w_i$ of the fish in the rearing pond are a conditionally independent sample from a Gamma distribution characterized by an unknown mean $\mu$ and coefficient of variation $\delta = 100\,\sigma/\mu$. The sample of size N was then divided into four subsamples, of which one has unknown size $n^*$ and the others have known sizes $n_1 = 108$, $n_2 = 101$ and $n_3 = 115$. It is also assumed that the manufacturer of the scale has specified that the observed weights vary symmetrically around the true weight with a standard deviation of 10 g. Here we use a Normal distribution to describe the variation of the measurements ($m_j$, $m^*$) around the true value. Prior distributions for the model parameters N, $\mu$ and $\delta$ were obtained by interviewing an expert who is familiar with local aquacultural practices. He was told that the rearing pond had a bottom area of 50 m², was located in Northern Finland and contained two-year-old salmon smolts. We formalised his prior beliefs by the following prior distributions:

$$ \begin{aligned} u &\sim \mathrm{Gamma}(6840, 3520), \\ N &= 2000 - u, \\ \mu &\sim \mathrm{Gamma}(46.2, 7.51), \\ \delta &\sim \mathrm{Gamma}(24.13, 10). \end{aligned} \tag{6} $$

In addition to the above specification, parameter u was constrained to lie in the interval [0, 20 000] and parameter $\delta$ in the interval [5, 60]; that is, within these intervals the prior probability density is proportional to the densities specified above, and it is zero elsewhere. Parameter u was used as an auxiliary variable in order to obtain a left-skewed prior density for N. The rest of the model was specified by the equations

$$ \begin{aligned} m_j \mid s_j &\sim N(s_j, 10), \\ m^* \mid s^* &\sim N(s^*, 10), \\ s^* \mid \mu, \sigma^2, n^* &\sim \mathrm{Gamma}(n^* \mu, \sqrt{n^*}\,\delta\mu), \\ s_j \mid \mu, \sigma^2, n_j &\sim \mathrm{Gamma}(n_j \mu, \sqrt{n_j}\,\delta\mu), \\ n^* &= N - \sum_{j=1}^{3} n_j, \\ j &= 1, \ldots, 3. \end{aligned} \tag{7} $$
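The Gamma distributions in this paper are written as D(mean; standard deviation), whereas general-purpose MCMC software commonly expects a shape and a rate. A minimal sketch of the conversion is given below. It takes $\sigma = \delta\mu/100$ from the definition of the coefficient of variation given earlier; the function names are illustrative rather than taken from the original analysis.

```python
import math


def gamma_shape_rate(mean, sd):
    """Shape and rate of a Gamma distribution with the given mean and SD."""
    shape = (mean / sd) ** 2
    rate = mean / sd ** 2
    return shape, rate


def subsample_weight_shape_rate(n, mu, delta):
    """Shape and rate for the total weight of n fish.

    The total has mean n * mu and SD sqrt(n) * sigma,
    with sigma = delta * mu / 100 (delta is a percentage).
    """
    sigma = delta * mu / 100.0
    return gamma_shape_rate(n * mu, math.sqrt(n) * sigma)


# Prior for the mean weight: mean 46.2 g, SD 7.51 g.
mu_shape, mu_rate = gamma_shape_rate(46.2, 7.51)
print(mu_shape, mu_rate)

# Hypothetical parameter values for a subsample of 108 fish.
s_shape, s_rate = subsample_weight_shape_rate(108, 40.0, 25.0)
print(s_shape, s_rate)
```

Under this parameterization the shape of the subsample weight distribution equals $n_j (100/\delta)^2$, so it depends only on the subsample size and the coefficient of variation, while the rate carries the dependence on $\mu$.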

The observed weights of the samples were $m^*$ = 451 360 g, $m_1$ = 4200 g, $m_2$ = 4300 g and $m_3$ = 4500 g. Posterior distributions for $\mu$, $\delta$ and N were calculated using WinBUGS. The posterior distribution for N describes the uncertainty about the number of fish in the rearing pond (Figure 1a). The 95% probability interval (PI) of the number of fish in the tank is [11 130, 12 030], and the most probable number (the maximum a posteriori (MAP) estimate) is about 11 570. If the group of fish in the tank is meant to be a mandatory release group of at least 12 000 smolts, it might be of interest to calculate the probability that there are 12 000 fish or more in the tank. This can be calculated during the MCMC simulation or from the resulting posterior density. In this case the probability is 0.03, carrying the message that it is unlikely that the targeted release number has been reached. Nearly identical results were obtained by assuming that the individual weights are Normally distributed (PI = [11 090, 12 020], MAP = 11 550, $P(N \geq 12\,000 \mid \text{data}) = 0.03$) (Figure 1).
The posterior distribution of the mean weight ($\mu$) is also very informative compared to its prior distribution (Figure 1b). However, the posterior distribution of the coefficient of variation ($\delta$) has not been updated from its prior distribution as strongly as those of the other parameters (Figure 1c). This reflects the fact that only three subsamples were used for calibration, with the consequence that there is not much information about the variance of the weight of individual fish in this data set. It also emphasizes the importance of prior information in situations in which the data are sparse.

Figure 1: Posterior distributions obtained by using a Gamma model (solid line) and a Normal model (short dashed line), and prior distributions (dashed line), of (a) the number (N) of fish in the tank, (b) the mean weight ($\mu$), and (c) the coefficient of variation ($\delta$) of the weight, based on knowledge of the total weight of fish in the rearing pond and on information from three subsamples.
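The example above can be approximated without MCMC software by evaluating the posterior on a grid. The sketch below is a stand-in for the WinBUGS analysis, not a reproduction of it, and rests on several assumptions. It uses the Normal form of the model, so that the Normal measurement error can be integrated out analytically (the variance of each observed weight becomes $n\sigma^2 + 10^2$). It takes $\sigma = \delta\mu/100$. It also replaces the expert prior on N by a flat prior over the grid. Grid limits, step sizes and function names are illustrative choices, so the output should be expected to lie near the reported values rather than match them.

```python
import math

# Data from the example (weights in grams).
SUB_COUNTS = [108, 101, 115]
SUB_WEIGHTS = [4200.0, 4300.0, 4500.0]
LARGE_WEIGHT = 451360.0
MEAS_VAR = 10.0 ** 2  # measurement error variance (SD 10 g)


def gamma_logpdf(x, mean, sd):
    """Log density of a Gamma distribution given by its mean and SD."""
    shape, rate = (mean / sd) ** 2, mean / sd ** 2
    return (shape * math.log(rate) - math.lgamma(shape)
            + (shape - 1.0) * math.log(x) - rate * x)


def normal_logpdf(x, mean, var):
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def base_log_density(mu, delta):
    """Priors on mu and delta plus the three counted subsamples."""
    sigma2 = (delta * mu / 100.0) ** 2
    lp = gamma_logpdf(mu, 46.2, 7.51) + gamma_logpdf(delta, 24.13, 10.0)
    for n, m in zip(SUB_COUNTS, SUB_WEIGHTS):
        lp += normal_logpdf(m, n * mu, n * sigma2 + MEAS_VAR)
    return lp, sigma2


def large_term(n_total, mu, sigma2):
    n_star = n_total - sum(SUB_COUNTS)
    return normal_logpdf(LARGE_WEIGHT, n_star * mu, n_star * sigma2 + MEAS_VAR)


def grid_posterior_of_N(n_values, mu_values, delta_values):
    """Marginal posterior weights over n_values (flat prior on N)."""
    # Reference log density at a plug-in point, subtracted for stability.
    mu0 = sum(SUB_WEIGHTS) / sum(SUB_COUNTS)
    base0, sigma2_0 = base_log_density(mu0, 35.0)
    ref = base0 + large_term(round(LARGE_WEIGHT / mu0) + sum(SUB_COUNTS),
                             mu0, sigma2_0)
    weights = [0.0] * len(n_values)
    for delta in delta_values:
        for mu in mu_values:
            base, sigma2 = base_log_density(mu, delta)
            for i, n_total in enumerate(n_values):
                lp = base + large_term(n_total, mu, sigma2)
                weights[i] += math.exp(lp - ref)
    total = sum(weights)
    return [w / total for w in weights]


n_values = list(range(9500, 14001, 25))
mu_values = [34.0 + 0.1 * i for i in range(141)]
delta_values = [5.0 + 2.5 * i for i in range(23)]  # covers [5, 60]

post = grid_posterior_of_N(n_values, mu_values, delta_values)
post_mean = sum(n * p for n, p in zip(n_values, post))
prob_target = sum(p for n, p in zip(n_values, post) if n >= 12000)
print(post_mean, prob_target)
```

The quantity `prob_target` corresponds to the probability of reaching the release target of 12 000 smolts, computed here by summing grid weights instead of counting MCMC draws.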

Discussion

Why Bayesian?

The model presented in this paper endeavors to answer a simple question: “Given my past experience and samples obtained, what should I think about the number of individuals in the large sample?”. Based on the idea of using the concept of probability as a measure of personal degree of belief, the Bayesian approach is capable of answering such a question. All we need to do is to formalise in terms of probability what our past experience says about the distribution of the weight and about the number of individuals. The update of beliefs is obtained by applying the rules of probability calculus, and a quantitative answer to the original question is obtained in the form of the posterior distribution for the number of individuals in the large sample, providing an updated degree of belief in each possible value of the number of individuals. The frequentist approach, however, cannot provide a quantitative answer to the question. It is well known by statisticians, but not equally well appreciated by many applied scientists, that the frequentist approach deals only with the conditional distribution of observations given the parameter values. The question for which the frequentist approach does provide a formal answer could then be stated as: “Given my past experience and a conjectured number of individuals in the large sample, what kind of samples could I expect to see if I repeatedly sampled the population a very large number of times?”. This question is quite different from the direct question concerning the unknown correct number of individuals in the sample. However, these two questions are obviously related, as it makes sense to believe more in numbers under which data like those observed would occur frequently than in numbers under which the observed data would be rare according to the assumed sampling distribution.
Thus, the result of the frequentist analysis can be intuitively connected to the question of actual interest, but the idea of direct probabilistic inference about the unknown number of individuals is lost.
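The direct answer described above can be illustrated with a minimal sketch. It is not the model of this paper: purely for illustration, it assumes that the mean weight `mu` and the standard deviation `sigma` of an individual are known exactly, that the total weight given N is approximately Normal with mean N·mu and variance N·sigma², and that the prior for N is uniform over a grid. All function names and numbers are hypothetical.

```python
import math

def posterior_n(total_weight, mu, sigma, n_values, prior):
    """Grid posterior for N, assuming W | N ~ Normal(N*mu, N*sigma^2).

    mu and sigma are treated as known here (an illustrative simplification);
    every prior probability must be strictly positive.
    """
    log_post = []
    for n, p in zip(n_values, prior):
        mean = n * mu
        var = n * sigma ** 2
        # Log of the Normal density of the observed total weight given N = n
        log_lik = (-0.5 * math.log(2.0 * math.pi * var)
                   - (total_weight - mean) ** 2 / (2.0 * var))
        log_post.append(math.log(p) + log_lik)
    # Normalise on the log scale to avoid numerical underflow
    top = max(log_post)
    weights = [math.exp(lp - top) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical example: 10 kg of fish, individuals of about 10 g each
n_values = list(range(500, 1501))
prior = [1.0 / len(n_values)] * len(n_values)
post = posterior_n(10000.0, mu=10.0, sigma=3.0, n_values=n_values, prior=prior)

# Value of N with the highest posterior probability
map_n = n_values[post.index(max(post))]
print("MAP estimate of N:", map_n)
```

Each entry of `post` is a degree of belief in one candidate value of N, which is the kind of statement the frequentist sampling distribution does not supply. In the model of this paper the mean weight and the variation are themselves uncertain, so the resulting posterior for N would be wider than in this sketch.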

Why not compare to frequentist results?

The numerical values of the frequentist confidence intervals and point estimates may sometimes be close to those of corresponding Bayesian posterior probability intervals and MAP estimates. This does not mean, however, that the choice of the approach would then not matter [8]: despite the similar values, they are answers to different questions. The existence of such claims indicates that the results of one approach or the other have been misinterpreted. More commonly, the results of a frequentist analysis are interpreted as if they were the results of a Bayesian analysis [9]. For the above reasons we argue that direct comparison between the results of Bayesian and frequentist analyses is pointless. However, many scientific journals seem to insist on such a comparison when results of Bayesian models are presented, thus increasing the risk that the conceptual differences between the two approaches become completely confused.

Why bother to specify informative priors?

It might seem that the prior distributions of model parameters did not have much influence on the resulting inference about the unknown size of the large sample, because the prior distributions of the mean weight and the population size happened to be relatively flat compared to the resulting posterior distributions. However, the prior for the coefficient of variation has been important for the resulting posterior. The marginal likelihoods of the mean weight and the population size both depend on the information about the variation of the weight, and the observed data did not contain much information about that. With hindsight, one could claim that it would have been sufficient to elicit expert information only about the variation of the weight and to specify vague priors for the other parameters. In this case the conclusions about the population size would have been practically the same, but the key thing to note is that this applies only to this particular data set and initial information. In order to provide honest updating of knowledge, the prior distributions and the model structure should be specified to reflect the state of information before obtaining data. At that stage it is unknown what kind of data points will be observed, and thus it is unknown how much the posterior distribution will in the end depend on the prior opinion.

Vague or reference priors have often been suggested to make the Bayesian analysis objective and to let the data speak for themselves, or to represent initial lack of information. At least in the context of the problem dealt with in this paper, such ideas would lead to quite obscure situations. In any conceivable real application of the model presented here, the researcher using the model will know what items she or he is considering. Thus, depending on the details (species and age of animals, for example) given about the items and on her or his past experience about the items, there will be some information about the mean weight and the variation of the weight, as well as about the shape of the weight distribution. The fact that statistical inference about the number of individuals is required already indicates that it is thought to be so large that it is not worthwhile to try to count the items exactly. Would the inference about the number of individuals become independent of the researcher’s beliefs (objective) if she or he used vague reference priors as if pretending to know nothing about the number of individuals, their mean weight and the variation of the weight? Obviously not. The inference would then be dominated by the likelihood function, which is just a statement of her or his conditional prior beliefs about data given the parameter values and viewed as a function of parameters [10]. The role of this subjective assumption about the shape of the weight distribution becomes more and more important as the number of samples taken from the population increases because the likelihoods imposed by each data point are multiplied with each other. Thus, there is no way around subjectivity in this context, nor in the statistical analysis as a whole.

Further development

The Bayesian model presented here can be used as a building block in more complex Bayesian models. For example, a model which describes the survival, harvest and reproduction of reared fish would need this type of model structure for the estimation of the number of stocked fish, the number of fish caught and the number of eggs from gonad samples. It could also be plugged into a stochastic VPA [11, 12] to account for uncertainty about catches. When subsamples consist of only a single fish each, the inferences will generally be sensitive to the assumed shape of the weight distribution. If in doubt, one could consider extending the present model and applying nonparametrically defined weight distributions [13]. However, when each subsample contains a larger number of fish, the assumed shape no longer plays a major role. This is because, when the number of fish in a subsample increases, and regardless of the shape of the weight distribution, the distribution of the sum of the weights resembles more and more a Normal distribution. On the other hand, for right-skewed weight distributions and small sample sizes the Gamma distribution can be regarded as a safer choice. In our example the number of fish in each subsample was large enough to make the results robust to the choice between Gamma and Normal distributions. If all individuals were assumed to be of equal weight, and only measurement error was assumed to be present, then the problem could be seen as an estimation of a ratio parameter and methods proposed by Raftery & Schweder [14] could be used. Subsamples of different sizes can be used at the same time in the analysis. For example, individual weights and weights of subsamples consisting of hundreds of fish can be utilized jointly. Finally, prior distributions of model parameters can be given a hierarchical structure in order to transfer information between exchangeable units, like fish farms, rearing ponds, or spawners.
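The remark about subsample size can be made concrete with a small sketch. It assumes, for illustration only, that individual weights are independent and Gamma distributed with a common rate, in which case the total weight of n fish is again Gamma distributed with n times the shape, and its skewness is 2 divided by the square root of that shape. A Normal distribution has skewness zero, so the value printed below indicates how far the distribution of a subsample weight is from Normal. The function name and the coefficient of variation of 0.3 are hypothetical.

```python
import math

def gamma_sum_skewness(n_fish, cv):
    """Skewness of the total weight of n_fish independent Gamma weights.

    cv is the coefficient of variation (sd / mean) of a single weight.
    """
    shape_single = 1.0 / cv ** 2        # shape = mean^2 / sd^2 = 1 / cv^2
    shape_sum = n_fish * shape_single   # shapes add when the rate is shared
    return 2.0 / math.sqrt(shape_sum)   # skewness of a Gamma is 2 / sqrt(shape)

# Skewness shrinks in proportion to 1 / sqrt(n) as the subsample grows
for n in (1, 10, 100, 1000):
    print(n, round(gamma_sum_skewness(n, cv=0.3), 4))
```

Under these assumptions the skewness equals 2·cv/√n, so a subsample of a hundred fish is already ten times closer to symmetry than a single fish.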

Acknowledgements

This paper has been partly funded by EU projects “Creation of multiannual management plans for commitment” (project nr SSP8-CT-2003-502289) and “Operational evaluation tools for fisheries management options” (project nr SSP8-CT-2003-502516). The study was also supported by the Graduate School of Computational Biology, Bioinformatics and Biometry (ComBi) and by the centre of Population Genetic Analyses supported by the Academy of Finland (project nr 53297).

Parameterization of the Gamma distribution

The Gamma distribution is parameterized in this paper in terms of the mean and the standard deviation. The probability density function of a Gamma distributed variable x is

$$p(x \mid \alpha, \beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, x^{\alpha-1} e^{-\beta x}$$

$$\alpha = \frac{v^{2}}{w^{2}}$$

$$\beta = \frac{v}{w^{2}}$$

where $v$ denotes the mean and $w$ the standard deviation of $x$. With $\beta$ acting as a rate parameter, the mean is $\alpha/\beta = v$ and the variance is $\alpha/\beta^{2} = w^{2}$.
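The conversion from a mean and a standard deviation to the shape and rate of a Gamma density can be sketched as follows. The sketch relies only on the standard moment relations for a Gamma density with rate β (mean α/β, variance α/β²); the function names and the example values are hypothetical and are not taken from the paper.

```python
import math

def gamma_shape_rate(mean, sd):
    """Return (shape, rate) of the Gamma density with the given mean and sd."""
    shape = mean ** 2 / sd ** 2   # alpha
    rate = mean / sd ** 2         # beta
    return shape, rate

def gamma_pdf(x, shape, rate):
    """Gamma density at x > 0, computed on the log scale for stability."""
    log_p = (shape * math.log(rate) - math.lgamma(shape)
             + (shape - 1.0) * math.log(x) - rate * x)
    return math.exp(log_p)

# Hypothetical individual weight: mean 10 g, standard deviation 3 g
shape, rate = gamma_shape_rate(10.0, 3.0)
print("mean recovered:", shape / rate)
print("sd recovered:", math.sqrt(shape) / rate)
print("density at the mean:", gamma_pdf(10.0, shape, rate))
```

Software that parameterizes the Gamma distribution by shape and scale instead of shape and rate would need the reciprocal of `rate` as the scale.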

References

  1. Katia O (2001) Ornamental fish trade. INFOFISH International 3: 14-17.
  2. Saxena A (1994) Health coloration of fish. International Symposium on Aquatic Animal Health: Program and Abstracts. University of California, School of Veterinary Medicine, Davis, CA, USA, pp. 94.
  3. Torrissen OJ (1989) Pigmentation of salmonids: Interaction of astaxanthin and canthaxanthin on pigment deposition in rainbow trout. Aquaculture 79(1-4): 363-374.
  4. Withers PC (1992) Comparative Animal Physiology. Brook Cole-Tomson Learning. Saunders College Publishing/harcourt Brace Jovanovich College, USA, pp. 94.
  5. Goodwin TW (1951) Carotenoids in fish. In: The biochemistry of fish. Biochemical Society Symposia, USA.
  6. Hata M, Hata M (1973) Studies on astaxanthin formation in some freshwater fishes. Tohoku Journal of Agricultural Research 24(4): 192-196.
  7. Duncan PL, Lovell RT (1993) Natural and synthetic carotenoids enhance pigmentation of ornamental fish. Highlights of agricultural research, Alabama Agricultural Experiment Station 40: 8.
  8. Storebakken T, P Foss, K Schiedt, E Austreng, SL Jensen et al. (1987) Carotenoids in the diets for salmonids IV. Pigmentation of Atlantic salmon with astaxanthin, astaxanthin dipalmitate and canthaxanthin. Aquaculture 65(3-5): 279-292.
  9. Chatzifotis S, Pavlidis M, Jimeno CD, Vardanis G, Sterioti A, et al. (2005) The effect of different carotenoid sources on skin coloration of cultured red porgy (Pagrus pagrus). Aquaculture Research 36: 1517-1525.
  10. Dharmaraj S, Dhevendaran K (2011) Application of microbial carotenoids as a source of colouration and growth of ornamental fish Xiphophorus helleri. World Journal of Fish and Marine Sciences 3(2): 137-144.
  11. Ho ALFC, Zong S, Lin J (2014) Skin color retention after dietary carotenoid deprivation and dominance mediated skin coloration in clown anemonefish, Amphiprion ocellaris. AACL Bioflux 7(2): 103-115.
  12. Sinha A, OA Asimi (2007) China rose (Hibiscus rosa sinensis) petals: a potent natural carotenoid source for goldfish (Carassius auratus L). Aquaculture Research 38(11): 1123-1128.
  13. Theis A, Salzburger W, Egger B (2012) The function of anal fin egg-spots in the cichlid fish Astatotilapia burtoni. PloS ONE 7(1): e29878.
  14. National Research Council (NRC) (1993) Nutrient requirements of fish. National Academy Press, Washington DC, USA.
  15. Czeczuga B, Dabrowski K, Rosch R, Champinuelle A (1991) Carotenoids in fish. Carotenoids in Coregonus lavaretus L. Individuals of various populations, Acta Ichth. Piscat, 21(2): 3-16.
  16. Foss P, Storebakken T, Liaaen Jensen S. (1987) Carotenoids in diets. V. Pigmentation of rainbow trout and sea trout with astaxanthin. Aquaculture 65(3-4): 293-305.
  17. Ando S (1986) Studies on the food biochemical aspects of changes in chum Salmon, Oncorhynchus keta during spawning migration, mechanisms of muscle deterioration and nuptial coloration. Reprinted from Memoirs of the Faculty of Fisheries, Hokkaido University 33(1-2): 1-95.
  18. Bjerkeng B, Storebakken T, Liaaen-Jensen S. (1992) Pigmentation of rainbow trout from start feeding to sexual maturation, Aquaculture 108 (3-4): 333-436.
  19. Wozniak M (2000) Carotenoid contents in the body of rainbow trout Oncorhynchus mykiss, from different habitats. Fol Univ Agric Stetin 214 Piscaria 27: 215-220.
  20. Castenmiller JJM, West CE (1998) Bioavailability and bioconversion of carotenoids. Annu Rev Nutr 18: 19-38.
  21. Furr HC, Clark RM (1997) Intestinal absorption and tissue distribution of carotenoids. Nutritional Biochemistry 8(7): 364-377.
  22. Tyssandier V, Lyan B, Borel P (2001) Main factors governing the transfer of carotenoids from emulsion lipid droplets to micelles. Biochimica Biophysica Acta 1533(3): 285-292.
  23. Tanaka Y (1978) Comparative biochemical studies on carotenoids in aquatic animals. Mem Fac Fish 27(2): 355-422.
  24. Torrissen OJ (1986) Pigmentation of salmonids - a comparison of astaxanthin and canthaxanthin as pigment sources for rainbow trout. Aquaculture 53(3-4): 271-278.
  25. Al-Khalifa AS, Simpson KL (1988) Metabolism of astaxanthin in the rainbow trout (Salmo gairdneri). Comparative Biochemistry and Physiology 91(3): 563-568.
  26. Torrissen OJ (1989) Pigmentation of salmonids: Interaction of astaxanthin and canthaxanthin on pigment deposition in rainbow trout. Aquaculture 79(1-4): 363-374.
  27. White DA, Page GI, Swaile J, Moody AJ, Davies SJ (2002) Effect of esterification on the absorption of astaxanthin in rainbow trout, Oncorhynchus mykiss (Walbaum). Aquaculture Research 33: 343-350.
  28. March BE, Hajen WE, Deacon G, MacMillan C, Walsh MG (1990) Intestinal absorption of astaxanthin, plasma astaxanthin concentration, body weight, and metabolic rate as determination of flesh pigmentation in salmonids fish. Aquaculture 90(3-4): 313-322.
  29. Choubert G, Milicua JC, Gomez R (1994) The transport of astaxanthin in immature rainbow trout Oncorhynchus mykiss serum. Comparative Biochemistry and Physiology 108(2-3): 245-248.
  30. Parker RS (1996) Absorption, metabolism and transport of carotenoids. FASEB J 10(5): 542-551.
  31. Storebakken T, Hong KN (1992) Pigmentation of rainbow trout. Aquaculture 100(1-3): 209-229.
  32. Hardy RW, Torrissen OJ, Scott TM (1990) Absorption and distribution of C-labelled canthaxanthin in rainbow trout (Oncorhynchus mykiss). Aquaculture 87(3-4): 331-340.
  33. Aas GH, Bjerkeng B, Storebakken T, Ruyter B (1999) Blood appearance, metabolic transformation and plasma transport proteins of C-astaxanthin in Atlantic salmon (Salmo salar L.). Fish Physiology and Biochemistry 21(4): 325-334.
  34. Matsuno T, Tsushima M, Maoka T (2001) Salmoxanthin, deepoxy-salmoxanthin and 7,8-didehydrodeepoxy-salmoxanthin from salmon Oncorhynchus keta. J Nat Prod 64(4): 507-510.
  35. Gouveia L, Rema P, Pereira O, Empis J (2003) Colouring ornamental fish (Cyprinus carpio and Carassius auratus) with micro algal. Aquaculture Nutrition 9(2): 123-129.
  36. Ezhil J, Jeyanthi C, Narayanan M (2008) Effect of formulated pigmented feed on colour changes and growth of red swordtail, Xiphophorus helleri. Turkish Journal of Fisheries and Aquatic Sciences 8(1): 99-101.
  37. Schiedt K (1998) Absorption and metabolism of carotenoids in birds, fish and crustaceans. In: Carotenoids Biosynthesis and Metabolism. Britton GS & Pfander H (Eds.), Birkhäuser: Basel, Switzerland, pp. 285-358.
  38. Huyghebaert G (1993) The utilisation of oxy-carotenoids for egg yolk pigmentation. Thesis of the University of Gent (Belgium).
  39. Seemann M (1997) Eidotterpigmentierung: Unterschiede bei natürlichen und synthetischen Carotinoiden? DGS Magazin 49(36): 24-28.
  40. Grashorn MA, Steinberg W, Blanch A (2000) Effects of canthaxanthin and saponified capsanthin/capsorubin in layer diets on yolk pigmentation in fresh and boiled eggs. XXI World’s Poultry Congress, Canada, 20-24.
  41. Andrewes AG, Starr MP (1976) (3R, 3´R)-astaxanthin from the yeast Phaffia rhodozyma. Phytochemistry, 15(6): 1009-1011.
  42. Ako H, Tamaru CS, Asano L, Yamamoto (2000) Achieving natural coloration in fish finder culture. In: Spawning and maturation of aquaculture species, Proceeding of the 28th UNJR aquaculture panel symposium, Kihei, Hawaii. 10-12, U NJR Tech Rep, 28: 1-4.
©2016 Mäntyniemi et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and build upon your work non-commercially.