Research Article Volume 9 Issue 6
Doctoral Student in Design, Pontifícia Universidade Católica do Rio de Janeiro (PUC-Rio), Brazil
Correspondence: Evânia de Paula Muniz, Doctoral Student in Design, Pontifícia Universidade Católica do Rio de Janeiro (PUC-Rio), Rua Marquês de São Vicente, 225, Gávea, Rio de Janeiro – RJ, Brazil, Tel (21) 92017-1476
Received: November 28, 2024 | Published: December 10, 2024
Citation: Muniz EP. From garbage a flower is born. MOJ Eco Environ Sci. 2024;9(6):272-276. DOI: 10.15406/mojes.2024.09.00335
We introduce a Bayesian probability model for making inferences about the unknown number of individuals in a sample, based on the known sample weight and on information provided by subsamples with known weights and corresponding counts. In keeping with the Bayesian approach, the model allows the incorporation of prior information that is often available about the sample size and other uncertain parameter values. As a result, the model provides an estimate of the number of individuals in the sample in the form of a posterior probability distribution that combines the prior information with the interpretation of the observed data. Such a result cannot be obtained using the frequentist approach. The model presented here can be applied to a wide range of similar problems. Our main focus here is stock assessment, where the task is the conversion of the catch weight into the number of individuals in the catch. The model is easy to use owing to the availability of general-purpose MCMC simulation software, and it can be used either in a standalone fashion or embedded into more complex probability models.
Keywords: catch samples, Markov chain Monte Carlo (MCMC), sample size, weight distribution
Counting the number of individuals in a large sample can be very laborious or impractical. Instead of exactly counting all individuals, the size of a large sample can be estimated by weighing it and then using information about the mean weight of individuals obtained from smaller samples. Estimation approaches based on this general idea have a number of natural applications in the aquatic sciences. For instance, the number of fish in a commercial catch is often estimated in this way in order to obtain data suitable for typical stock assessment methods.1,2 Further, the approach is apparently in common use for estimating the number of fish raised in hatcheries for subsequent stocking.3 Besides estimating the number of fish, the approach has been applied, for example, to estimating the number of eggs in fish gonads.4 However, estimates obtained this way will always involve an element of uncertainty unless all individuals in the sampled population are of exactly equal weight and measurement error is negligible. For a systematic analysis, this uncertainty should be carried into all further considerations that use these estimates. In fisheries science such uncertainty has often been neglected, or frequentist methods have been applied. Unfortunately, frequentist methods cannot provide direct probability statements about unknown parameters, even though the results of a frequentist analysis are often phrased in a way that invites such a misinterpretation.
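To make the weighing-and-subsampling idea concrete, the following is a minimal sketch of the naive point estimate; the catch and subsample figures are hypothetical illustrations, not data from this paper.

```python
# Naive (non-Bayesian) point estimate of the number of fish in a catch,
# using only the total catch weight and a counted, weighed subsample.
# All numbers are hypothetical illustrations.

catch_weight_kg = 1250.0      # total weight of the catch
subsample_count = 200         # fish counted in the subsample
subsample_weight_kg = 50.0    # weight of the counted subsample

mean_fish_weight_kg = subsample_weight_kg / subsample_count   # 0.25 kg per fish
point_estimate_n = catch_weight_kg / mean_fish_weight_kg      # 5000 fish

print(f"Point estimate of catch size: {point_estimate_n:.0f} fish")
```

Such a point estimate carries no statement of uncertainty, which is exactly the gap the Bayesian model described below is meant to fill.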
A sample containing N fish can be obtained from a fish population either by simple random sampling or by some form of selective sampling. In a typical case N is large, but the theory presented here holds in principle for any positive integer value of N. On the other hand, the population from which the sample is drawn is assumed to be infinitely large. For example, the fish living in a particular rearing pond are seen as a sample from the potentially infinite population of similar fish that could be produced in that pond under conditions similar to the present ones.
We begin by making the assumption that the weights $(w_i;\ i=1,\dots,N)$ of individual fish are exchangeable for all values of N. This assumption means in particular that the joint predictive distribution of the weights, describing beliefs about the weights of the fish in the sample, is always the same regardless of which particular N fish were sampled and of how they are ordered within the sample. According to the celebrated representation theorem of de Finetti, the assumption of exchangeability allows us to write the joint predictive distribution (density) in the form given below.
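In standard notation (which may differ in detail from the paper's own parameterization), de Finetti's representation expresses the joint predictive density of exchangeable weights as a mixture of independent and identically distributed densities:

\[
p(w_1, \dots, w_N) \;=\; \int \prod_{i=1}^{N} f(w_i \mid \theta)\, dQ(\theta),
\]

where $f(\cdot \mid \theta)$ is a parametric density for an individual weight (in this paper a Gamma density) and $Q$ is a prior (mixing) distribution over the parameter vector $\theta$.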
Why Bayesian?
The model presented in this paper endeavors to answer a simple question: “Given my past experience and the samples obtained, what should I think about the number of individuals in the large sample?” Because it uses probability as a measure of personal degree of belief, the Bayesian approach is capable of answering such a question. All we need to do is formalise, in terms of probability, what our past experience says about the distribution of the weight and about the number of individuals. The update of beliefs is obtained by applying the rules of probability calculus, and a quantitative answer to the original question is obtained in the form of the posterior distribution for the number of individuals in the large sample, which provides an updated degree of belief in each possible value. The frequentist approach, however, cannot provide a quantitative answer to the question. It is well known to statisticians, but not equally well appreciated by many applied scientists, that the frequentist approach deals only with the conditional distribution of the observations given the parameter values. The question to which the frequentist approach does provide a formal answer could then be stated as: “Given my past experience and a conjectured number of individuals in the large sample, what kinds of samples could I expect to see if I repeatedly sampled the population a very large number of times?” This question is quite different from the direct question concerning the unknown true number of individuals in the sample. The two questions are obviously related, however, as it makes sense to believe more in numbers that would make data like those observed appear frequently than in numbers that would make the observed data look rare under the assumed sampling distribution. Thus, the result of a frequentist analysis can be intuitively connected to the question of actual interest, but the idea of direct probabilistic inference about the unknown number of individuals is lost.
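As a concrete illustration of this kind of direct inference, the following is a minimal sketch under simplifying assumptions of my own rather than the paper's full model: individual weights are taken as Gamma distributed with shape and scale fixed at values estimated from the subsample, and a discrete uniform prior is placed on the number of individuals N. The full model instead propagates the uncertainty about the weight distribution through MCMC. All data values are hypothetical.

```python
# Minimal sketch of direct Bayesian inference about the number of individuals N,
# under simplifying assumptions (fixed Gamma weight parameters estimated from the
# subsample, discrete uniform prior on N).  Hypothetical data, not from the paper.
import numpy as np
from scipy import stats

total_weight = 1250.0                                   # observed weight of the large sample (kg)
rng = np.random.default_rng(1)
sub_weights = rng.gamma(6.25, 0.04, size=200)           # simulated subsample of individual weights

mean_w, sd_w = sub_weights.mean(), sub_weights.std(ddof=1)
shape = (mean_w / sd_w) ** 2                            # Gamma shape from mean and sd
scale = sd_w ** 2 / mean_w                              # Gamma scale from mean and sd

# The sum of N independent Gamma(shape, scale) weights is Gamma(N * shape, scale),
# so each candidate N implies a sampling distribution for the observed total weight.
candidate_N = np.arange(4000, 6001)                     # support of the prior on N
prior = np.full(candidate_N.size, 1.0 / candidate_N.size)
likelihood = stats.gamma.pdf(total_weight, a=candidate_N * shape, scale=scale)
posterior = prior * likelihood
posterior /= posterior.sum()

print("Posterior mean of N:", (candidate_N * posterior).sum())
print("Approximate 95% credible interval:",
      candidate_N[np.searchsorted(posterior.cumsum(), [0.025, 0.975])])
```

The key modelling fact used here is the closure of the Gamma family under summation with a common scale; embedding the same structure in a general-purpose MCMC model, as the paper proposes, would additionally let the weight-distribution parameters remain uncertain.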
The Gamma distribution is parameterized in this paper in terms of the mean and the standard deviation. The probability density function of a Gamma-distributed variable x can be written as follows.
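In standard shape-and-scale notation (the paper's own symbols may differ), with shape $k = \mu^{2}/\sigma^{2}$ and scale $\theta = \sigma^{2}/\mu$ obtained from the mean $\mu$ and standard deviation $\sigma$, the density is

\[
f(x \mid \mu, \sigma) \;=\; \frac{x^{\,k-1}\, e^{-x/\theta}}{\Gamma(k)\, \theta^{\,k}}, \qquad x > 0,
\]

which has mean $k\theta = \mu$ and variance $k\theta^{2} = \sigma^{2}$.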
Acknowledgments
None.
Conflicts of interest
None.
©2024 Muniz. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and building upon the work non-commercially.