1. Introduction
The moment generating function (MGF) plays an important role in statistical distribution theory; its derivatives evaluated at zero yield the moments of the distribution under consideration. Information generating (IG) functions have been used similarly in information theory to generate some well-known information measures, such as Shannon entropy and Kullback–Leibler divergence.
The IG function of a probability model f was first introduced by Golomb [1]; its first derivative, evaluated at one, yields the Shannon entropy of that probability model.
Suppose the variable X has an absolutely continuous probability density function (PDF) f. Then, the IG function of density f, for any α > 0, is defined as

G_α(f) = ∫ f^α(x) dx,  (1)

when the integral is finite. In order to simplify the notation, we do not display the limits of integration in integrals with respect to x throughout the article, unless a distinction needs to be made. The following properties of G_α(f) in (1) have been stated in Golomb [1]:

G_1(f) = 1  and  (d/dα) G_α(f) |_{α=1} = ∫ f(x) log f(x) dx = −H(f),  (2)

where H(f) is the Shannon entropy defined as

H(f) = −∫ f(x) log f(x) dx.
In particular, when α = 2, the IG measure is simply G_2(f) = ∫ f²(x) dx, known as the informational energy (IE) function. The IG function and its extensions have been used extensively in chemistry and physics to discuss the atomic structure of a given phenomenon or system; for more details, one may see López-Ruiz et al. [2]. In addition, the IG function, known as the entropic moment in the chemistry and physics literature, plays a key role in chaos theory and non-extensive thermodynamics. Note that the IG function is closely linked to Tsallis and Rényi entropies. The entropic moment measure, as well as the information entropy, reflects the degree of spread of a probabilistic model; see Bercher [3].
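As an illustrative numerical sketch (our own example, not part of the original paper), the IG function in (1) can be checked by quadrature for an assumed exponential parent, for which G_α(f) = λ^(α−1)/α in closed form, so that G_2(f) = λ/2 is the informational energy and the derivative at α = 1 recovers −H(f) = log λ − 1:

```python
import numpy as np

# Numerical sketch (our illustration): for the exponential density
# f(x) = lam * exp(-lam * x), the IG function is G_alpha(f) = lam**(alpha-1)/alpha,
# so G_2(f) = lam/2 (informational energy) and G'_alpha(f) at alpha = 1 equals
# -H(f) = log(lam) - 1.
lam = 2.0
x = np.linspace(0.0, 40.0, 400_001)
f = lam * np.exp(-lam * x)

def trapezoid(y):
    # composite trapezoidal rule on the grid x
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def ig(alpha):
    """IG function G_alpha(f) = integral of f(x)**alpha, by quadrature."""
    return trapezoid(f ** alpha)

ie = ig(2.0)                                                  # should be lam / 2 = 1.0
h = 1e-5
entropy_from_ig = -(ig(1.0 + h) - ig(1.0 - h)) / (2.0 * h)    # should be H(f) = 1 - log(lam)
print(ie, entropy_from_ig)
```

The central difference at α = 1 mirrors property (2): the slope of the IG function at one is the negative Shannon entropy.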
Recently, Clark [
4] has presented an analogous IG function for stochastic processes to assist in the derivation of information measures for point processes.
Guiasu and Reischer [5] proposed the relative information generating (RIG) function between two density functions, whose first derivative, evaluated at 1, yields the Kullback–Leibler (KL) divergence (Kullback and Leibler [6]).
Suppose the variables X and Y have absolutely continuous density functions f and g, respectively. Then, the RIG function, for any α > 0, is defined as

R_α(f, g) = ∫ g(x) (f(x)/g(x))^α dx = ∫ f^α(x) g^{1−α}(x) dx,  (3)

when the integral is finite. The KL divergence is then obtained from its first derivative as

KL(f ‖ g) = (d/dα) R_α(f, g) |_{α=1} = ∫ f(x) log( f(x)/g(x) ) dx.  (4)
One may refer to Clark [4] and Mares et al. [7] for some discussions on the usefulness and applications of the RIG function.
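The derivative relationship between the RIG function and KL divergence can be sketched numerically (our own example, with two assumed exponential densities, for which R_α(f, g) = a^α b^(1−α)/(αa + (1−α)b) and KL(f‖g) = log(a/b) + b/a − 1 in closed form):

```python
import numpy as np

# Numerical sketch (our illustration): for f = Exp(a) and g = Exp(b), the RIG
# function is R_alpha(f,g) = a**alpha * b**(1-alpha) / (alpha*a + (1-alpha)*b),
# and its derivative at alpha = 1 is KL(f||g) = log(a/b) + b/a - 1.
a, b = 3.0, 1.0
x = np.linspace(0.0, 60.0, 600_001)
f = a * np.exp(-a * x)
g = b * np.exp(-b * x)

def trapezoid(y):
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def rig(alpha):
    """RIG measure R_alpha(f, g) = integral of f**alpha * g**(1-alpha)."""
    return trapezoid(f ** alpha * g ** (1.0 - alpha))

h = 1e-5
deriv = (rig(1.0 + h) - rig(1.0 - h)) / (2.0 * h)   # numerical R'_alpha at alpha = 1
kl = np.log(a / b) + b / a - 1.0                     # closed-form KL divergence
print(deriv, kl)
```

As expected from (4), the numerical derivative of R_α at α = 1 matches the KL divergence.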
The main objective of this paper is to study the IG and RIG information measures associated with ranked set sampling (RSS) schemes. The analysis of information content in various sampling strategies is of great importance in sampling theory, and information theory provides a natural framework for quantifying the information content of a source with a probabilistic structure under different sampling strategies. Among the various strategies discussed in sampling theory, we focus here on some well-known strategies that are known to be efficient. A cost-effective survey sampling method, known as ranked set sampling (RSS), was first introduced by McIntyre [8]. He introduced RSS specifically to estimate the mean of a population based on a given simple random sample (SRS) of size n, and observed that the estimator based on RSS is unbiased with a smaller variance than the mean of a SRS. The RSS and some of its generalizations have been discussed rather extensively in the literature. For example, Frey [9]; Park and Lim [10]; and Chen, Bai, and Sinha [11] have all discussed the information content of RSS based on Fisher information, while Tahmasebi et al. [12] have studied the Tsallis entropy based on the maximum RSS scheme. Therefore, considering the importance of this issue and the connection between information theory and ranked set sampling theory, a systematic study of the IG function, as a generator of some well-known information measures, within the RSS framework seems necessary. This forms the primary motivation for the present study.
We now briefly introduce the SRS and RSS strategies that will be used in the sequel. Let X be an absolutely continuous random variable with PDF f. Then, a SRS of size n, derived from the random variable X, is denoted by X_SRS = (X_1, …, X_n). Further, suppose a random sample of size n² is selected and is randomly divided into n groups of equal size n. Then, a one-cycle RSS is observed in the following manner: in group i, the n units are ranked and only the i-th smallest unit is measured, for i = 1, …, n. As we see from this representation, the recorded observation in the i-th group of size n corresponds to the i-th order statistic. Thus, the RSS vector of observations is given by X_RSS = (X_(1:n), …, X_(n:n)), where X_(i:n) is the i-th order statistic based on a given SRS of size n with PDF f and cumulative distribution function (CDF) F. Then, the PDF of X_(i:n) is known to be

f_{i:n}(x) = [ n! / ((i−1)! (n−i)!) ] F^{i−1}(x) F̄^{n−i}(x) f(x).  (5)
Here, f_{i:n} corresponds to the i-th order statistic: with that order statistic taking the value x, there will be i−1 observations less than x, each with probability F(x), and n−i observations greater than x, each with probability F̄(x) = 1 − F(x). For pertinent details, one may refer to the authoritative book on this subject by Arnold et al. [13].
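The one-cycle RSS scheme described above can be sketched in a short simulation (our own illustration, with an assumed Uniform(0,1) parent, for which E[X_(i:n)] = i/(n+1) gives a convenient sanity check):

```python
import numpy as np

# Simulation sketch (our illustration): one-cycle RSS drawn from n**2
# Uniform(0,1) units split into n groups of size n; in group i only the
# i-th order statistic is recorded.  For the uniform parent,
# E[X_(i:n)] = i / (n + 1).
rng = np.random.default_rng(7)
n, cycles = 4, 20_000

groups = rng.uniform(size=(cycles, n, n))      # 'cycles' independent one-cycle draws
groups.sort(axis=2)                             # rank the units within each group
rss = groups[:, np.arange(n), np.arange(n)]     # record the i-th order statistic of group i

means = rss.mean(axis=0)
print(means)    # close to [0.2, 0.4, 0.6, 0.8]
```

Note that the n recorded observations come from disjoint groups and are therefore independent, though not identically distributed.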
Maximum and minimum ranked set sampling schemes are two useful modifications of the ranked set sampling procedure. A maximum RSS is given by X_MaxRSS = (X_(1:1), X_(2:2), …, X_(n:n)), where X_(i:i) is the largest order statistic based on a SRS of size i from f. Similarly, a minimum RSS is given by X_MinRSS = (X_(1:1), X_(1:2), …, X_(1:n)), where X_(1:i) is the smallest order statistic based on a SRS of size i from f. From (5), the PDF of X_(1:i) is given by

f_{1:i}(x) = i F̄^{i−1}(x) f(x),  (6)

where F̄ = 1 − F is the survival function of X. Similarly, the PDF of X_(i:i) is given by

f_{i:i}(x) = i F^{i−1}(x) f(x).  (7)

The corresponding CDFs of (6) and (7) are given by F_{1:i}(x) = 1 − F̄^i(x) and F_{i:i}(x) = F^i(x), respectively.
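These two CDFs can be verified empirically (our own sketch, again with an assumed Uniform(0,1) parent, so that the minimum of a SRS of size i has CDF 1 − (1 − t)^i and the maximum has CDF t^i):

```python
import numpy as np

# Simulation sketch (our illustration): for a SRS of size i from Uniform(0,1),
# the sample minimum has CDF 1 - (1 - t)**i and the sample maximum has CDF
# t**i, matching the CDFs corresponding to the densities in (6) and (7).
rng = np.random.default_rng(1)
i, reps, t = 5, 400_000, 0.3
u = rng.uniform(size=(reps, i))

emp_min = (u.min(axis=1) <= t).mean()   # should be near 1 - 0.7**5 = 0.83193
emp_max = (u.max(axis=1) <= t).mean()   # should be near 0.3**5 = 0.00243
print(emp_min, emp_max)
```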
The purpose of this work is twofold. The first part derives IG measures for the SRS and RSS, especially in the maximum and minimum RSS frameworks, and provides some comparison results for the IG measures of these observations based on dispersive stochastic ordering. In the second part, we further study the RIG divergence measure between SRS and RSS, and specifically the RIG divergence measure between the minimum and maximum RSS procedures.
The rest of this paper is organized as follows. In Section 2, we consider the information generating function and establish some results for the SRS and RSS procedures. We show that the IG measures of SRS and RSS can be expressed in terms of different orders of fractional Shannon entropy. Moreover, we examine the monotonicity properties of the IG measure for the minimum and maximum RSS vectors based on a sample of size n, under a mild condition. In Section 3, we discuss the comparison of information generating functions for the SRS and RSS frameworks in terms of dispersive stochastic ordering. Next, in Section 4, we study the RIG measures for the SRS, RSS, minimum RSS and maximum RSS vectors. Finally, we make some concluding remarks in Section 5.
4. RIG Divergence Measure Based on RSS Scheme
Let X_SRS = (X_1, …, X_n) denote a SRS of size n from a population with probability density function (PDF) f and cumulative distribution function (CDF) F. Further, let X_RSS, X_MinRSS and X_MaxRSS be the corresponding RSS, minimum RSS and maximum RSS vectors, respectively. We now consider the RIG measure between the variable X and each of the vectors X_MinRSS and X_MaxRSS. From the definition of the RIG measure in (3), the RIG divergence between X_(1:i), with density in (6), and X is given by

R_α(f_{1:i}, f) = ∫ (i F̄^{i−1}(x) f(x))^α f^{1−α}(x) dx = i^α ∫ F̄^{α(i−1)}(x) f(x) dx = i^α / (α(i−1) + 1).

Similarly, the RIG divergence between X_(i:i), with density in (7), and X is given by

R_α(f_{i:i}, f) = i^α ∫ F^{α(i−1)}(x) f(x) dx = i^α / (α(i−1) + 1).

It is evident from the above results that R_α(f_{1:i}, f) = R_α(f_{i:i}, f) = i^α / (α(i−1) + 1), which is free of the underlying distribution F.
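The distribution-free nature of this RIG divergence can be checked numerically (our own sketch): the integral ∫ (i F̄^(i−1) f)^α f^(1−α) dx reduces, under the substitution u = F̄(x), to i^α/(α(i−1)+1) for any parent, which we verify below for two assumed parents, Uniform(0,1) and Exp(1):

```python
import numpy as np

# Numerical sketch (our illustration): the RIG divergence between the i-th
# minimum-RSS observation (density i * Fbar**(i-1) * f) and X equals
# i**alpha / (alpha*(i-1) + 1) for ANY parent distribution.  We check this
# for Uniform(0,1) and Exp(1).
def trapezoid(y, x):
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

i, alpha = 4, 0.7
closed = i ** alpha / (alpha * (i - 1) + 1.0)

# Uniform(0,1): f = 1, Fbar(x) = 1 - x
xu = np.linspace(0.0, 1.0, 200_001)
ru = trapezoid((i * (1 - xu) ** (i - 1)) ** alpha, xu)   # f**(1-alpha) = 1 here

# Exp(1): f(x) = exp(-x), Fbar(x) = exp(-x)
xe = np.linspace(0.0, 50.0, 500_001)
fe = np.exp(-xe)
re = trapezoid((i * fe ** (i - 1) * fe) ** alpha * fe ** (1 - alpha), xe)

print(ru, re, closed)   # all three values agree
```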
Theorem 8. Consider the vectors X_MinRSS and X_MaxRSS from density function f. Then, we have:
- (i) R_α(f_MinRSS, f_MaxRSS) = ∏_{i=1}^n i B((1−α)(i−1)+1, α(i−1)+1);
- (ii) R_α(f_MaxRSS, f_MinRSS) = ∏_{i=1}^n i B(α(i−1)+1, (1−α)(i−1)+1),
where B(a, b) = ∫_0^1 t^{a−1} (1−t)^{b−1} dt is the complete beta function, provided the integrals involved are finite.
Proof. From the definition of RIG divergence between the vectors X_MinRSS and X_MaxRSS, and the independence of their components, we find

R_α(f_MinRSS, f_MaxRSS) = ∏_{i=1}^n ∫ (i F̄^{i−1}(x) f(x))^α (i F^{i−1}(x) f(x))^{1−α} dx = ∏_{i=1}^n i ∫_0^1 (1−u)^{α(i−1)} u^{(1−α)(i−1)} du = ∏_{i=1}^n i B((1−α)(i−1)+1, α(i−1)+1),

upon setting u = F(x), which proves Part (i). Part (ii) can be proved in an analogous manner. □
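The product-of-beta closed form can be sanity-checked numerically (our own sketch; the identity below is derived directly from the component densities i F̄^(i−1) f and i F^(i−1) f with a Uniform(0,1) parent, taken here as an assumed example):

```python
import math
import numpy as np

# Numerical sketch (our illustration) for Uniform(0,1), where F(u) = u and
# Fbar(u) = 1 - u: the product of integrals of
# [i*(1-u)**(i-1)]**alpha * [i*u**(i-1)]**(1-alpha)
# over u in (0,1) equals prod_i i * B((1-alpha)*(i-1)+1, alpha*(i-1)+1);
# symmetry of the beta function makes the measure symmetric in the two vectors.
def beta(a, b):
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))

n, alpha = 4, 0.6
u = np.linspace(0.0, 1.0, 200_001)

direct = 1.0
for i in range(1, n + 1):
    y = (i * (1 - u) ** (i - 1)) ** alpha * (i * u ** (i - 1)) ** (1 - alpha)
    direct *= float(np.sum((y[1:] + y[:-1]) * np.diff(u)) / 2.0)

closed = math.prod(i * beta((1 - alpha) * (i - 1) + 1, alpha * (i - 1) + 1)
                   for i in range(1, n + 1))
print(direct, closed)   # the two values agree
```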
Using the results of Theorem 8, we have plotted the RIG measure between the vectors X_MinRSS and X_MaxRSS, for some selected choices of α and sample size n, in Figure 2. From Figure 2, it is easy to observe that for α < 1, the RIG divergence measure between X_MinRSS and X_MaxRSS is decreasing with respect to sample size n (Panels (a) and (b)), while for α > 1, the considered RIG measure is increasing with respect to sample size n (Panels (c) and (d)). Therefore, for α < 1, the similarity between the density functions of the considered sampling vectors X_MinRSS and X_MaxRSS gets increased. For α > 1, the result is the opposite, i.e., the similarity between the two sampling vectors gets decreased.
Theorem 9. Consider the vectors X_RSS and X_SRS from density function f. Then, we have:
- (i) R_α(f_RSS, f_SRS) = ∏_{i=1}^n c_i^α B(α(i−1)+1, α(n−i)+1);
- (ii) R_α(f_SRS, f_RSS) = ∏_{i=1}^n c_i^{1−α} B((1−α)(i−1)+1, (1−α)(n−i)+1),
where c_i = n! / ((i−1)! (n−i)!).
Proof. From the definition of the RIG measure between the vectors X_RSS and X_SRS, we have

R_α(f_RSS, f_SRS) = ∏_{i=1}^n ∫ (c_i F^{i−1}(x) F̄^{n−i}(x) f(x))^α f^{1−α}(x) dx = ∏_{i=1}^n c_i^α ∫_0^1 u^{α(i−1)} (1−u)^{α(n−i)} du = ∏_{i=1}^n c_i^α B(α(i−1)+1, α(n−i)+1),

upon setting u = F(x), which proves Part (i). Part (ii) can be proved in a similar manner. □
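This per-component computation can be checked numerically as well (our own sketch, derived directly from the order-statistic density (5) with a Uniform(0,1) parent, taken as an assumed example; for the uniform parent the factor f^(1−α) is identically one):

```python
import math
import numpy as np

# Numerical sketch (our illustration) for Uniform(0,1), where
# f_{i:n}(u) = c_i * u**(i-1) * (1-u)**(n-i) with c_i = n!/((i-1)!(n-i)!):
# the integral of f_{i:n}**alpha over (0,1) equals
# c_i**alpha * B(alpha*(i-1)+1, alpha*(n-i)+1) for each component i.
def beta(a, b):
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))

n, alpha = 5, 0.8
u = np.linspace(0.0, 1.0, 200_001)

pairs = []
for i in range(1, n + 1):
    c = math.factorial(n) / (math.factorial(i - 1) * math.factorial(n - i))
    y = (c * u ** (i - 1) * (1 - u) ** (n - i)) ** alpha
    direct = float(np.sum((y[1:] + y[:-1]) * np.diff(u)) / 2.0)
    closed = c ** alpha * beta(alpha * (i - 1) + 1, alpha * (n - i) + 1)
    pairs.append((direct, closed))
    print(i, direct, closed)   # each pair agrees
```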
We have plotted the results of Theorem 9 in Figure 3 and Figure 4 for some choices of α. From these figures, we observe that for α < 1, both RIG measures in Theorem 9 are decreasing with respect to sample size n. Therefore, the similarity between the density functions of the two considered sampling vectors gets increased with increasing sample size n.