American Journal of Theoretical and Applied Statistics
Volume 5, Issue 2-1, March 2016, Pages: 40-48

A Novel Approach to Finding Sampling Distributions for Truncated Laws Via Unbiasedness Equivalence Principle

Nicholas A. Nechval1, *, Sergey Prisyazhnyuk2, Vladimir F. Strelchonok1

1Department of Mathematics, Baltic International Academy, Riga, Latvia

2Department of Geoinformation Systems, National Research University of Information Technologies, Mechanics and Optics, St-Petersburg, Russia


Nicholas A. Nechval, Sergey Prisyazhnyuk, Vladimir F. Strelchonok. A Novel Approach to Finding Sampling Distributions for Truncated Laws Via Unbiasedness Equivalence Principle. American Journal of Theoretical and Applied Statistics. Special Issue: Novel Ideas for Efficient Optimization of Statistical Decisions and Predictive Inferences under Parametric Uncertainty of Underlying Models with Applications. Vol. 5, No. 2-1, 2016, pp. 40-48. doi: 10.11648/j.ajtas.s.2016050201.16

Abstract: Truncated distributions arise naturally in many practical situations. This paper considers the problem of finding sampling distributions for truncated laws, a problem that concerns the very important area of information processing in industrial engineering. It remains perhaps one of the most difficult and important problems of mathematical statistics, requiring considerable effort and skill to investigate. In a given problem, most practitioners would prefer to find the sampling distribution for a truncated law by the simplest method available. For many situations encountered in textbooks and in the literature, the approach discussed here is simple and straightforward. It is based on the unbiasedness equivalence principle (UEP), a new idea that often provides a neat method for finding sampling distributions for truncated laws. It avoids explicit integration over the sample space and the attendant Jacobian, but at the expense of verifying completeness of the recognized family of densities. Fortunately, general results on completeness obviate the need for this verification in many problems involving exponential families. The proposed approach obtains results for truncated laws from the results already obtained for non-truncated laws, and it is much simpler than the known approaches. In many situations it allows one to find the results for truncated laws with known truncation points and to estimate system reliability in a simple way. The approach can also be used to find the sampling distribution for a truncated law when some or all of its truncation parameters are left unspecified. Illustrative examples are given.

Keywords: Truncated Law, Unbiasedness Equivalence Principle, Sampling Distribution, Reliability Estimation

1. Introduction

A probability distribution for a random variable X is said to be truncated when some set of values in the range of X is excluded. Truncated distributions (left-truncated, right-truncated, or doubly truncated) have found many applications, particularly in industrial settings [1-8]. Final products are often subject to screening inspection before being sent to the customer. The usual practice is that if a product’s performance falls within certain tolerance limits, it is judged conforming and sent to the customer. If it fails, the product is rejected and then scrapped or reworked. In this case, the distribution of the products that actually reach the customer is truncated. Another example can be found in a multistage production process, in which inspection is performed at each production stage. If only conforming items are passed on to the next stage, the actual distribution is a truncated distribution. Accelerated life testing with censored samples is also a good example. In fact, the concept of a truncated distribution plays a significant role in analyzing a variety of production processes, process optimization and quality improvement. Truncated distributions can also be used to model intensity statistics in the study of atomic heterogeneity [9]. The justification is twofold: 1) atomic heterogeneity leads to the intensity statistics being modified from Gaussian to near-Gaussian forms [10,11]; and 2) in reality, the structure factors or normalized structure factors do not range from −∞ to ∞ but over a finite range.

Several examples of truncated distributions have been given in fitting rainfall data and in animal population studies, where observations usually begin after migration has commenced or conclude before it has stopped [12,13]. Other examples arise in life testing and reliability problems: if failure is caused by a wear-out mechanism or is a consequence of accumulated wear, then the length-of-life of a system can be expected to be bounded.

Long-tailed distributions arise in many areas of the sciences, in particular communication networks, economics, hydrology, materials science and physics. For example, many traffic measurement studies in modern communication networks such as the Internet have found long-tailed distributions. This means that the behavior of these data departs significantly from that of traditional telephone traffic and its related Markov models with short-range dependence. In particular, the common Poisson arrival process and the corresponding analysis based on the Erlang formula are no longer valid.

The main weakness of long-tailed distributions is that they do not have finite moments of all orders, which has restricted their use. To overcome this weakness, Nadarajah [14] introduces truncated versions of five of the most commonly known long-tailed distributions; the truncated versions possess finite moments of all orders and could therefore be better models.

The object of the present paper is to obtain a sampling distribution for a truncated law with a known (or unknown) truncation point (in general, a vector) and a minimum variance unbiased estimator of the reliability function for this model, using the results obtained for the non-truncated law. A sampling distribution for a truncated law may be derived by the method based on characteristic functions [15], the method based on generating functions [16], or the combinatorial method [17]. In this paper, a much simpler technique is proposed, which allows one to obtain the results for truncated laws more easily.

2. Unbiasedness Equivalence Principle

Suppose an experiment yields a data sample Xn = (X1, …, Xn) relevant to the value of a parameter θ (in general, a vector). Let LX(xn|θ) denote the probability or probability density of Xn when the parameter assumes the value θ. Considered as a function of θ for given Xn = xn, LX(xn|θ) is the likelihood function. If the data sample Xn can be summarized by a sufficient statistic S (in general, a vector), one can write LS(s|θ) ∝ LX(xn|θ). Further, for any non-negative function w(s), w(s)LS(s|θ) is also a likelihood function equivalent to LX(xn|θ). Suppose we recognize a function w(s) such that w(s)LS(s|θ), regarded as a function of s for a given θ, is a density function. It can be shown that this is the sampling density of S if the family of recognized densities is complete.

The unbiasedness equivalence principle [18] consists in the following. If

(1)

represents the likelihood function for the truncated law, where w(θ,ϑ) is some function of a parameter (θ,ϑ) associated with truncation and ϑ is a known truncation point (in general, a vector), then a sampling density for the truncated law is determined by

(2)

where

(3)

g(s|θ) is the sampling density of a sufficient statistic s(Xn) (for the family of densities {f(x|θ)}) determined on the basis of LX(xn|θ); the estimator appearing in (2) is an unbiased estimator of 1/[w(θ,ϑ)]^n with respect to g(s|θ), s ∈ S (the sample space of the non-truncated sufficient statistic S); and φ(S) is a function of S, for a given θ, that is equivalent to this unbiased estimator of 1/[w(θ,ϑ)]^n, i.e.,

(4)

or

(5)

gϑ (s|θ) is the sampling density of a sufficient statistic S (for a family of densities {fϑ (x|θ)}) when the truncation parameter ϑ is known, Sϑ is a sample space of a truncated sufficient statistic S.
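
The displays (1)-(5) are not reproduced in this copy. As a hedged sketch only, the relation described in the surrounding text can be written as below. The symbol \varpi for the unbiased estimator, the calligraphic sample-space symbols, and the assumption that \varpi vanishes outside the truncated sample space are introduced here for illustration and are not taken from the source.

```latex
\begin{align*}
% Assumed form: the truncated density is the non-truncated one rescaled by w
f_\vartheta(x \mid \theta) &= w(\theta,\vartheta)\, f(x \mid \theta), \\
% so the truncated likelihood is [w]^n times the non-truncated likelihood
L_X(x^n \mid \theta,\vartheta) &= [w(\theta,\vartheta)]^n \prod_{i=1}^{n} f(x_i \mid \theta), \\
% Unbiasedness of \varpi with respect to the non-truncated sampling density g
\int_{\mathcal{S}} \varpi(s,\vartheta)\, g(s \mid \theta)\, ds &= \frac{1}{[w(\theta,\vartheta)]^n}, \\
% Candidate sampling density for the truncated law; it integrates to one by the line above
g_\vartheta(s \mid \theta) &= [w(\theta,\vartheta)]^n\, \varpi(s,\vartheta)\, g(s \mid \theta),
\qquad s \in \mathcal{S}_\vartheta .
\end{align*}
```

Read this way, the unbiasedness condition is exactly what normalizes the candidate density, and completeness is what identifies it as the sampling density of S.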

3. Finding Sampling Distributions for Truncated Laws with Known Truncation Points

3.1. Example 3.1

Sampling distribution for the left-truncated Poisson law. Let the Poisson probability function be denoted by

(6)

The probability function of the restricted random variable, which is truncated away from some ϑ ≥ 0, is then

(7)

where

(8)

Consider a sample of n independent observations X1, X2, …, Xn, each with probability density function fϑ (x|θ), where the likelihood function is defined as

(9)

and let

(10)

It is well known that

(11)

is a complete sufficient statistic for the family {f(x|θ)}. A result of [19] states that sufficiency is preserved under truncation away from any Borel set in the range of X. Hence, in the case at hand S is sufficient for {fϑ (x|θ)}. It can be verified that S is also complete.

For the sake of simplicity but without loss of generality, consider the case ϑ=0. This is at the same time the most important case for applications and the easiest with which to deal. It follows from (2) that

(12)

where

(13)

(14)

(15)

denotes the Stirling number of the second kind [20] defined by

(16)

(17)

This is the same result as that of Tate and Goen [21], whose proof was based on characteristic functions.
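
Since the displays of this example are missing here, the following is a minimal numerical sketch of the ϑ=0 case, assuming the commonly stated form of the Tate and Goen result, P(S=s) = n! S(s,n) θ^s / [s! (e^θ − 1)^n] for s ≥ n, where S(s,n) is a Stirling number of the second kind. The function names are hypothetical; the check compares the closed form with a brute-force convolution of the zero-truncated Poisson probability function.

```python
from math import comb, exp, factorial


def stirling2(s, n):
    """Stirling number of the second kind via the explicit alternating sum."""
    total = sum((-1) ** (n - j) * comb(n, j) * j ** s for j in range(n + 1))
    return total // factorial(n)  # the sum is always an exact multiple of n!


def truncated_poisson_sum_pmf(s, n, theta):
    """P(S = s) for S = X1 + ... + Xn, Xi i.i.d. zero-truncated Poisson(theta).

    Assumed closed form (see the lead-in); zero below the minimum possible sum n.
    """
    if s < n:
        return 0.0
    return (factorial(n) * stirling2(s, n) * theta ** s
            / (factorial(s) * (exp(theta) - 1.0) ** n))


def brute_force_sum_pmf(n, theta, s_max):
    """The same pmf on 0..s_max by direct n-fold convolution."""
    single = [0.0] + [theta ** x / (factorial(x) * (exp(theta) - 1.0))
                      for x in range(1, s_max + 1)]
    dist = [1.0] + [0.0] * s_max  # distribution of an empty sum
    for _ in range(n):
        dist = [sum(dist[k] * single[s - k] for k in range(s + 1))
                for s in range(s_max + 1)]
    return dist
```

For example, `brute_force_sum_pmf(3, 1.3, 25)` and `truncated_poisson_sum_pmf(s, 3, 1.3)` can be compared term by term for s = 0, …, 25. Under the same assumption, the ratio of the truncated pmf to [w(θ)]^n times the Poisson(nθ) pmf is n! S(s,n)/n^s, which does not depend on θ.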

3.2. Example 3.2

Sampling distribution for the right-truncated exponential law. Let the probability density function of the right-truncated exponential distribution be denoted by

(18)

where

(19)

(20)

Consider a sample of n independent observations X1, X2, …, Xn, each with density fϑ (x|θ), where the likelihood function is determined as

(21)

It is well known that

(22)

is a complete sufficient statistic for the family {f(x|θ)}. It follows from (2) that

n ≥ 1, (23)

where a+ = max(0, a),

(24)

(25)

(26)

(27)

This is the same result as that of Bain and Weeks [15], whose proof was based on characteristic functions.
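
As a hedged illustration of this example, the sketch below evaluates the density of the sum of n right-truncated exponential observations in the alternating-sum form usually attributed to Bain and Weeks. It assumes the scale parametrization f(x|θ) = (1/θ)exp(−x/θ) and w(θ,ϑ) = 1/[1 − exp(−ϑ/θ)]; the source's own parametrization in (18)-(27) is not visible in this copy, and the names `pos_pow`, `truncated_exp_sum_pdf` and `cutoff` are hypothetical.

```python
from math import comb, exp, factorial


def pos_pow(a, k):
    """(a+)^k with a+ = max(0, a), taken as 0 whenever a <= 0 (also for k = 0)."""
    return a ** k if a > 0.0 else 0.0


def truncated_exp_sum_pdf(s, n, theta, cutoff):
    """Density of S = X1 + ... + Xn for Xi i.i.d. exponential (scale theta),
    right-truncated at `cutoff`; assumed form, see the lead-in."""
    if s <= 0.0 or s >= n * cutoff:
        return 0.0
    w = 1.0 / (1.0 - exp(-cutoff / theta))  # normalising factor of one observation
    alt = sum((-1) ** j * comb(n, j) * pos_pow(s - j * cutoff, n - 1)
              for j in range(n + 1))
    return w ** n * exp(-s / theta) * alt / (theta ** n * factorial(n - 1))
```

For n = 1 the function reduces to the single truncated density, and for n = 2 it reproduces the direct convolution, which gives two simple checks on the assumed form.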

3.3. Example 3.3

Sampling distribution for the doubly truncated exponential law. Consider an exponential distribution (20) that is doubly truncated at a lower truncation point (ϑ1) and an upper truncation point (ϑ2). The probability density function of the doubly truncated exponential distribution is defined as

(28)

where ϑ = (ϑ1,ϑ2),

(29)

Consider a sample of n independent observations X1, X2, …, Xn, each with density fϑ (x|θ), where the likelihood function is determined as

(30)

It is well known that

(31)

is a complete sufficient statistic for the family {f(x|θ)}. It follows from (2) that

n ≥ 1, (32)

where a+ = max(0, a), g(s|θ) is given by (24),

(33)

(34)

(35)

4. Validity of the Unbiasedness Equivalence Principle

The theoretical results of this investigation into the validity of the proposed unbiasedness equivalence principle (UEP) for finding sampling distributions for truncated laws are largely contained in the theorem given below. We introduce the following notation and assumptions. Let Xn be a random variable taking on values xn in a space Xϑ, let 𝒜 be a σ-field of subsets of Xϑ, and let (θ, ϑ) be a parameter associated with truncation, where ϑ is a known truncation point. For all values of the parameter θ in some parameter space Θ, let Pϑ be a probability measure on 𝒜; i.e., for any set A in 𝒜, Pϑ(A|θ) is the probability that Xn will belong to A when the parameter has the value θ. Let S = s(Xn) be a statistic on the measurable space (Xϑ, 𝒜) taking on values in a measurable space (Sϑ, ℬ). For each θ ∈ Θ, let Gϑ be the probability distribution of S when Xn has the distribution Pϑ, i.e., for any B ∈ ℬ, Gϑ(B|θ) = Pϑ(s⁻¹(B)|θ), where s⁻¹(B) is the set of points xn in Xϑ for which s(xn) ∈ B.

(i). Assume the family P = {Pϑ: θ ∈ Θ} of probability distributions of Xn is dominated by a totally σ-finite measure μ over (Xϑ, 𝒜), i.e., there exists, for all θ ∈ Θ, a non-negative 𝒜-measurable function pϑ(xn|θ) such that

(36)

for all A ∈ 𝒜. (The integrand pϑ(xn|θ) is called the density of Pϑ with respect to (w.r.t.) μ.)

(ii). Assume that s(Xn) is sufficient for P. From the Halmos-Savage factorization theorem [22], s(Xn) is sufficient if and only if for each θ ∈ Θ there exists a non-negative ℬ-measurable function LS(s(xn)|θ,ϑ) on Sϑ and a non-negative 𝒜-measurable function v on Xϑ such that

(37)

(The symbol (μ) following a statement means that the statement holds except on a set of μ-measure zero.) In (37), we will assume that LS and v are finite (μ).

(iii). Assume we recognize some likelihood function LS(s|θ,ϑ) equivalent to the likelihood function LX(xn|θ,ϑ). Define a σ-finite measure ρ over (Xϑ, 𝒜) by

(38)

Then, from (36), (37), and (38),

(39)

(iv). Assume we recognize a totally σ-finite measure η over (Sϑ, ℬ) such that the measure ρs⁻¹ over (Sϑ, ℬ) is absolutely continuous w.r.t. η; i.e., η(B) = 0 implies that ρs⁻¹(B) = 0, where ρs⁻¹(B) denotes the ρ-measure of the inverse image of B.

(v). Assume we recognize a positive ℬ-measurable function φ on Sϑ such that

(40)

for all θ ∈ Θ. Assume further that for any measurable set B of positive η-measure, there exists a θ ∈ Θ and a measurable subset B1 of B of positive η-measure over which LS(s|θ,ϑ)φ(s) is positive.

From (40), {LS(s|θ,ϑ)φ(s): θ ∈ Θ} is a family of densities w.r.t. η. For B ∈ ℬ, let

(41)

Thus, (v) provides us with a family of densities, but at this stage we do not know if this recognized family is the family of sampling densities of S.

(vi). Assume we recognize that the family {LS(s|θ,ϑ)φ(s): θ ∈ Θ} is complete, i.e.,

(42)

implies

(43)

except on a set D with Gϑ(D|θ) = 0 for all θ ∈ Θ.

Theorem 1 (Sampling distribution for truncated law). Under assumptions (i) through (vi), Gϑ has a density with respect to η and LS(s|θ,ϑ)φ(s) is a version of it, i.e.,

(44)

is the sampling density, gϑ (s|θ), of the sufficient statistic s(Xn).

Proof. We show first that (43) and the second part of (v) imply that f(s) ≡ 0 (η). For suppose there exists a measurable B with η(B) > 0 and f(s) ≠ 0 over B. Then B ⊂ D, so Gϑ(B|θ) = 0 for all θ ∈ Θ. But, from (v), there exists a B1 ⊂ B for which Gϑ(B1|θ) > 0 for some θ, contradicting Gϑ(B|θ) = 0 for all θ ∈ Θ. Now, by a theorem in [22], there exists a non-negative measurable function ψ on Sϑ such that

(45)

for every measurable function Θϑ, in the sense that if either integral exists, then so does the other and the two are equal.

In (45), let Θϑ(s,θ) = χB LS(s|θ,ϑ), where χB is the characteristic function of B (B ∈ ℬ). Then there exists a ψ(s) such that

(46)

for all B ∈ ℬ. Note that the left side of (46) is Gϑ(B|θ).

In (42), let f(s) = 1 − [ψ(s)/φ(s)]. From (40) and (46),

(47)

for all θ ∈ Θ. Thus, from (43), ψ(s) = φ(s) almost everywhere (η), and, from (47),

(48)

is a version of the density of Gϑ with respect to η.

5. Finding Reliability Estimators for Truncated Laws

Consider a system that is required to operate for a given ‘mission time’, t. The reliability of this system for the right-truncated distribution of time-to-failure with the probability density function fϑ (x|θ) may be defined as

(49)

By the Rao-Blackwell and Lehmann-Scheffé theorems [23], a minimum variance unbiased (MVU) estimator of R may be obtained as

(50)

where X may be any one of the observations (X1, …, Xn) from fϑ (x|θ), S is a complete sufficient statistic for {fϑ (x|θ)}, and fϑ(x|s) is the conditional distribution of X given S=s; fϑ (x|s) is obtained as

(51)

where

(52)

is the joint probability density of X and S,  is an unbiased estimator of

(53)

with respect to g(s|θ).

It should be noted that (50) can also be obtained by a different method as

(54)

where  is an unbiased estimator of

(55)

with respect to g(s|θ).

5.1. Example 5.1

MVU estimator of reliability for the right-truncated exponential distribution. Let Xn=(X1, …, Xn) be a random sample of size n from a population with density (18). Then it follows from (50) (or (54)) that the MVU estimator of R(t) is obtained as

(56)

As a particular case, if ϑ → ∞, that is, the variable X is assumed unrestricted, the corresponding MVU estimator of reliability reduces to

(57)

For instance, suppose that the following failure times, in hours, are available from a given system: 4.2, 9.8, 16, 20 and that the truncation point ϑ=25 hours and the mission time t=5 hours. Clearly s=50 hours. Substituting these values in (56), the estimate of reliability is obtained as  Had we assumed, however, that the observations are coming from the complete population, the estimate of reliability would have been, from (57),
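
The numerical values of this example are not legible in this copy. As a hedged sketch, the Rao-Blackwell construction P(X1 > t | S = s) can be evaluated directly for the right-truncated exponential law, assuming the alternating-sum form of the sampling density of S (the usual statement of the Bain and Weeks result) and n ≥ 2. The function names and the argument `cutoff` are hypothetical, and the closed form below is derived here rather than copied from (56) and (57).

```python
from math import comb


def pos_pow(a, k):
    """(a+)^k with a+ = max(0, a), taken as 0 whenever a <= 0."""
    return a ** k if a > 0.0 else 0.0


def mvu_reliability_truncated_exp(t, s, n, cutoff):
    """P(X1 > t | S = s) for n >= 2 right-truncated exponential lifetimes.

    Numerator: integral over (t, cutoff) of the conditional density of X1 given S = s;
    denominator: the alternating sum in the sampling density of S. The scale
    parameter cancels, so it does not appear.
    """
    num = sum((-1) ** j * comb(n - 1, j)
              * (pos_pow(s - t - j * cutoff, n - 1)
                 - pos_pow(s - (j + 1) * cutoff, n - 1))
              for j in range(n))
    den = sum((-1) ** j * comb(n, j) * pos_pow(s - j * cutoff, n - 1)
              for j in range(n + 1))
    return num / den


def mvu_reliability_exp(t, s, n):
    """Untruncated limit: (1 - t/s)^(n-1) for t < s, and 0 otherwise."""
    return pos_pow(1.0 - t / s, n - 1)


failure_times = [4.2, 9.8, 16.0, 20.0]  # hours, from the example
n, s = len(failure_times), sum(failure_times)
r_truncated = mvu_reliability_truncated_exp(5.0, s, n, 25.0)
r_complete = mvu_reliability_exp(5.0, s, n)
```

Under these assumptions the sketch gives 0.824 for the truncated model and 0.729 for the complete population; these are outputs of the sketch, not values quoted from the source.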

5.2. Example 5.2

MVU estimator of reliability for the right-truncated gamma distribution. Let Xn=(X1, …, Xn) be a random sample of size n from a population with density

0 < x ≤ ϑ, σ > 0, δ > 0, (58)

where ϑ is the point of truncation, θ = (σ, δ), and w(θ,ϑ) is such that

(59)

This distribution has found applications in a number of diverse fields, for instance, in fitting of length-of-life data under fatigue. Note that for δ = 1, the right-truncated gamma distribution reduces to the right-truncated exponential distribution with parameter σ. Although the exponential distribution is a special case of the gamma distribution and gives a good fit to length-of-life data in many situations, it is not suitable in general, since its use implies that at any time the future life-length is independent of past history.

To find the MVU estimator of R(t), we apply the above technique. If the shape parameter δ in (58) is assumed to be known, then it is well known that

(60)

is a complete sufficient statistic for σ. The probability density function of the sampling distribution of S is given by

s ∈ (0, nϑ), (61)

where

(62)

(63)

The joint distribution of X and S is given by

(64)

Thus the conditional distribution of X given S is

(65)

Hence the MVU estimator of R(t) at time t is given by

(66)

It may be remarked that the result (66), although it appears quite unwieldy at first sight, is not so in practical applications, particularly when the sample size is small.

As a particular case, if ϑ → ∞, that is, the random variable X is assumed unrestricted, the distribution of the sufficient statistic from equation (61) reduces to

s ∈ (0, ∞), (67)

and the corresponding MVU estimator of reliability at time t is given by

(68)

which corresponds to Basu’s [24] equation (9).
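
For the unrestricted case, a minimal sketch of the estimator can be based on the standard fact that, for n independent gamma observations with common known shape δ, the ratio X1/S follows a Beta(δ, (n−1)δ) distribution independently of S, so that P(X1 > t | S = s) is an upper beta tail at t/s. The function name and the midpoint-rule integration are choices made here for illustration; the exact form of display (68) is not visible in this copy, and the sketch is intended for n ≥ 2 and works best when δ ≥ 1.

```python
from math import gamma


def mvu_reliability_gamma(t, s, n, shape, steps=20000):
    """P(X1 > t | S = s) for n >= 2 i.i.d. gamma lifetimes with known shape, no truncation.

    Integrates the Beta(shape, (n-1)*shape) density over (t/s, 1) with the midpoint rule.
    """
    if t >= s:
        return 0.0
    a, b = shape, (n - 1) * shape
    const = gamma(a + b) / (gamma(a) * gamma(b))  # 1 / B(a, b)
    lo = max(t, 0.0) / s
    h = (1.0 - lo) / steps
    total = 0.0
    for i in range(steps):
        u = lo + (i + 0.5) * h
        total += u ** (a - 1.0) * (1.0 - u) ** (b - 1.0)
    return const * total * h
```

With shape 1 the beta tail reduces to (1 − t/s)^(n−1), the untruncated exponential estimator, which gives a quick consistency check against Example 5.1.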

6. Finding Sampling Distributions for Truncated Laws with Unknown Truncation Points

Note that the proposed approach can also be used to find the sampling distribution for a truncated law when some or all of its truncation parameters are left unspecified.

6.1. Example 3.3 (Continued)

For instance, consider a situation of Example 3.3 where it is assumed that the truncation parameter ϑ=(ϑ1,ϑ2) is unknown. It is known that the statistic (X(1), X(n), S), where

(69)

(70)

and

(71)

is a complete sufficient statistic for a set of parameters (ϑ1,ϑ2,θ). In this case, the likelihood function of a sample is determined as

(72)

where ϑ = (ϑ1,ϑ2),

(73)

is the joint probability density function of the order statistics X(1) and X(n), and Fϑ(·) is the probability distribution function. It is well known that

(74)

is a complete sufficient statistic for the family {f(x|θ)}. It follows from (2) and (72) that

s ∈ [(n-2)x(1), (n-2)x(n)], n ≥ 3, (75)

where

(76)

(77)

(78)

(79)

Thus, the sampling distribution of the sufficient statistic (X(1), X(n), S) for (ϑ1,ϑ2,θ) is given by

(80)

In other words, we have the following results.

6.2. Truncation Cases

In the case of one-sided truncation, when a truncation point on the left, ϑ1, is unknown, a sampling distribution of the sufficient statistic (X(1), S) for (ϑ1,θ) is given by

(81)

where

xi ≥ ϑ1, i = 1, …, n, (82)

(83)

is the probability density function of the order statistic X(1),

(84)

s ≡ s(X2, …, Xn).

In the case of one-sided truncation, when a truncation point on the right, ϑ2, is unknown, a sampling distribution of the sufficient statistic (X(n), S) for (ϑ2,θ) is given by

(85)

where

xi ≤ ϑ2, i = 1, …, n, (86)

(87)

is the probability density function of the order statistic X(n),

(88)

s ≡ s(X1, …, Xn-1).

In the case of two-sided truncation, when a lower truncation point, ϑ1, and an upper truncation point, ϑ2, are unknown, a sampling distribution of the sufficient statistic (X(1), X(n), S) for (ϑ1,ϑ2,θ ) is given by

(89)

where

ϑ1 ≤ xi ≤ ϑ2, i = 1, …, n, (90)

(91)

is the joint probability density function of the order statistics X(1) and X(n),

(92)

s ≡ s(X2, …, Xn-1).

6.3. Example 6.3

If, say, we deal with a left-truncated exponential distribution,

(93)

where

(94)

and a truncation point on the left, ϑ1, is unknown, then it follows immediately from (81) that the sampling distribution of the sufficient statistic (X(1), S=X2 + … +Xn) for (ϑ1,θ) is given by

(95)

which corresponds to the well-known result [23].

7. Conclusion

The authors hope that this work will stimulate further investigations that apply the proposed approach to specific problems, in order to see whether the results obtained with it are feasible for realistic applications.

References

1. B. R. Cho and M. S. Govindaluri, "Optimal screening limits in multi-stage assemblies," International Journal of Production Research, vol. 40, pp. 1993–2009, 2002.
2. A. Jeang, "An approach of tolerance design for quality improvement and cost reduction," International Journal of Production Research, vol. 35, pp. 1193–1211, 1997.
3. K. C. Kapur and B. R. Cho, "Economic design and development of specification," Quality Engineering, vol. 6, pp. 401–417, 1994.
4. K. C. Kapur and B. R. Cho, "Economic Design of the Specification Region for Multiple Quality Characteristics," IIE Transactions, vol. 28, pp. 237–248, 1996.
5. M. D. Phillips and B. R. Cho, "Quality improvement for processes with circular and spherical specification region," Quality Engineering, vol. 11, pp. 235–243, 1998.
6. M. D. Phillips and B. R. Cho, "Modeling of Optimum Specification Regions," Applied Mathematical Modelling, vol. 24, pp. 327–341, 2000.
7. M. T. Khasawneh, S. R. Bowling, S. Kaewkuekool, and B. R. Cho, "Tables of a truncated standard normal distribution: a singly truncated case," Quality Engineering, vol. 17, pp. 33–50, 2004.
8. M. T. Khasawneh, S. R. Bowling, S. Kaewkuekool, and B. R. Cho, "Tables of a truncated standard normal distribution: a doubly truncated case," Quality Engineering, vol. 18, pp. 227–241, 2005.
9. K. Bhowmick, A. Mukhopadhyay, and G. B. Mitra, "Edgeworth series expansion of the truncated Cauchy function and its effectiveness in the study of atomic heterogeneity," Zeitschrift für Kristallographie, vol. 215, pp. 718–726, 2000.
10. U. Shmueli, "Symmetry and composition dependent cumulative distribution of the normalized structure amplitude for use in intensity statistics," Acta Crystallographica, vol. A35, pp. 282–286, 1979.
11. U. Shmueli and A. J. Wilson, "Effects of space group symmetry and atomic heterogeneity on intensity statistics," Acta Crystallographica, vol. A37, pp. 342–353, 1981.
12. D. G. Chapman, "Estimating the parameters of a truncated gamma distribution," Ann. Math. Statist. vol. 27, pp. 498-506, 1956.
13. K. W. Kenyon, V. B. Scheffer, and D. G. Chapman, "A Population Study of the Alaska Fur Seal Herd," U.S. Wildlife, vol. 12, pp. 1-77, 1954.
14. S. Nadarajah, "Some Truncated Distributions," Acta Appl. Math., vol. 106, pp. 105-123, 2009.
15. L. J. Bain and D. L. Weeks, "A note on the truncated exponential distribution," Ann. Math. Statist., vol. 35, pp. 1366-1367, 1964.
16. C. A. Charalambides, "Minimum variance unbiased estimation for a class of left-truncated discrete distributions," Sankhyā, vol. 36, pp. 397-418, 1974.
17. T. Cacoullos, "A combinatorial derivation of the distribution of the truncated poisson sufficient statistic," Ann. Math. Statist., vol. 32, pp. 904-905, 1961.
18. N. A. Nechval, K. N. Nechval, G. Berzins, and M. Purgailis, "Unbiasedness equivalence principle and its applications to finding sampling distributions for truncated laws," in Proceedings of the Second International Conference on Mathematics: Trends and Developments, Vol. I. Cairo: The Egyptian Mathematical Society, 2007, pp. 165-180.
19. J. W. Tukey, "Sufficiency, truncation and selection," Ann. Math. Statist., vol. 20, pp. 309-311, 1949.
20. C. Jordan, Calculus of Finite Differences. New York: Chelsea, 1950.
21. R. F. Tate and R. L. Goen, "Minimum Variance Unbiased Estimation for the Truncated Poisson Distribution," Ann. Math. Statist., vol. 29, pp. 755-765, 1958.
22. P. R. Halmos, Measure Theory. New York: Van Nostrand, Inc., 1950.
23. S. Zacks, The Theory of Statistical Inference. New York: John Wiley & Sons, Inc., 1971.
24. A. P. Basu, "Estimation of reliability for some distributions useful in life testing," Technometrics, vol. 6, pp. 215-219, 1964.
