Stratified sampling technique

Stratified sampling is a well-established technique in statistical sampling (Cochran, 1966). It reduces variance by taking more samples in important subregions. Consider a problem in which the expectation of a function $g(X)$ is sought, where $X$ is a random variable with PDF $f_X(x)$, $x \in E$. Referring to Fig. 6.13, the domain $E$ of the random variable $X$ is divided into $M$ disjoint subregions $E_m$, $m = 1, 2, \ldots, M$. That is,

$$E = \bigcup_{m=1}^{M} E_m, \qquad \emptyset = E_m \cap E_{m'}, \quad m \neq m'$$

Let $p_m$ be the probability that the random variable $X$ falls within subregion $E_m$, that is, $\int_{E_m} f_X(x)\,dx = p_m$. It follows that $\sum_{m} p_m = 1$.

The expectation of $g(X)$ can be computed as

$$G = \int_{E} g(x) f_X(x)\,dx = \sum_{m=1}^{M} \int_{E_m} g(x) f_X(x)\,dx = \sum_{m=1}^{M} G_m \qquad (6.86)$$

where $G_m = \int_{E_m} g(x) f_X(x)\,dx$.

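Note that the integral for $G_m$ can be written as

$$G_m = p_m \int_{E_m} g(x) \frac{f_X(x)}{p_m}\,dx = p_m\,E[g(X) \mid X \in E_m] \qquad (6.87)$$

and it can be estimated by the Monte Carlo method as

$$\hat{G}_m = \frac{p_m}{n_m} \sum_{i=1}^{n_m} g(X_{mi}) \qquad (6.88)$$

where $X_{mi}$ is the $i$th sample point drawn from the $m$th subregion, $n_m$ is the number of sample points in the $m$th subregion, and $\sum_m n_m = n$, the total number of random variates to be generated. Therefore, the estimator for $G$ in Eq. (6.86) can be obtained as

$$\hat{G} = \sum_{m=1}^{M} \hat{G}_m = \sum_{m=1}^{M} \frac{p_m}{n_m} \sum_{i=1}^{n_m} g(X_{mi}) \qquad (6.89)$$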
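The stratified estimator referred to as Eq. (6.89) weights each subregion's sample mean of g by the probability p_m of that subregion. A minimal Python sketch of this idea follows. The function name `stratified_estimate`, the per-stratum sampler callables, and the test case (X uniform on (0, 1), g(x) = x², two equal strata) are illustrative assumptions and do not come from the text.

```python
import random
from typing import Callable, Sequence


def stratified_estimate(
    g: Callable[[float], float],
    samplers: Sequence[Callable[[random.Random], float]],
    probs: Sequence[float],
    counts: Sequence[int],
    seed: int = 0,
) -> float:
    """Estimate G = E[g(X)] as sum_m (p_m / n_m) * sum_i g(X_mi)."""
    rng = random.Random(seed)
    total = 0.0
    for sample_m, p_m, n_m in zip(samplers, probs, counts):
        # Each sampler draws X conditional on X lying in subregion E_m.
        stratum_sum = sum(g(sample_m(rng)) for _ in range(n_m))
        total += p_m * stratum_sum / n_m
    return total


# Hypothetical illustration: X ~ U(0, 1), g(x) = x**2, two strata with p_m = 0.5.
samplers = [lambda r: r.uniform(0.0, 0.5), lambda r: r.uniform(0.5, 1.0)]
estimate = stratified_estimate(lambda x: x * x, samplers, [0.5, 0.5], [500, 500])
print(estimate)  # should land near the exact value 1/3
```

Passing the samplers as callables keeps the sketch independent of how each subregion is sampled; in practice the conditional sampling is often done by inverting the CDF over the matching probability interval.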

After the number of subregions $M$ and the total number of samples $n$ are determined, an important question in stratified sampling is how to allocate the $n$ sample points among the $M$ subregions so that the variance of $\hat{G}$ by Eq. (6.89) is minimized. A theorem shows that the optimal allocation $n_m^*$ that minimizes $\mathrm{Var}(\hat{G})$ in Eq. (6.89) is (Rubinstein, 1981)

$$n_m^* = n \left( \frac{p_m \sigma_m}{\sum_{m'=1}^{M} p_{m'} \sigma_{m'}} \right) \qquad (6.90)$$

where $\sigma_m$ is the standard deviation associated with the estimator $\hat{G}_m$ in Eq. (6.88).

In general, information about $\sigma_m$ is not available in advance. A pilot simulation study is therefore suggested to obtain a rough estimate of $\sigma_m$, which then serves as the basis for the follow-up simulation that pursues the variance-reduction objective.
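The two-stage idea (a small pilot run to estimate the stratum standard deviations, followed by the allocation of Eq. (6.90)) might be sketched as below. The helper names `pilot_sigmas` and `optimal_allocation`, the pilot size of 50, and the uniform test problem are assumptions made for illustration only. Rounding to whole samples is also a choice of this sketch, so the allocated counts may differ from n by a sample or so.

```python
import random
import statistics


def optimal_allocation(n, probs, sigmas):
    """Eq. (6.90): n_m* = n * p_m * sigma_m / sum over m' of (p_m' * sigma_m')."""
    weights = [p * s for p, s in zip(probs, sigmas)]
    total = sum(weights)
    # Round to whole samples and keep at least one point per subregion.
    return [max(1, round(n * w / total)) for w in weights]


def pilot_sigmas(g, bounds, n_pilot, rng):
    """Rough per-stratum standard deviations of g(X) from a small pilot run."""
    sigmas = []
    for lo, hi in bounds:
        values = [g(rng.uniform(lo, hi)) for _ in range(n_pilot)]
        sigmas.append(statistics.stdev(values))
    return sigmas


rng = random.Random(1)
bounds = [(0.0, 0.5), (0.5, 1.0)]  # hypothetical strata for X ~ U(0, 1)
probs = [0.5, 0.5]
sigmas = pilot_sigmas(lambda x: x * x, bounds, n_pilot=50, rng=rng)
counts = optimal_allocation(1000, probs, sigmas)
print(sigmas, counts)
```

In this hypothetical case g(x) = x² varies more over the upper stratum, so the allocation assigns it the larger share of the samples.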

A simple plan for sample allocation is $n_m = n p_m$ after the subregions are specified. It can be shown that with this sampling plan, the variance of $\hat{G}$ by Eq. (6.89) is no greater than that from the simple random-sampling technique. One efficient stratified sampling technique is systematic sampling (McGrath, 1970), in which $p_m = 1/M$ and $n_m = n/M$. The systematic sampling algorithm can be described as follows:

1. Divide interval [0, 1] into M equal subintervals.

2. Within each subinterval, generate $n/M$ uniform random numbers $u_{mi} \sim U[(m-1)/M,\ m/M]$, $m = 1, 2, \ldots, M$; $i = 1, 2, \ldots, n/M$.

3. Compute $X_{mi} = F_X^{-1}(u_{mi})$.

4. Calculate $\hat{G}$ according to Eq. (6.89).
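The four steps above can be sketched in Python as follows. The function name `systematic_sample_mean` and the test case (a unit exponential variable, whose inverse CDF is −ln(1 − u)) are assumptions chosen for illustration; the sketch also assumes that n is a multiple of M.

```python
import math
import random


def systematic_sample_mean(g, inv_cdf, n, M, seed=0):
    """Systematic sampling with p_m = 1/M and n_m = n/M (n divisible by M)."""
    if n % M != 0:
        raise ValueError("n must be a multiple of M")
    rng = random.Random(seed)
    n_m = n // M
    total = 0.0
    for m in range(1, M + 1):
        lo, hi = (m - 1) / M, m / M      # step 1: m-th subinterval of [0, 1]
        for _ in range(n_m):
            u = rng.uniform(lo, hi)      # step 2: u_mi ~ U[(m-1)/M, m/M]
            x = inv_cdf(u)               # step 3: X_mi = F^{-1}(u_mi)
            total += g(x)
    # Step 4: since p_m / n_m = 1/n for every m, Eq. (6.89) is the plain average.
    return total / n


# Hypothetical check: X ~ Exp(1), so F^{-1}(u) = -ln(1 - u) and E[X] = 1.
est = systematic_sample_mean(lambda x: x, lambda u: -math.log(1.0 - u), n=1000, M=10)
print(est)
```

Because every subinterval carries the same probability and the same number of points, the weights in the estimator cancel and only a running sum is needed.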

Example 6.13 Referring to Example 6.7, apply the systematic sampling technique to evaluate the pump failure probability in the time interval [0, 200 h].

Solution Again, let us adopt the uniform distribution $U(0, 200)$ and carry out the computation by the sample-mean Monte Carlo method. In systematic sampling, the interval $[0, 200]$ is divided into 10 equal-probability subintervals, each having a probability content of 0.1. Since $h(t) = 1/200$, $0 \le t \le 200$, the end points of the subintervals are easily obtained as

$$t_0 = 0,\ t_1 = 20,\ t_2 = 40,\ \ldots,\ t_9 = 180,\ t_{10} = 200$$

Furthermore, let us generate $n_m = 200$ random variates from each subinterval so that $\sum_m n_m = 2000$. This can be achieved by letting

$$U_{mi} \sim U\left[\frac{20(m-1)}{200},\ \frac{20m}{200}\right] \qquad \text{for } i = 1, 2, \ldots, 200;\ m = 1, 2, \ldots, 10$$

The algorithm for estimating the pump failure probability is the following:

1. Initialize subinterval index m = 0.

2. Let $m = m + 1$. Generate $n_m = 200$ standard uniform random variates $\{u_{m,1}, u_{m,2}, \ldots, u_{m,200}\}$, and transform them into random variates from the corresponding subinterval by $t_{mi} = 20(m-1) + 20u_{mi}$, for $i = 1, 2, \ldots, 200$.

3. Compute $p_{f,m}$ as

$$p_{f,m} = \frac{1}{200} \sum_{i=1}^{200} \left[ 200 f_t(t_{mi}) \right]$$

and the variance associated with its contribution $p_m p_{f,m}$ to the overall estimate as

$$\frac{p_m^2 s_m^2}{n_m} = \frac{0.1^2 s_m^2}{200}$$

in which $s_m$ is the standard deviation of $200 f_t(t_{mi})$ for the $m$th subinterval.

4. If $m < 10$, go to step 2; otherwise, compute the pump failure probability as

$$p_f = \frac{1}{10} \sum_{m=1}^{10} p_{f,m}$$

and the associated standard error as

$$\left[ \sum_{m=1}^{10} \frac{p_m^2 s_m^2}{n_m} \right]^{1/2}$$

The results from the numerical simulation are shown below:

m	p_f,m	s_m	m	p_f,m	s_m
1	0.15873	0.00071102	6	0.14659	0.00066053
2	0.15626	0.00069358	7	0.14423	0.00064361
3	0.15374	0.00069298	8	0.14194	0.00064993
4	0.15121	0.00072408	9	0.13968	0.00066746
5	0.14887	0.00065434	10	0.13742	0.00067482
All	p_f = 0.14787	standard error = 0.15154 × 10^−4

The estimated value $p_f = 0.14787$ is extremely close to the exact solution of 0.147856.
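The computation in this example might be reproduced with the short Python sketch below. The failure density is not given in this section; the sketch assumes an exponential density with rate 0.0008 per hour, which is consistent with the quoted exact solution since 1 − exp(−0.16) ≈ 0.147856. The names `f_t`, `LAMBDA`, and `stratified_pump_failure` are illustrative, and because the random numbers differ, the per-subinterval values will not match the table digit for digit.

```python
import math
import random
import statistics

LAMBDA = 0.0008   # assumed failure rate per hour (not stated in this section)
T_MAX = 200.0     # upper end of the time interval, in hours
M = 10            # number of equal-probability subintervals
N_M = 200         # random variates per subinterval


def f_t(t):
    """Assumed exponential failure density."""
    return LAMBDA * math.exp(-LAMBDA * t)


def stratified_pump_failure(seed=0):
    rng = random.Random(seed)
    width = T_MAX / M   # 20 h per subinterval
    p_m = 1.0 / M
    p_f = 0.0
    variance = 0.0
    for m in range(1, M + 1):
        # t_mi = 20(m - 1) + 20 u_mi, scored with 200 f_t(t_mi).
        scores = [T_MAX * f_t(width * (m - 1) + width * rng.random())
                  for _ in range(N_M)]
        p_fm = statistics.fmean(scores)    # step 3: estimate for subinterval m
        s_m = statistics.stdev(scores)
        p_f += p_m * p_fm                  # step 4: (1/10) * sum of p_fm
        variance += p_m ** 2 * s_m ** 2 / N_M
    return p_f, math.sqrt(variance)


p_f, std_err = stratified_pump_failure()
exact = 1.0 - math.exp(-LAMBDA * T_MAX)
print(f"p_f = {p_f:.6f}, standard error = {std_err:.3e}, exact = {exact:.6f}")
```

Under these assumptions the standard error comes out on the order of 10^−5, because the integrand 200 f_t(t) changes very little within any single 20 h subinterval.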
