sigma

Probabilistic Characteristics of M, c, I, f_y, and σ

The probabilistic characteristics of σ were determined through statistical analyses of the four random variables (M, c, I, and f_y) and of σ itself. After the initial set of random numbers was generated, histograms were created for M, c, I, and f_y. This section presents the results for each of the four variables and then for σ.

Histograms and Cumulative Distribution Functions for M, c, I, and f_y

To illustrate the distribution type of each random variable, plots of the histograms and the cumulative distribution functions were generated. We initially attempted to set the number of bins with the formula from Chapter 2 of Probability, Statistics, and Reliability for Engineers. The formula gave 11 bins, but the resulting plots were not very clear and did not show the smooth shape produced by the Histogram tool in Excel, which selected 30 bins. The Excel tool bases its bin selection on the amount of dispersion in the data and creates the bin ranges itself.
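The bin-count step can be sketched in Python. This is a minimal sketch that assumes the textbook formula is the common Sturges-type rule k = 1 + 3.3 log10(n) and that the sample size was 1000; neither is stated in this section, although together they reproduce the 11 bins reported above.

```python
import math

def sturges_bins(n: int) -> int:
    """Bin count from k = 1 + 3.3*log10(n), rounded to the nearest integer.

    The rule itself is an assumption about which formula the textbook gives.
    """
    if n < 1:
        raise ValueError("n must be a positive sample size")
    return round(1 + 3.3 * math.log10(n))

# Assumed sample size; the report does not state it in this section.
n_samples = 1000
print(sturges_bins(n_samples))
```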

The cumulative distribution functions were calculated using the following Excel equations. For the lognormally distributed variables M and f_y:

LOGNORMDIST(x, μ_Y, σ_Y), which calculates the cumulative distribution function,

where μ_Y = ln(μ_X) – ½σ_Y² and σ_Y² = ln[1 + (σ_X/μ_X)²] (Ayyub 1997).
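As an illustration, the parameter conversion and the lognormal CDF can be reproduced outside Excel. The sketch below uses the standard relations σ_Y² = ln[1 + (σ_X/μ_X)²] and μ_Y = ln(μ_X) – ½σ_Y². The function names and the moments 100 and 20 are hypothetical; they are not values from the problem statement.

```python
import math
from typing import Tuple

def lognormal_params(mean_x: float, std_x: float) -> Tuple[float, float]:
    """Convert the mean and standard deviation of X to those of Y = ln(X)."""
    var_y = math.log(1.0 + (std_x / mean_x) ** 2)
    mu_y = math.log(mean_x) - 0.5 * var_y
    return mu_y, math.sqrt(var_y)

def lognormal_cdf(x: float, mu_y: float, sigma_y: float) -> float:
    """P(X <= x) for lognormal X; same arguments as LOGNORMDIST(x, mu_y, sigma_y)."""
    if x <= 0.0:
        return 0.0
    z = (math.log(x) - mu_y) / sigma_y
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical moments, for illustration only.
mu_y, sigma_y = lognormal_params(mean_x=100.0, std_x=20.0)
print(mu_y, sigma_y, lognormal_cdf(100.0, mu_y, sigma_y))
```

Note that the CDF evaluated at the mean of X is slightly above 0.5, because the median of a lognormal variable, exp(μ_Y), lies below its mean.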

For the normally distributed variables c and I:

NORMDIST(x, μ, σ, TRUE), which calculates the cumulative distribution function (Ayyub 1997).
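The comparison between generated numbers and the theoretical normal CDF can be sketched as follows. The mean, standard deviation, sample size, and seed are hypothetical placeholders, and Python's NormalDist stands in for Excel's NORMDIST with the cumulative flag set.

```python
import random
from statistics import NormalDist

def empirical_cdf(samples, x):
    """Fraction of the samples that are less than or equal to x."""
    return sum(1 for s in samples if s <= x) / len(samples)

# Hypothetical parameters and sample size, for illustration only.
mu, sigma, n = 10.0, 0.5, 1000
rng = random.Random(42)  # fixed seed so the run is repeatable
samples = [rng.gauss(mu, sigma) for _ in range(n)]
dist = NormalDist(mu, sigma)

# Print theoretical and empirical cumulative probabilities side by side.
for x in (9.5, 10.0, 10.5):
    print(x, round(dist.cdf(x), 4), round(empirical_cdf(samples, x), 4))
```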

Plots of both the cumulative distribution functions and the histograms were generated for c, M, I, and f_y. In all cases, the randomly generated numbers fit very closely with the probabilities calculated for each distribution.

Statistical Analyses for M, c, I, and f_y

Full statistical analyses were performed for the randomly generated variables M, c, I, and f_y and are displayed in Table 6. The statistics shown in Table 6 were generated using Microsoft Excel’s Descriptive Statistics tool. The mean of each sample was calculated as X̄ = (1/n) Σ x_i (Ayyub 1997). In all four cases, the sample mean is very close to the population mean given in the problem statement. The standard error for each was within 10% of the actual values. The standard deviation of each sample was calculated as S = √[Σ(x_i – X̄)² / (n – 1)] (Ayyub 1997). The standard deviation for all four variables also fell within an acceptable range. Calculations for M yielded the highest standard deviation and standard error of the four variables, but they were still within reason. Sample variance was calculated as S². The skewness of each variable was also calculated, using the formula given by Ayyub (1997). Other information recorded by Microsoft Excel’s Descriptive Statistics tool included kurtosis, maximum, minimum, sample count, and confidence level. The statistical analyses of M, c, I, and f_y revealed no significant discrepancies in the random numbers generated for each variable. All values fall within a reasonable range of error and are therefore suitable for calculating σ.
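A few of the descriptive statistics can be cross-checked with a short script. This sketch assumes the n – 1 form of the sample standard deviation and uses the skewness formula documented for Excel's SKEW function, which may differ from the textbook formula cited above. The data are made up for illustration.

```python
import math
from statistics import mean, stdev

def describe(samples):
    """Return mean, standard deviation, standard error, variance, and skewness."""
    n = len(samples)
    if n < 3:
        raise ValueError("at least three samples are needed for skewness")
    x_bar = mean(samples)
    s = stdev(samples)  # sample standard deviation, n - 1 in the denominator
    # Skewness as documented for Excel's SKEW (an assumption about the tool used).
    skew = n / ((n - 1) * (n - 2)) * sum(((x - x_bar) / s) ** 3 for x in samples)
    return {
        "mean": x_bar,
        "std_dev": s,
        "std_error": s / math.sqrt(n),
        "variance": s ** 2,
        "skewness": skew,
    }

# Made-up data with a long right tail, so the skewness comes out positive.
print(describe([1.0, 2.0, 3.0, 4.0, 10.0]))
```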

Statistics of Random Variables (Table 6)

Discussion of Histogram, CDF, and Statistical Analysis for s

Using the random variables generated for M, c, and I, the corresponding σ values were calculated using the formula given in the problem statement: σ = Mc/I. The resulting σ values were then plotted as a histogram, and the cumulative distribution function for each bin was calculated assuming that σ follows a lognormal distribution. The probabilities calculated for σ from the lognormal CDF discussed earlier were very close to the histogram bins when plotted together in Figure 5. The statistical analysis (see Table 6), performed in the same manner as described previously, also showed that the calculated σ values fell within a reasonable range of error and, therefore, further analysis could be completed.
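The simulation step can be sketched as a small Monte Carlo loop. It assumes the stress is the bending stress Mc/I, with M lognormal and c and I normal as stated earlier. All means, standard deviations, the sample size, and the seed are hypothetical placeholders, since the problem statement values are not reproduced in this section.

```python
import math
import random

def lognormal_params(mean_x, std_x):
    """Mean and standard deviation of ln(X) from the moments of X."""
    var_y = math.log(1.0 + (std_x / mean_x) ** 2)
    return math.log(mean_x) - 0.5 * var_y, math.sqrt(var_y)

def simulate_stress(n, rng, m_mean, m_std, c_mean, c_std, i_mean, i_std):
    """Generate n values of M*c/I with M lognormal and c, I normal."""
    mu_y, sigma_y = lognormal_params(m_mean, m_std)
    stresses = []
    for _ in range(n):
        m = rng.lognormvariate(mu_y, sigma_y)
        c = rng.gauss(c_mean, c_std)
        i = rng.gauss(i_mean, i_std)
        stresses.append(m * c / i)
    return stresses

# Hypothetical moments, for illustration only.
rng = random.Random(7)  # fixed seed so the run is repeatable
stresses = simulate_stress(1000, rng, 1000.0, 200.0, 10.0, 0.5, 500.0, 25.0)
print(min(stresses), sum(stresses) / len(stresses), max(stresses))
```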

Histogram of sigma (Figure 5)

Chi-Square Test for Goodness of Fit for σ

Visual inspection of the histogram of σ suggests that σ follows a lognormal distribution. To determine whether this assumption was correct, a chi-square test for goodness of fit was performed on σ. The first step of the chi-square test was to formulate the hypotheses:

H_0: σ_stress ~ LN(μ, σ)

H_A: σ_stress ≠ LN(μ, σ)

Next, the appropriate model must be selected; here it is the chi-square goodness-of-fit test. Then the test statistic, which is a random variable, must be identified: χ² = Σ (O_i – E_i)² / E_i, where O_i and E_i are the observed and expected frequencies in bin i. A level of significance must then be chosen; 5% was used for this test. Next, an estimate of the test statistic must be calculated. The estimate is computed with the procedure shown in Table 7-1 and the histogram in Figure 5. To determine the goodness of fit, a region of rejection must be defined. For 18 degrees of freedom and a 5% level of significance, the critical value of the test statistic obtained from Table A-3 in Probability, Statistics, and Reliability for Engineers is 28.877. The region of rejection consists of all values of the test statistic greater than 28.877. Once the test statistic has been evaluated, the appropriate hypothesis must be selected. Because the computed value of the test statistic is less than the critical value, the null hypothesis is not rejected, and the lognormal distribution is an appropriate model for σ.
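The decision rule can be sketched in Python. The observed and expected counts below are invented purely to exercise the statistic and are not the values in Table 7-1. The tail-probability helper uses a closed form that holds only for an even number of degrees of freedom, which happens to cover the 18 used here. Evaluated at the tabulated critical value, it should come out near 0.05.

```python
import math

def chi_square_statistic(observed, expected):
    """Sum of (O_i - E_i)^2 / E_i over all bins."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi_square_sf_even_df(x, df):
    """Upper-tail probability P(chi2 > x); closed form valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k  # builds (x/2)^k / k! incrementally
        total += term
    return math.exp(-half) * total

CRITICAL_VALUE = 28.877  # tabulated value quoted in the text (18 df, 5%)
DEGREES_OF_FREEDOM = 18

# Invented counts, for illustration only; not the data in Table 7-1.
observed = [12, 25, 38, 25]
expected = [10.0, 30.0, 35.0, 25.0]
statistic = chi_square_statistic(observed, expected)
print(statistic, statistic > CRITICAL_VALUE)
print(chi_square_sf_even_df(CRITICAL_VALUE, DEGREES_OF_FREEDOM))
```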

Chi-square test (Table 7-1)
