Validation of the Shelf Life Macro
A SigmaPlot macro was written to compute a compound’s shelf life from measured time-activity data. The macro uses a SigmaPlot transform. Its design is based on the presentation in the document “Guideline for Submitting Documentation for the Stability of Human Drugs and Biologics, Food and Drug Administration”, DHHS, February 1987.
This document validates the macro by comparing its results with those obtained from SPSS. The following variables were varied to exercise the macro over as wide a range of use as possible:
- Number of data points (3, 4, 5, 8, 15, 25)
- Missing values
- Different result types
  - Typical (negative slope)
  - Positive or zero slope
  - No solution due to large data error
  - Two intersections of lower confidence and 90% activity lines
A time-activity simulation was created to generate data for each of the variables studied. This allowed varying the number of data points, the slope of the data and the noise level. The time-activity model was a straight line with superimposed Gaussian noise.
The activity data decreased from around 100% and each value was rounded to two decimal places (e.g., 99.54). This was done to approximate the precision of measured activity data and to facilitate copying and pasting data back and forth from SigmaPlot to SPSS.
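The simulation described above can be sketched as follows. This is only an illustration, not the SigmaPlot transform itself: the function name, the default intercept, slope and noise level, and the sampling times are all assumptions chosen for the example.

```python
import random

def simulate_time_activity(times, intercept=100.0, slope=-0.1,
                           noise_sd=0.3, seed=0):
    """Straight line plus Gaussian noise, rounded to two decimals.

    intercept, slope (percent per month) and noise_sd are hypothetical
    defaults; the source does not state the values actually used.
    """
    rng = random.Random(seed)  # seeded so a data set can be regenerated
    return [round(intercept + slope * t + rng.gauss(0.0, noise_sd), 2)
            for t in times]

# Eight time points (months), one of the data set sizes studied.
times = [0, 3, 6, 9, 12, 18, 24, 36]
activities = simulate_time_activity(times)
for t, y in zip(times, activities):
    print(f"{t:>3}  {y:.2f}")
```

Varying `len(times)`, `slope` and `noise_sd` corresponds to varying the number of data points, the slope of the data and the noise level.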
The shelf life was computed with the macro and then compared to results obtained from the linear regression option of SPSS.
The macro computes the shelf life analytically by solving the equations for the lower 95% confidence line and the 90% activity line.
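The source does not give the macro's equations, but the analytical solution presumably follows the standard form for the confidence limit of a regression mean. As a sketch under that assumption, with fitted line $\hat{y}(t) = a + bt$, residual standard error $s$ on $n-2$ degrees of freedom, mean time $\bar{t}$, mean activity $\bar{y}$, $S_{tt} = \sum_i (t_i - \bar{t})^2$ and Student t critical value $t_c$ (all symbols introduced here for illustration):

```latex
% Lower confidence limit of the mean response
L(t) = a + b\,t - t_c\, s \sqrt{\frac{1}{n} + \frac{(t-\bar{t})^2}{S_{tt}}}

% Set L(t) = 90 and substitute u = t - \bar{t}, d = \bar{y} - 90
% (using a + b\,\bar{t} = \bar{y})
d + b\,u = t_c\, s \sqrt{\frac{1}{n} + \frac{u^2}{S_{tt}}}

% Squaring both sides gives a quadratic in u
\left(b^2 - \frac{t_c^2 s^2}{S_{tt}}\right) u^2 + 2\,b\,d\,u
  + \left(d^2 - \frac{t_c^2 s^2}{n}\right) = 0
```

Because of the squaring step, a root $u$ belongs to the lower confidence line only if $d + b\,u \ge 0$; otherwise it is an intersection of the upper confidence line. The shelf life is then $t = \bar{t} + u$ for the accepted root.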
An iterative procedure was used to compute the shelf life in SPSS. First the simulated data was copied from SigmaPlot to SPSS. Two additional time values that bracketed the shelf life value computed in SigmaPlot were then added to the time column in SPSS. The linear regression was then computed and the 95% confidence levels of the mean were placed into the SPSS worksheet. The lower 95% confidence values for the two additional times then bracketed 90%.
A linear interpolation algorithm was then used to estimate the shelf life time (intersection of the lower 95% mean confidence line with 90% activity) from the two additional time values and corresponding 95% confidence values. The new shelf life was then used in SPSS to refine the two additional time values. This iterative interpolation/refinement process was continued until the time was found for which the lower 95% confidence value equaled 90% to five decimal places (90.00000). The relative percent difference (100*(macro-SPSS)/SPSS) was then computed.
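The iterative interpolation and refinement loop can be sketched as below. In the actual procedure each evaluation of the lower confidence value was an SPSS regression run; here a hypothetical `lower_limit` function with made-up regression summary numbers stands in for SPSS, and the rule for shrinking the bracket is an assumption, since the source does not say how the two additional times were refined.

```python
import math

def lower_limit(t):
    """Stand-in for the SPSS lower 95% confidence value at time t.

    All numbers are hypothetical regression summary values for n = 4.
    """
    intercept, slope, s, n = 99.5, -0.095, 0.15, 4
    t_mean, sxx, t_crit = 18.0, 720.0, 4.302653
    return (intercept + slope * t
            - t_crit * s * math.sqrt(1 / n + (t - t_mean) ** 2 / sxx))

def interpolate_crossing(lower, t_lo, t_hi, target=90.0,
                         tol=5e-6, max_iter=50):
    """Repeat linear interpolation between two times until the lower
    confidence value equals target to five decimal places."""
    for _ in range(max_iter):
        y_lo, y_hi = lower(t_lo), lower(t_hi)
        # Linear interpolation for the time at which lower == target.
        t_new = t_lo + (target - y_lo) * (t_hi - t_lo) / (y_hi - y_lo)
        if abs(lower(t_new) - target) < tol:
            return t_new
        # Refine: halve the bracket width and center it on the estimate.
        quarter = (t_hi - t_lo) / 4
        t_lo, t_hi = t_new - quarter, t_new + quarter
    raise RuntimeError("interpolation did not converge")

def relative_percent_difference(macro, spss):
    return 100 * (macro - spss) / spss

print(interpolate_crossing(lower_limit, 80.0, 90.0))
# n = 4 values quoted in this document; magnitude is below 0.01 percent.
print(relative_percent_difference(114.832, 114.837))
```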
The shelf life time depends on both the regression and the confidence level computations, so a difference in shelf life reflects differences in either computation. Thus, rather than report and compare regression coefficients, 95% confidence values, etc., it was deemed adequate to present the results in terms of shelf life differences only.
Number of Data Points
The number of data points was varied from 3 to 25 and the results are shown in Table 1. Zero, one and two data points were also tested. In these cases the macro prints “n must > 2” in column seven of the worksheet and does not compute any results.
Table 1. Percentage shelf life difference between the SigmaPlot macro and SPSS.
This table shows that all percentage differences are less than one hundredth of one percent. The shelf life times differed at most in the third decimal place; e.g., for n=4 the macro computed a shelf life of 114.832 months whereas SPSS computed 114.837 months.
Missing Values
If a data point has a missing value, the macro deletes the entire row; i.e., both the time and activity values are deleted. Typically the activity value is the one missing, but the macro deletes the row if either variable has a missing value. The macro determines which rows have missing values in either variable and retains only the rows without them. The size of the data set is determined after the rows with missing values have been deleted.
If the number of valid rows falls below 3, the “n must > 2” message is printed in the worksheet. The macro was tested by placing missing values (–), blank cells, cells containing text and cells containing symbols (colors, line types, patterns, symbols, etc.) in various numbers and various places in the data set. In all cases the correct number of rows with valid numerical values was obtained.
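The rowwise deletion can be illustrated with a short sketch. It is not the SigmaPlot transform: the function names are hypothetical, and blank cells are represented here by `None` while missing-value markers, text and symbols are represented by strings.

```python
import math
from itertools import zip_longest

def is_valid_number(cell):
    """True only for finite numeric cells (not None, text, or NaN)."""
    if isinstance(cell, bool) or not isinstance(cell, (int, float)):
        return False
    return math.isfinite(cell)

def drop_incomplete_rows(times, activities):
    """Keep a row only if both its time and activity are valid numbers.

    zip_longest pads the shorter column with None, so a row that exists
    in only one column is treated as having a missing value.
    """
    return [(t, y) for t, y in zip_longest(times, activities)
            if is_valid_number(t) and is_valid_number(y)]

def size_message(rows):
    """Return the worksheet message when too few valid rows remain."""
    return "n must > 2" if len(rows) < 3 else None

times = [0, 3, None, 9, 12, "abc"]
activities = [99.54, "--", 98.70, 97.90, float("nan"), 96.00]
rows = drop_incomplete_rows(times, activities)
print(rows, size_message(rows))
```

The data set size is taken from `len(rows)`, that is, after deletion, which matches the order of operations described above.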
Different Result Types
Given data with measurement error it is possible that the regression line has a positive slope or zero slope. It is also possible that the lower confidence line intersects the 90% activity line in two places. If the measurement error is large the lower 95% confidence line may lie below the 90% activity line with no intersection. The algorithm must gracefully handle these cases.
Positive or Zero Regression Slope
If a positive or zero slope is detected, the algorithm prints “no solution” in the worksheet and does not create the drop lines indicating intersection with the 90% activity line. An example of the graph produced is shown in Figure 2.
Figure 2. This graph occurs when the regression line has a non-negative slope. The drop lines representing intersection with the 90% activity line are not generated in the worksheet and therefore not displayed on the graph.
No Solution Due to Large Data Error
This case is handled exactly like the non-negative slope condition. “no solution” is printed in the worksheet and the drop lines indicating an intersection with the lower confidence line are not generated or graphed. An example of this case is shown in Figure 3.
Figure 3. Large data error can result in a lower confidence line that does not intersect the 90% activity line. The macro displays “no solution” in the worksheet and does not create drop lines representing the intersection of the two lines.
Two Intersections of Lower Confidence and 90% Activity Lines
This situation is somewhat contrived but can occur numerically. The macro determines when there are two possible intersections and then selects the correct one. An example is shown in Figure 4.
Figure 4. This example shows two intersections of the lower confidence and 90% activity lines. The macro selects the correct intersection and displays the appropriate drop lines from the intersecting point.
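The handling of the different result types can be sketched in one routine. This is a minimal illustration under stated assumptions, not the macro's code: it solves the standard quadratic for the intersection of the lower confidence line of the regression mean with the specification line, the caller supplies the Student t critical value `t_crit`, the status labels "typical" and "two intersections" are invented for the example, and taking the later of two intersections as the correct one is an assumption. Degenerate inputs (identical times, a perfect fit) are not treated specially.

```python
import math

def fit(times, activities):
    """Least-squares line, expressed around the mean time."""
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(activities) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean)
              for t, y in zip(times, activities))
    slope = sxy / sxx
    sse = sum((y - y_mean - slope * (t - t_mean)) ** 2
              for t, y in zip(times, activities))
    return n, t_mean, y_mean, sxx, slope, sse / (n - 2)

def lower_limit(t, f, t_crit):
    """Lower confidence limit of the mean response at time t."""
    n, t_mean, y_mean, sxx, slope, s2 = f
    u = t - t_mean
    return y_mean + slope * u - t_crit * math.sqrt(s2 * (1 / n + u * u / sxx))

def shelf_life(times, activities, t_crit, spec=90.0):
    """Return (status, shelf life or None)."""
    if len(times) < 3:
        return "n must > 2", None
    n, t_mean, y_mean, sxx, slope, s2 = fit(times, activities)
    if slope >= 0:                      # positive or zero slope
        return "no solution", None
    d = y_mean - spec
    k = t_crit ** 2 * s2
    # Quadratic in u = t - t_mean from squaring lower_limit(t) == spec.
    qa = slope ** 2 - k / sxx
    qb = 2 * slope * d
    qc = d ** 2 - k / n
    if qa == 0:
        roots = [-qc / qb] if qb != 0 else []
    else:
        disc = qb ** 2 - 4 * qa * qc
        if disc < 0:                    # lower line never reaches spec
            return "no solution", None
        r = math.sqrt(disc)
        roots = [(-qb - r) / (2 * qa), (-qb + r) / (2 * qa)]
    # Squaring also admits upper-line intersections; discard them.
    valid = [u for u in roots if d + slope * u >= 0]
    if not valid:
        return "no solution", None
    status = "two intersections" if len(valid) == 2 else "typical"
    return status, t_mean + max(valid)  # assumption: later crossing

T = [0, 12, 24, 36]
TC = 4.302653  # two-sided 95% Student t value for 2 degrees of freedom
print(shelf_life(T, [99.54, 98.21, 97.35, 96.02], TC))
print(shelf_life(T, [99.5, 99.9, 99.3, 99.6], TC))
```

Two intersections arise when the slope is negative but small relative to its uncertainty, so that the lower confidence line rises to a maximum above the specification line and then falls back through it.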
The shelf life macro accurately computes the shelf life time and contains sufficient logic to account for all known unusual data sets. It is very easy to use, requiring only data entry and running the macro.