
statistical_process_control

class LimitType

Enum for types of control limits

class ControlLimits

Class to hold control limit specifications

class WesternElectricRules

Implements Western Electric Rules for Statistical Process Control (SPC).
Supports symmetric, asymmetric, and single-sided control limits.

Rules implemented:

  1. One point beyond 3 sigma
  2. 2 out of 3 consecutive points beyond 2 sigma on the same side
  3. 4 out of 5 consecutive points beyond 1 sigma on the same side
  4. 8 consecutive points on the same side of the centerline
  5. 6 consecutive points steadily increasing or decreasing
  6. 14 consecutive points alternating up and down
  7. 15 consecutive points within 1 sigma
  8. 8 consecutive points beyond 1 sigma on either side
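To make the rule definitions concrete, here is a minimal pure-Python sketch of rules 1, 2, and 4. It is an illustration only and is not taken from this class: the function names `rule1`, `rule2`, and `rule4` are hypothetical, the centerline and sigma are passed in as fixed numbers, and each windowed rule flags only the point that completes the pattern.

```python
def rule1(values, center, sigma):
    """Rule 1: flag each point more than 3 sigma from the centerline."""
    return [abs(v - center) > 3 * sigma for v in values]


def rule2(values, center, sigma):
    """Rule 2: flag the last point of any 3-point window in which at least
    2 points lie beyond 2 sigma on the same side of the centerline."""
    flags = [False] * len(values)
    for end in range(2, len(values)):
        window = values[end - 2 : end + 1]
        above = sum(v > center + 2 * sigma for v in window)
        below = sum(v < center - 2 * sigma for v in window)
        flags[end] = above >= 2 or below >= 2
    return flags


def rule4(values, center, run_length=8):
    """Rule 4: flag every point that completes (or extends) a run of
    `run_length` consecutive points on one side of the centerline."""
    flags = []
    run, last_side = 0, 0
    for v in values:
        side = (v > center) - (v < center)  # +1 above, -1 below, 0 on the centerline
        if side == 0:
            run = 0          # a point on the centerline breaks the run
        elif side == last_side:
            run += 1         # same side as the previous point
        else:
            run = 1          # side changed, start a new run
        last_side = side
        flags.append(run >= run_length)
    return flags


readings = [0.2, -0.4, 3.5, 0.1, 2.3, 2.6]
print(rule1(readings, center=0.0, sigma=1.0))  # [False, False, True, False, False, False]
print(rule2(readings, center=0.0, sigma=1.0))  # [False, False, False, False, True, True]
```

The remaining rules follow the same pattern: a sliding window or a running counter compared against a sigma zone.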

Methods

analyze_series
        Generate a dataframe containing the original data, the results of various WER rules, and the
        control limits. The result returned can be passed to the generate_report method.
analyze_dataframe
        Generate a dictionary of dataframes containing the original data, the results of various WER
        rules, and the control limits where the keys are the column names being analyzed. The
        result can be passed to the generate_report method.
analyze_groupby
        Generate a dictionary of dictionaries where the keys are the groups the WER are applied
        across. The values are dictionaries of dataframes containing the original data, the results
        of various WER rules, and the control limits where the keys are the column names being analyzed.
        The result can be passed to the generate_report method.
generate_report
        Pass in the results from analyze_series, analyze_dataframe, or analyze_groupby to generate a
        PDF report and display control charts. The report is returned as bytes.

Examples

import pandas as pd

df = pd.DataFrame(
    {
        "x": range(10),
        "y": range(20, 30),
        "type": 4 * ["a"] + 6 * ["b"],
        "exp": [f"EXP{i}" for i in range(10)],
        "datetime": pd.date_range(start="2023-01-01", periods=10, freq="D"),
    }
)
df = df.set_index(["datetime", "exp"]).sort_index()

# Workflow 1: Use default analysis by calculating distribution parameters and control limits
wer = WesternElectricRules()

res = wer.analyze_series(df["x"])
wer.generate_report("datetime", res)

res = wer.analyze_dataframe(df)
wer.generate_report("datetime", res)

res = wer.analyze_groupby(df.groupby(["type"]))
wer.generate_report("datetime", res)

# Workflow 2: Only show certain rules, update the default limit type but use default analysis
for limit_type in ["upper", "lower", "both"]:
    wer = WesternElectricRules(
        default_limit_type=limit_type,
        rules_to_ignore=["rule4", "rule5", "rule6", "rule7", "rule8"],
    )
    res = wer.analyze_series(df["x"])
    wer.generate_report("datetime", res)

# Workflow 3: Fix the distribution parameters with nested dictionaries. The outer keys are the
# groups; the inner dictionaries are keyed by column and hold the arguments passed to analyze_dataframe.
# analyze_groupby gets called with means = {("b",): {"x": 5}}
# analyze_dataframe gets called with means = {"x": 5}.
means = {("b",): {"x": 5}}
stds = {("b",): {"x": 1}}
limit_types = {("b",): {"x": "lower", "y": "upper"}}

wer = WesternElectricRules(rules_to_ignore=["rule4", "rule5", "rule6", "rule7", "rule8"])
res = wer.analyze_groupby(df.groupby(["type"]), means=means, stds=stds, limit_types=limit_types)
wer.generate_report("datetime", res)

# save off bytes from pdf report
report = wer.generate_report("datetime", res, show_plots=True)

output_file_path = "test.pdf"
with open(output_file_path, "wb") as f:
    f.write(report)

function WesternElectricRules.__repr__

Print the documentation of the class.

function WesternElectricRules.__init__

Initialize WesternElectricRules with flexible control limit options.

Parameters

default_limit_type : LimitType
        Set the default limit type for the analysis. Import the LimitType object from this path
        to see examples.
only_flag_last_points : bool
        If True, only the last point in a violation sequence will be flagged.
rules_to_ignore : list
        List of rules to ignore during analysis.
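
As an illustration of what only_flag_last_points could mean in practice, the sketch below reduces each consecutive run of flagged points to its final point. This is one plausible reading of "violation sequence", not the class's confirmed behavior, and the helper name `keep_last_in_each_run` is hypothetical.

```python
def keep_last_in_each_run(flags):
    """Keep only the final True of every consecutive run of True values."""
    result = []
    for i, flagged in enumerate(flags):
        # A point stays flagged only if the next point is not flagged.
        next_flagged = i + 1 < len(flags) and flags[i + 1]
        result.append(flagged and not next_flagged)
    return result


all_points = [False, True, True, True, False, True]
print(keep_last_in_each_run(all_points))  # [False, False, False, True, False, True]
```

Flagging only the last point keeps a chart readable when a long run would otherwise mark many neighboring points.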

function WesternElectricRules.analyze_series

Analyze a pandas Series for control limit violations according to Western Electric Rules.

This function evaluates a given pandas Series against specified control limits to identify
any violations. It allows for optional overrides of the mean, standard deviation, limit
type, and control limits. The function returns a DataFrame that includes the original data,
the results of various control rules, and any control limit values if provided.

Parameters

data : pandas.Series
        Pandas Series to analyze.
mean : float, optional
        Optional mean value to override instance mean.
std : float, optional
        Optional standard deviation value to override instance std.
limit_type : str, LimitType, optional
        Optional limit type to override instance limit_type, defaults to default_limit_type
        attribute.
control_limits : dict, optional
        Optional control limits to override instance limits.
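
The relationship between mean, std, and limit_type can be sketched as follows. This is a hedged illustration under stated assumptions: the helper `sigma_limits` and the dictionary key names (`center`, `upper_1sigma`, and so on) are hypothetical and are not the format this method expects for control_limits. Only the limit type strings "upper", "lower", and "both" come from the class examples.

```python
def sigma_limits(mean, std, limit_type="both"):
    """Return 1-, 2-, and 3-sigma limits for the requested side(s).

    Key names are illustrative assumptions, not the library's own format.
    """
    if limit_type not in {"upper", "lower", "both"}:
        raise ValueError(f"unknown limit type: {limit_type!r}")
    limits = {"center": mean}
    for k in (1, 2, 3):
        if limit_type in ("upper", "both"):
            limits[f"upper_{k}sigma"] = mean + k * std
        if limit_type in ("lower", "both"):
            limits[f"lower_{k}sigma"] = mean - k * std
    return limits


print(sigma_limits(5, 1, "lower"))
# {'center': 5, 'lower_1sigma': 4, 'lower_2sigma': 3, 'lower_3sigma': 2}
```

A single-sided limit type simply omits the zones on the other side, so rules that depend on those zones cannot fire there.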

function WesternElectricRules.analyze_dataframe

Analyze multiple columns in a DataFrame for control limit violations according to Western
Electric Rules.

This function evaluates multiple columns in a given DataFrame against specified control
limits to identify any violations. It allows for optional overrides of the mean, standard
deviation, limit type, and control limits for each column. The function returns a dictionary
containing the analysis results for each column.

This acts as a wrapper that calls WesternElectricRules.analyze_series() for each column in the
DataFrame.

Parameters

df : pd.DataFrame
        The input DataFrame containing the data to be analyzed.
columns : list of str, optional
        The list of column names to be analyzed. If None, all numeric columns
        in the DataFrame will be analyzed (default is None).
means : dict of str to float, optional
        A dictionary containing the mean values for each column. If not provided,
        the mean will be calculated from the data (default is {}).
stds : dict of str to float, optional
        A dictionary containing the standard deviation values for each column.
        If not provided, the standard deviation will be calculated from the data
        (default is {}).
limit_types : dict of str to Union[str, LimitType], optional
        A dictionary specifying the type of control limits to be used for each column.
        The value can be a string or an instance of the LimitType class (default is {}).
control_limits : dict of str to dict of str to float, optional
        A dictionary containing the control limits for each column. The keys are
        column names and the values are dictionaries with control limit names as
        keys and their corresponding values (default is {}).

Returns

dict
        A dictionary containing the analysis results for each column. The keys
        are column names and the values are the results of the analysis.

Notes

The function performs the following steps:

  1. If columns is None, selects all numeric columns in the DataFrame.
  2. Iterates over the specified columns.
  3. For each column, checks if it exists in the DataFrame and is of numeric type.
  4. Calls the analyze_series method to perform the analysis on the column.
  5. Stores the analysis results in a dictionary and returns it.
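
The steps above can be sketched as a small loop. This is a minimal illustration, not the method's source: `analyze_columns` and its `analyze_one` callback are hypothetical stand-ins for analyze_dataframe and analyze_series, and silently skipping a missing or non-numeric column is an assumption (the real method may warn or raise instead).

```python
import pandas as pd


def analyze_columns(df, analyze_one, columns=None):
    """Apply `analyze_one` to each numeric column and collect the results."""
    # Step 1: default to every numeric column.
    if columns is None:
        columns = df.select_dtypes(include="number").columns.tolist()

    results = {}
    # Step 2: iterate over the requested columns.
    for col in columns:
        # Step 3: skip columns that are missing or not numeric (assumed behavior).
        if col not in df.columns or not pd.api.types.is_numeric_dtype(df[col]):
            continue
        # Step 4: delegate the per-column analysis.
        results[col] = analyze_one(df[col])
    # Step 5: return the results keyed by column name.
    return results


frame = pd.DataFrame({"A": [1, 2, 3], "B": ["x", "y", "z"]})
print(analyze_columns(frame, lambda s: float(s.mean())))  # {'A': 2.0}
```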

Examples

>>> df = pd.DataFrame({
...     'A': [1, 2, 3, 4, 5],
...     'B': [5, 4, 3, 2, 1]
... })
>>> wer = WesternElectricRules()
>>> results = wer.analyze_dataframe(df)
>>> print(results)

function WesternElectricRules.analyze_groupby

Analyze groups in a GroupBy object against control limits according to Western Electric
Rules.

This function evaluates multiple groups in a given GroupBy object against specified control
limits to identify any violations. It allows for optional overrides of the mean, standard
deviation, limit type, and control limits for each column within each group.
The function returns a dictionary containing the analysis results for each group.

This acts as a wrapper for WesternElectricRules.analyze_dataframe() for each group.

Parameters

groupby_obj : pd.core.groupby.GroupBy
        The GroupBy object containing the groups to be analyzed.
columns : list of str, optional
        The list of column names to be analyzed. If None, all numeric columns
        in each group will be analyzed (default is None).
means : dict of str to dict of str to float, optional
        A dictionary containing the mean values for each column in each group.
        The keys are group names and the values are dictionaries with column
        names as keys and their corresponding mean values (default is {}).
stds : dict of str to dict of str to float, optional
        A dictionary containing the standard deviation values for each column
        in each group. The keys are group names and the values are dictionaries
        with column names as keys and their corresponding standard deviation
        values (default is {}).
limit_types : dict of str to dict of str to Union[str, LimitType], optional
        A dictionary specifying the type of control limits to be used for each
        column in each group. The keys are group names and the values are
        dictionaries with column names as keys and their corresponding limit
        types (default is {}).
control_limits : dict of str to dict of str to dict of str to float, optional
        A dictionary containing the control limits for each column in each group.
        The keys are group names and the values are dictionaries with column
        names as keys and their corresponding control limit dictionaries
        (default is {}).

Returns

dict
        A dictionary containing the analysis results for each group. The keys
        are group names and the values are dictionaries with column names as
        keys and their corresponding analysis results.
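
The nested override dictionaries are easiest to read as "one inner dictionary per group". The sketch below shows how per-group arguments could be peeled off before delegating to the per-DataFrame analysis. It is an assumption-laden illustration: `split_overrides_by_group` is a hypothetical helper, and falling back to an empty dictionary for groups without overrides is assumed rather than confirmed.

```python
def split_overrides_by_group(group_keys, means=None, stds=None, limit_types=None):
    """Map each group key to the inner (per-column) override dictionaries."""
    means, stds, limit_types = means or {}, stds or {}, limit_types or {}
    per_group = {}
    for key in group_keys:
        per_group[key] = {
            # Groups with no entry fall back to an empty dict (assumed),
            # which leaves the values to be computed from the data.
            "means": means.get(key, {}),
            "stds": stds.get(key, {}),
            "limit_types": limit_types.get(key, {}),
        }
    return per_group


overrides = split_overrides_by_group(
    [("a",), ("b",)],
    means={("b",): {"x": 5}},
    stds={("b",): {"x": 1}},
)
print(overrides[("b",)]["means"])  # {'x': 5}
print(overrides[("a",)]["means"])  # {}
```

Note that the group keys here are tuples such as ("b",), matching the class-level example that groups by a list of column names.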

Examples

>>> df = pd.DataFrame({
...     'A': [1, 2, 3, 4, 5],
...     'B': [5, 4, 3, 2, 1],
...     'group': ['X', 'X', 'Y', 'Y', 'Y']
... })
>>> groupby_obj = df.groupby('group')
>>> wer = WesternElectricRules()
>>> results = wer.analyze_groupby(groupby_obj)
>>> print(results)

function WesternElectricRules.generate_report

Generate a report by processing the results and generating plots. A pdf report is generated
as bytes for saving off to storage.

Accepts the results from analyze_series, analyze_dataframe, or analyze_groupby to generate a
PDF report and display control charts. An x variable must be given to assign to the x-axis
in the charts.

Parameters

x_var : str
        The name of the column to be used for the x-axis in the plots.
results : dict
        A dictionary containing the results to be processed. The dictionary can
        contain nested dictionaries and DataFrames.
show_plots : bool, optional
        If True, display the plots in addition to building the report.

Returns

bytes
        The generated plots, saved to a PDF and returned as bytes.
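
Because results may be a single DataFrame, a dictionary of DataFrames, or a dictionary of dictionaries, a report generator has to walk the nesting before plotting. The sketch below shows one way to do that with a recursive generator. It is not the method's source: `iter_leaves` is a hypothetical helper, and plain strings stand in for DataFrames so the example runs without any dependencies.

```python
def iter_leaves(results, path=()):
    """Yield (path, leaf) pairs for every non-dict value in a nested dict."""
    if isinstance(results, dict):
        for key, value in results.items():
            # Extend the path with the current key and recurse.
            yield from iter_leaves(value, path + (key,))
    else:
        # Anything that is not a dict (e.g. a DataFrame) is treated as a leaf.
        yield path, results


nested = {"group1": "df1", "group2": {"subgroup1": "df2"}}
for path, leaf in iter_leaves(nested):
    print(" / ".join(path), "->", leaf)
# group1 -> df1
# group2 / subgroup1 -> df2
```

The path tuple is a natural source for chart titles, since it records the group and column each leaf belongs to.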

Examples

>>> results = {
...     'group1': pd.DataFrame({
...         'x': range(10),
...         'value': [1, 2, 1, 2, 1, 2, 1, 2, 1, 2]
...     }),
...     'group2': {
...         'subgroup1': pd.DataFrame({
...             'x': range(10),
...             'value': [2, 3, 2, 3, 2, 3, 2, 3, 2, 3]
...         })
...     }
... }
>>> wer = WesternElectricRules()
>>> wer.generate_report(x_var='x', results=results)