statistical_process_control
class
LimitType
Enum for types of control limits
class
ControlLimits
Class to hold control limit specifications
class
WesternElectricRules
Implements Western Electric Rules for Statistical Process Control (SPC).
Supports symmetric, asymmetric, and single-sided control limits.
Rules implemented:
- One point beyond 3 sigma
- 2 out of 3 consecutive points beyond 2 sigma on the same side
- 4 out of 5 consecutive points beyond 1 sigma on the same side
- 8 consecutive points on the same side of the centerline
- 6 consecutive points steadily increasing or decreasing
- 14 consecutive points alternating up and down
- 15 consecutive points within 1 sigma
- 8 consecutive points beyond 1 sigma on either side
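A few of the rules above can be sketched in plain Python to make the wording concrete. This is a minimal illustration, not the class's actual implementation: the function names are hypothetical, plain lists stand in for pandas Series, and the choice to flag the last point of each window is an assumption.

```python
from typing import List, Sequence


def beyond_three_sigma(values: Sequence[float], mean: float, std: float) -> List[bool]:
    """Flag every point that lies more than 3 sigma from the centerline."""
    return [abs(v - mean) > 3 * std for v in values]


def two_of_three_beyond_two_sigma(
    values: Sequence[float], mean: float, std: float
) -> List[bool]:
    """Flag the last point of each 3-point window in which at least
    2 points lie beyond 2 sigma on the same side of the centerline."""
    flags = [False] * len(values)
    for end in range(2, len(values)):
        window = values[end - 2 : end + 1]
        above = sum(v > mean + 2 * std for v in window)
        below = sum(v < mean - 2 * std for v in window)
        flags[end] = above >= 2 or below >= 2
    return flags


def eight_on_one_side(values: Sequence[float], mean: float) -> List[bool]:
    """Flag a point once it is at least the 8th consecutive point on the
    same side of the centerline. Points exactly on the centerline reset the run."""
    flags = [False] * len(values)
    run = 0
    prev_side = 0
    for i, v in enumerate(values):
        side = (v > mean) - (v < mean)  # +1 above, -1 below, 0 on the line
        if side != 0 and side == prev_side:
            run += 1
        else:
            run = 1 if side != 0 else 0
        prev_side = side
        flags[i] = run >= 8
    return flags
```

The remaining rules follow the same pattern: slide a fixed-size window over the series and count the points that satisfy the condition.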
Methods
analyze_series
Generate a dataframe containing the original data, the results of various WER rules, and the
control limits. The result returned can be passed to the generate_report method.
analyze_dataframe
Generate a dictionary of dataframes containing the original data, the results of various WER
rules, and the control limits where the keys are the column names being analyzed. The
result can be passed to the generate_report method.
analyze_groupby
Generate a dictionary of dictionaries where the keys are the groups the WER are applied
across. The values are dictionaries of dataframes containing the original data, the results
of various WER rules, and the control limits where the keys are the column names being analyzed.
The result can be passed to the generate_report method.
generate_report
Pass in the results from analyze_series, analyze_dataframe, or analyze_groupby to generate a
PDF report and display control charts. The report is returned as bytes.
Examples
df = pd.DataFrame(
    {
        "x": range(10),
        "y": range(20, 30),
        "type": 4 * ["a"] + 6 * ["b"],
        "exp": [f"EXP{i}" for i in range(10)],
        "datetime": pd.date_range(start="2023-01-01", periods=10, freq="D"),
    }
)
df = df.set_index(["datetime", "exp"]).sort_index()
# Workflow 1: Use default analysis by calculating distribution parameters and control limits
wer = WesternElectricRules()
res = wer.analyze_series(df["x"])
wer.generate_report("datetime", res)
res = wer.analyze_dataframe(df)
wer.generate_report("datetime", res)
res = wer.analyze_groupby(df.groupby(["type"]))
wer.generate_report("datetime", res)
# Workflow 2: Only show certain rules, update the default limit type but use default analysis
for limit_type in ["upper", "lower", "both"]:
    wer = WesternElectricRules(
        default_limit_type=limit_type, rules_to_ignore=["rule4", "rule5", "rule6", "rule7", "rule8"]
    )
    res = wer.analyze_series(df["x"])
    wer.generate_report("datetime", res)
# Workflow 3: Fix the distribution parameters with nested dictionaries. The outer keys are the
# groups; the inner dictionaries are keyed by column and are passed as arguments to analyze_dataframe.
# analyze_groupby gets called with means = {("b",): {"x": 5}}
# analyze_dataframe gets called with means = {"x": 5}.
means = {("b",): {"x": 5}}
stds = {("b",): {"x": 1}}
limit_types = {("b",): {"x": "lower", "y": "upper"}}
wer = WesternElectricRules(rules_to_ignore=["rule4", "rule5", "rule6", "rule7", "rule8"])
res = wer.analyze_groupby(df.groupby(["type"]), means=means, stds=stds, limit_types=limit_types)
wer.generate_report("datetime", res)
# save off bytes from pdf report
report = wer.generate_report("datetime", res, show_plots=True)
output_file_path = "test.pdf"
with open(output_file_path, "wb") as f:
    f.write(report)
function
WesternElectricRules.__repr__
Print the documentation of the class.
function
WesternElectricRules.__init__
Initialize WesternElectricRules with flexible control limit options.
Parameters
default_limit_type : LimitType
Set the default limit type for the analysis. Import the LimitType enum from this module
to see the available options.
only_flag_last_points : bool
If True, only the last point in a violation sequence will be flagged.
rules_to_ignore : list
List of rules to ignore during analysis.
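The effect of only_flag_last_points can be pictured with a small sketch. This is one plausible reading of "only the last point in a violation sequence", assuming a violation sequence is a run of consecutive flagged points. The helper name is hypothetical and is not part of the class.

```python
from typing import List, Sequence


def keep_last_of_each_run(flags: Sequence[bool]) -> List[bool]:
    """Keep only the final True in every run of consecutive True values."""
    result = []
    for i, flag in enumerate(flags):
        # A point stays flagged only if the next point is not flagged.
        next_flag = flags[i + 1] if i + 1 < len(flags) else False
        result.append(bool(flag) and not next_flag)
    return result
```

With this reading, a run of three flagged points collapses to a single flag on the third point, which keeps control charts less cluttered.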
function
WesternElectricRules.analyze_series
Analyze a pandas Series for control limit violations according to Western Electric Rules.
This function evaluates a given pandas Series against specified control limits to identify
any violations. It allows for optional overrides of the mean, standard deviation, limit
type, and control limits. The function returns a DataFrame that includes the original data,
the results of various control rules, and any control limit values if provided.
Parameters
data : pandas.Series
Pandas Series to analyze.
mean : float, optional
Optional mean value to override instance mean.
std : float, optional
Optional standard deviation value to override instance std.
limit_type : str, LimitType, optional
Optional limit type to override instance limit_type, defaults to default_limit_type
attribute.
control_limits : dict, optional
Optional control limits to override instance limits.
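The interaction between the mean, std, and limit_type overrides can be sketched as follows. This is an illustration under stated assumptions: the function name and the dictionary keys are hypothetical, the limit type strings "upper", "lower", and "both" are taken from the examples in this document, and the sample standard deviation is used although the class may estimate it differently.

```python
from statistics import mean, stdev
from typing import Dict, Optional, Sequence


def sigma_limits(
    data: Sequence[float],
    mean_override: Optional[float] = None,
    std_override: Optional[float] = None,
    limit_type: str = "both",
) -> Dict[str, float]:
    """Build 1, 2, and 3 sigma limits, estimating any parameter not supplied."""
    center = mean(data) if mean_override is None else mean_override
    spread = stdev(data) if std_override is None else std_override
    limits = {"center": center}
    for k in (1, 2, 3):
        if limit_type in ("upper", "both"):
            limits[f"upper_{k}sigma"] = center + k * spread
        if limit_type in ("lower", "both"):
            limits[f"lower_{k}sigma"] = center - k * spread
    return limits
```

A single-sided limit type simply omits the limits on the other side, so rules that test that side have nothing to compare against.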
function
WesternElectricRules.analyze_dataframe
Analyze multiple columns in a DataFrame for control limit violations according to Western
Electric Rules.
This function evaluates multiple columns in a given DataFrame against specified control
limits to identify any violations. It allows for optional overrides of the mean, standard
deviation, limit type, and control limits for each column. The function returns a dictionary
containing the analysis results for each column.
This acts as a wrapper that calls WesternElectricRules.analyze_series() for each column in the
DataFrame.
Parameters
df : pd.DataFrame
The input DataFrame containing the data to be analyzed.
columns : list of str, optional
The list of column names to be analyzed. If None, all numeric columns
in the DataFrame will be analyzed (default is None).
means : dict of str to float, optional
A dictionary containing the mean values for each column. If not provided,
the mean will be calculated from the data (default is {}).
stds : dict of str to float, optional
A dictionary containing the standard deviation values for each column.
If not provided, the standard deviation will be calculated from the data
(default is {}).
limit_types : dict of str to Union[str, LimitType], optional
A dictionary specifying the type of control limits to be used for each column.
The value can be a string or an instance of the LimitType class (default is {}).
control_limits : dict of str to dict of str to float, optional
A dictionary containing the control limits for each column. The keys are
column names and the values are dictionaries with control limit names as
keys and their corresponding values (default is {}).
Returns
dict
A dictionary containing the analysis results for each column. The keys
are column names and the values are the results of the analysis.
Notes
The function performs the following steps:
- If columns is None, selects all numeric columns in the DataFrame.
- Iterates over the specified columns.
- For each column, checks if it exists in the DataFrame and is of numeric type.
- Calls the analyze_series method to perform the analysis on the column.
- Stores the analysis results in a dictionary and returns it.
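The steps above can be sketched without pandas. In this illustration a dict of lists stands in for the DataFrame, analyze_one is a hypothetical stand-in for analyze_series, and silently skipping missing or non-numeric columns is an assumption (the real method may warn or raise instead).

```python
from numbers import Real
from typing import Any, Callable, Dict, List, Optional


def is_numeric(values: List[Any]) -> bool:
    """True when every value is a real number (booleans excluded)."""
    return all(isinstance(v, Real) and not isinstance(v, bool) for v in values)


def analyze_columns(
    table: Dict[str, List[Any]],
    analyze_one: Callable[..., Any],
    columns: Optional[List[str]] = None,
    means: Optional[Dict[str, float]] = None,
) -> Dict[str, Any]:
    """Apply analyze_one to each requested numeric column of the table."""
    means = means or {}
    if columns is None:
        # Step 1: default to all numeric columns.
        columns = [name for name, values in table.items() if is_numeric(values)]
    results = {}
    for name in columns:  # Step 2: iterate over the specified columns.
        # Step 3: the column must exist and be numeric.
        if name not in table or not is_numeric(table[name]):
            continue
        # Step 4: delegate to the per-series analysis with this column's override.
        results[name] = analyze_one(table[name], mean=means.get(name))
    return results  # Step 5: results keyed by column name.
```

The same per-column lookup would apply to stds, limit_types, and control_limits; they are left out here to keep the sketch short.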
Examples
>>> df = pd.DataFrame({
...     'A': [1, 2, 3, 4, 5],
...     'B': [5, 4, 3, 2, 1]
... })
>>> wer = WesternElectricRules()
>>> results = wer.analyze_dataframe(df)
>>> print(results)
function
WesternElectricRules.analyze_groupby
Analyze groups in a GroupBy object against control limits according to Western Electric
Rules.
This function evaluates multiple groups in a given GroupBy object against specified control
limits to identify any violations. It allows for optional overrides of the mean, standard
deviation, limit type, and control limits for each column within each group.
The function returns a dictionary containing the analysis results for each group.
This acts as a wrapper for WesternElectricRules.analyze_dataframe() for each group.
Parameters
groupby_obj : pd.core.groupby.GroupBy
The GroupBy object containing the groups to be analyzed.
columns : list of str, optional
The list of column names to be analyzed. If None, all numeric columns
in each group will be analyzed (default is None).
means : dict of str to dict of str to float, optional
A dictionary containing the mean values for each column in each group.
The keys are group names and the values are dictionaries with column
names as keys and their corresponding mean values (default is {}).
stds : dict of str to dict of str to float, optional
A dictionary containing the standard deviation values for each column
in each group. The keys are group names and the values are dictionaries
with column names as keys and their corresponding standard deviation
values (default is {}).
limit_types : dict of str to dict of str to Union[str, LimitType], optional
A dictionary specifying the type of control limits to be used for each
column in each group. The keys are group names and the values are
dictionaries with column names as keys and their corresponding limit
types (default is {}).
control_limits : dict of str to dict of str to dict of str to float, optional
A dictionary containing the control limits for each column in each group.
The keys are group names and the values are dictionaries with column
names as keys and their corresponding control limit dictionaries
(default is {}).
Returns
dict
A dictionary containing the analysis results for each group. The keys
are group names and the values are dictionaries with column names as
keys and their corresponding analysis results.
Examples
>>> df = pd.DataFrame({
...     'A': [1, 2, 3, 4, 5],
...     'B': [5, 4, 3, 2, 1],
...     'group': ['X', 'X', 'Y', 'Y', 'Y']
... })
>>> groupby_obj = df.groupby('group')
>>> wer = WesternElectricRules()
>>> results = wer.analyze_groupby(groupby_obj)
>>> print(results)
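How the nested override dictionaries are unpacked per group can be sketched as follows. This is a simplified illustration, assuming that groups without an entry fall back to an empty dictionary (that is, to values calculated from the data). The names analyze_groups and analyze_table are hypothetical, and a dict of group key to table stands in for the GroupBy object.

```python
from typing import Any, Callable, Dict, Hashable, Optional


def analyze_groups(
    groups: Dict[Hashable, Any],
    analyze_table: Callable[..., Any],
    means: Optional[Dict[Hashable, Dict[str, float]]] = None,
) -> Dict[Hashable, Any]:
    """Call analyze_table once per group, passing only that group's overrides."""
    means = means or {}
    return {
        key: analyze_table(table, means=means.get(key, {}))
        for key, table in groups.items()
    }
```

With means = {("b",): {"x": 5}}, the call for group ("b",) receives {"x": 5} and the call for every other group receives an empty dictionary.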
function
WesternElectricRules.generate_report
Generate a report by processing the results and generating plots. A pdf report is generated
as bytes for saving off to storage.
Accepts the results from analyze_series, analyze_dataframe, or analyze_groupby to generate a
PDF report and display control charts. An x variable must be given to assign to the x-axis
in the charts.
Parameters
x_var : str
The name of the column to be used for the x-axis in the plots.
results : dict
A dictionary containing the results to be processed. The dictionary can
contain nested dictionaries and DataFrames.
show_plots : bool, optional
Whether to display the plots for viewing.
Returns
bytes
The PDF report containing the generated plots, as bytes.
Examples
>>> results = {
...     'group1': pd.DataFrame({
...         'x': range(10),
...         'value': [1, 2, 1, 2, 1, 2, 1, 2, 1, 2]
...     }),
...     'group2': {
...         'subgroup1': pd.DataFrame({
...             'x': range(10),
...             'value': [2, 3, 2, 3, 2, 3, 2, 3, 2, 3]
...         })
...     }
... }
>>> wer = WesternElectricRules()
>>> wer.generate_report(x_var='x', results=results)
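Because results may mix DataFrames and nested dictionaries, a report generator has to walk the structure before plotting. One way to do that is a recursive traversal, sketched below. The helper name iter_leaves is hypothetical, and how generate_report actually traverses its input is not documented here.

```python
from typing import Any, Dict, Iterator, Tuple


def iter_leaves(
    results: Dict[Any, Any], path: Tuple[Any, ...] = ()
) -> Iterator[Tuple[Tuple[Any, ...], Any]]:
    """Yield (key_path, leaf) pairs for every non-dict value in a nested dict."""
    for key, value in results.items():
        if isinstance(value, dict):
            # Descend into nested groups, extending the key path.
            yield from iter_leaves(value, path + (key,))
        else:
            # Anything that is not a dict (for example a DataFrame) is a leaf.
            yield path + (key,), value
```

The key path, such as ("group2", "subgroup1"), gives a natural title for each chart.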