NestedLogit.preprocess_model_data
- NestedLogit.preprocess_model_data(choice_df, utility_equations)
Pre-process the model initiation inputs into a format that can be used by the PyMC model.
This method prepares the 3D design matrix X, the fixed covariate matrix F (if applicable), and the encoded response vector y, while also extracting and storing relevant metadata such as alternatives, fixed covariate names, product index mappings, and nesting structures.
- Parameters:
- choice_df
pd.DataFrame
A pandas DataFrame containing the observed choices and covariates for each alternative. Each row represents an individual choice observation.
- utility_equations
list[str]
A list of model formulas, one per alternative. Each formula should be of the form "alt_name ~ alt_covariates | fixed_covariates". The left-hand side identifies the alternative name; the right-hand side specifies the covariates used to explain utility for that alternative.
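To make the formula layout concrete, the following minimal sketch splits strings of the form described above into their three parts. The helper split_utility_equation, the "+" separator between terms, and the example column names are all assumptions made for illustration; this is not the parser the library uses internally.

```python
def split_utility_equation(equation):
    """Split 'alt ~ a + b | f' into (alt_name, alt_covariates, fixed_covariates).

    Hypothetical helper: assumes terms on each side of '|' are joined by '+'.
    """
    target, rhs = (part.strip() for part in equation.split("~", 1))
    # Everything before '|' is alternative-specific; everything after is fixed.
    alt_part, _, fixed_part = rhs.partition("|")
    alt_covariates = [t.strip() for t in alt_part.split("+") if t.strip()]
    fixed_covariates = [t.strip() for t in fixed_part.split("+") if t.strip()]
    return target, alt_covariates, fixed_covariates


# One formula per alternative (illustrative names only).
utility_equations = [
    "bus ~ bus_cost + bus_time | income",
    "car ~ car_cost + car_time | income",
    "train ~ train_cost + train_time | income",
]

parsed = [split_utility_equation(eq) for eq in utility_equations]
alternatives = [alt for alt, _, _ in parsed]

for alt, alt_covs, fixed_covs in parsed:
    print(alt, alt_covs, fixed_covs)
```

In this sketch a formula without a "|" part simply yields an empty list of fixed covariates, which mirrors the documented case where F is None.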
- Returns:
- X
np.ndarray
A 3D numpy array of shape (n_observations, n_alternatives, n_covariates), representing the covariate tensor for alternative-specific attributes.
- F
np.ndarray | None
A 2D numpy array of shape (n_observations, n_fixed_covariates) for covariates shared across alternatives, or None if no such covariates are used.
- y
np.ndarray
A 1D numpy array of encoded target labels (integers), where each entry represents the chosen alternative for an observation.
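The shapes of the three return values can be illustrated with plain NumPy. The sketch below builds arrays with the documented shapes from a small hypothetical wide-format table, written as a dict of columns standing in for choice_df. The column names, the covariate grouping, and the label encoding (position of each alternative in the list) are assumptions for illustration, not the method's actual implementation.

```python
import numpy as np

# Hypothetical wide-format data: one value per observation in each column.
columns = {
    "choice": ["bus", "car", "car", "train"],
    "bus_cost": [2.0, 2.5, 2.0, 3.0],
    "bus_time": [40.0, 35.0, 50.0, 45.0],
    "car_cost": [5.0, 6.0, 4.5, 7.0],
    "car_time": [20.0, 25.0, 30.0, 22.0],
    "train_cost": [3.0, 3.5, 3.0, 4.0],
    "train_time": [30.0, 28.0, 35.0, 33.0],
    "income": [30.0, 55.0, 42.0, 61.0],
}
alternatives = ["bus", "car", "train"]
alt_covariates = {
    "bus": ["bus_cost", "bus_time"],
    "car": ["car_cost", "car_time"],
    "train": ["train_cost", "train_time"],
}
fixed_covariates = ["income"]

# X: one (n_observations, n_covariates) block per alternative, stacked on axis 1.
X = np.stack(
    [
        np.column_stack([columns[c] for c in alt_covariates[alt]])
        for alt in alternatives
    ],
    axis=1,
)

# F: covariates shared across alternatives, one column each.
F = np.column_stack([columns[c] for c in fixed_covariates])

# y: integer code of the chosen alternative (assumed: index in `alternatives`).
alt_to_code = {alt: i for i, alt in enumerate(alternatives)}
y = np.array([alt_to_code[c] for c in columns["choice"]])

print(X.shape, F.shape, y.shape)
```

Here X[i, j] holds the covariates of alternative j for observation i, so the second axis lines up with the integer codes stored in y.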
Notes
- Updates internal state: assigns X_data, F, alternatives, fixed_covar, y, prod_indices, nest_indices, all_nests, lambda_lkup, and coords.
- Handles multi-level nesting structures if provided in self.nesting_structure.
- Assumes the existence of instance attributes depvar, covariates, and nesting_structure.
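As a rough illustration of what mapping alternatives to nests involves, the sketch below derives integer indices from a one-level nesting dictionary. The dictionary format (nest name mapped to a list of alternatives) and the names prod_index, nest_members, and nest_of_alt are assumptions for this example; the actual layout of prod_indices, nest_indices, and lambda_lkup stored on the model may differ, and multi-level nesting is not covered here.

```python
# Assumed one-level nesting: nest name -> alternatives belonging to that nest.
nesting_structure = {
    "land": ["bus", "car", "train"],
    "air": ["plane"],
}
alternatives = ["bus", "car", "train", "plane"]

# Position of each alternative along the alternatives axis of X.
prod_index = {alt: i for i, alt in enumerate(alternatives)}

# For each nest, the integer positions of its member alternatives.
nest_members = {
    nest: [prod_index[alt] for alt in members]
    for nest, members in nesting_structure.items()
}

# Reverse lookup: which nest each alternative belongs to.
nest_of_alt = [None] * len(alternatives)
for nest, members in nest_members.items():
    for idx in members:
        nest_of_alt[idx] = nest

print(nest_members)
print(nest_of_alt)
```

Index lists of this kind let a model select the utilities of all alternatives within one nest at once.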