API Reference
- class edstan.EdStanModel(model: str, **kwargs)
This class is a child of pystan.CmdStanModel that adds functionality to load common item response models and accept data in common formats to perform MCMC sampling. Only the added functionality is documented here.
- __init__(model: str, **kwargs)
Initializes an EdStanModel instance.
Upon instantiating an EdStanModel instance, the selected model is prepared for sampling. Afterwards, the EdStanModel.sample_from_long() or EdStanModel.sample_from_wide() methods may be used to initiate MCMC sampling with Stan.
- Parameters:
model – The (partial) file name of an edstan model, with matching based on the start of the file name. Consider specifying “rasch”, “2pl”, “rsm”, “grsm”, “pcm”, or “gpcm”.
kwargs – Additional optional arguments passed to the pystan.CmdStanModel parent class.
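The prefix matching described for the model argument can be pictured with the short sketch below. It is not the package's own lookup code: the file names, the match_model_file helper, and the choice to raise an error on an ambiguous prefix are all assumptions made for illustration.

```python
from typing import List

# Hypothetical file names; the files actually bundled with edstan may differ.
MODEL_FILES = [
    "2pl_latent_reg.stan",
    "gpcm_latent_reg.stan",
    "grsm_latent_reg.stan",
    "pcm_latent_reg.stan",
    "rasch_latent_reg.stan",
    "rsm_latent_reg.stan",
]


def match_model_file(model: str, candidates: List[str] = MODEL_FILES) -> str:
    """Return the single candidate whose name starts with `model`."""
    matches = [name for name in candidates if name.startswith(model)]
    if len(matches) != 1:
        # Assumed behavior: reject prefixes that match zero or several files.
        raise ValueError(f"{model!r} matched {len(matches)} files: {matches}")
    return matches[0]


print(match_model_file("rasch"))  # rasch_latent_reg.stan
print(match_model_file("2pl"))    # 2pl_latent_reg.stan
```

Under these assumed names, each of the suggested prefixes resolves to exactly one file, while a shorter prefix such as "g" would be ambiguous.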
- sample_from_dict(data: Dict, **kwargs) -> EdStanMCMC
Sample from the model using a dictionary of data.
Generally it will be more convenient to initialize sampling using the EdStanModel.sample_from_long() or EdStanModel.sample_from_wide() methods, which prepare the required dictionary based on common data formats.
- Parameters:
data – A dictionary of data compatible with the edstan models.
kwargs – Additional arguments passed to pystan.CmdStanModel.sample(), excluding ‘data’. Consider arguments such as ‘chains’, ‘iter_warmup’, ‘iter_sampling’, and ‘adapt_delta’.
- Returns:
A fitted MCMC model.
- sample_from_long(ii: ndarray[Any, dtype[_ScalarType_co]], jj: ndarray[Any, dtype[_ScalarType_co]], y: ndarray[Any, dtype[integer]], integerize: bool = True, **kwargs) -> EdStanMCMC
Sample from the model using response data in the form of several 1D arrays.
This method is appropriate for “long format” item response data in which scored responses are stored in a flat array, and additional flat arrays index the person and item associated with each scored response. Missing responses are accommodated by omitting their entries from all three arrays beforehand.
- Parameters:
ii – A 1D NumPy array representing the item associated with a response. Must be integers if ‘integerize’ is set to False.
jj – A 1D NumPy array representing the person associated with a response. Must be integers if ‘integerize’ is set to False.
y – A 1D NumPy array representing the scored responses. The lowest value is expected to be zero.
integerize – Whether to convert ‘ii’ and ‘jj’ to index arrays starting at one. This should generally be set to True but need not be if ‘ii’ and ‘jj’ are already formatted this way.
kwargs – Additional arguments passed to pystan.CmdStanModel.sample(), excluding ‘data’. Consider arguments such as ‘chains’, ‘iter_warmup’, ‘iter_sampling’, and ‘adapt_delta’.
- Returns:
A fitted MCMC model.
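What converting ‘ii’ and ‘jj’ to index arrays starting at one amounts to can be sketched with NumPy as follows. The integerize helper here is hypothetical and only mirrors the name of the argument; in particular, numbering labels in sorted order is an assumption, and the package may order them differently.

```python
import numpy as np


def integerize(labels: np.ndarray):
    """Map arbitrary labels to consecutive integers starting at one."""
    # np.unique returns the sorted distinct labels and, for every element,
    # the zero-based position of its label in that sorted array.
    unique_labels, inverse = np.unique(labels, return_inverse=True)
    return inverse + 1, unique_labels


items = np.array(["q2", "q1", "q2", "q3"])
persons = np.array([101, 101, 205, 205])

ii, item_labels = integerize(items)
jj, person_labels = integerize(persons)

print(ii)  # [2 1 2 3]
print(jj)  # [1 1 2 2]
```

Keeping the distinct labels alongside the indices makes it possible to map results back to the original item and person identifiers later.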
- sample_from_wide(response_matrix: ndarray[Any, dtype[integer]] | DataFrame, **kwargs) -> EdStanMCMC
Sample from the model using response data in the form of a 2D array or pandas.DataFrame.
This method is appropriate for “wide format” item response data in which scored responses are arranged in a table. Each row represents a person, and each column represents an item.
- Parameters:
response_matrix – A (#persons, #items) 2D array or pandas.DataFrame representing the scored responses. The lowest value is expected to be zero.
kwargs – Additional arguments passed to pystan.CmdStanModel.sample(), excluding ‘data’. Consider arguments such as ‘chains’, ‘iter_warmup’, ‘iter_sampling’, and ‘adapt_delta’.
- Returns:
A fitted MCMC model.
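The relationship between the wide and long formats can be illustrated with a small NumPy sketch. This is not the package's conversion code; the use of -1 to mark a missing response is a convention invented for this example only.

```python
import numpy as np

# Rows are persons, columns are items. The value -1 marks a missing
# response (a hypothetical convention chosen for this sketch).
responses = np.array([
    [1, 0, 1],
    [0, -1, 1],
])

n_persons, n_items = responses.shape

# Flatten row by row, building one-based person and item indices per cell.
jj = np.repeat(np.arange(1, n_persons + 1), n_items)
ii = np.tile(np.arange(1, n_items + 1), n_persons)
y = responses.ravel()

# Drop the missing cell from all three arrays, as the long format requires.
observed = y >= 0
ii, jj, y = ii[observed], jj[observed], y[observed]

print(ii)  # [1 2 3 1 3]
print(jj)  # [1 1 1 2 2]
print(y)   # [1 0 1 0 1]
```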
- class edstan.EdStanMCMC(mcmc: CmdStanMCMC, ii_labels: ndarray[Any, dtype[_ScalarType_co]], jj_labels: ndarray[Any, dtype[_ScalarType_co]], max_per_item: ndarray[Any, dtype[integer]])
A wrapper around pystan.CmdStanMCMC that adds additional methods.
This class delegates all unspecified attribute access to the underlying pystan.CmdStanMCMC instance via EdStanMCMC.__getattr__(). This allows it to behave like a pystan.CmdStanMCMC object while also providing custom methods.
- __init__(mcmc: CmdStanMCMC, ii_labels: ndarray[Any, dtype[_ScalarType_co]], jj_labels: ndarray[Any, dtype[_ScalarType_co]], max_per_item: ndarray[Any, dtype[integer]])
Initializes an EdStanMCMC instance.
An instance of this class is generated by EdStanModel.sample_from_long() or EdStanModel.sample_from_wide(). Though the class may be initialized directly, this is not the intended usage.
- Parameters:
mcmc – An edstan model fitted using MCMC.
ii_labels – Labels associated with the items.
jj_labels – Labels associated with the persons.
max_per_item – The maximum score per item.
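The attribute delegation described for this class follows a common Python wrapper pattern, sketched below with stand-in classes. FittedStandIn and Wrapper are hypothetical names and are not part of edstan; the sketch only shows how __getattr__ forwards lookups that the wrapper itself does not define.

```python
class FittedStandIn:
    """Hypothetical stand-in for the wrapped fit object."""

    def __init__(self):
        self.chains = 4

    def summary(self):
        return "full summary"


class Wrapper:
    """Adds a custom method while forwarding everything else."""

    def __init__(self, inner):
        self._inner = inner

    def __getattr__(self, name):
        # Python calls __getattr__ only after normal lookup on the
        # wrapper fails, so custom methods take precedence.
        return getattr(self._inner, name)

    def item_summary(self):
        return "item view of " + self._inner.summary()


fit = Wrapper(FittedStandIn())
print(fit.chains)          # 4 (forwarded to the wrapped object)
print(fit.summary())       # full summary (forwarded)
print(fit.item_summary())  # item view of full summary (defined on Wrapper)
```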
- item_summary(**kwargs)
A wrapper around pystan.CmdStanMCMC.summary() that provides posterior summaries grouped by item.
- Parameters:
kwargs – Additional optional arguments passed to pystan.CmdStanMCMC.summary(), such as ‘percentiles’ and ‘sig_figs’.
- Returns:
A summary pandas.DataFrame filtered to include item and distribution parameters only, having a multi-index that associates parameters with their respective item labels (or the person distribution).
- person_summary(**kwargs)
A wrapper around pystan.CmdStanMCMC.summary() that provides posterior summaries grouped by person.
- Parameters:
kwargs – Additional optional arguments passed to pystan.CmdStanMCMC.summary(), such as ‘percentiles’ and ‘sig_figs’.
- Returns:
A summary pandas.DataFrame filtered to include person parameters only, having a multi-index that associates parameters with their respective person labels.
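How summary rows might be filtered to person parameters and paired with person labels is sketched below in plain Python. The parameter name pattern 'theta[j]', the sample parameter names, and the label_person_rows helper are assumptions for illustration; the resulting (label, parameter) pairs are the kind of tuples a multi-index could be built from.

```python
import re
from typing import List, Tuple


def label_person_rows(
    param_names: List[str], jj_labels: List[str]
) -> List[Tuple[str, str]]:
    """Pair each 'theta[j]' row with the label of person j (one-based)."""
    pattern = re.compile(r"^theta\[(\d+)\]$")  # assumed naming convention
    rows = []
    for name in param_names:
        match = pattern.match(name)
        if match:
            person_index = int(match.group(1)) - 1  # one-based to zero-based
            rows.append((jj_labels[person_index], name))
    return rows


names = ["lp__", "beta[1]", "theta[1]", "theta[2]"]  # hypothetical row names
labels = ["ana", "ben"]                              # hypothetical person labels

print(label_person_rows(names, labels))
# [('ana', 'theta[1]'), ('ben', 'theta[2]')]
```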
- edstan.data_from_long(ii: ndarray[Any, dtype[_ScalarType_co]], jj: ndarray[Any, dtype[_ScalarType_co]], y: ndarray[Any, dtype[integer]], integerize: bool = True, extended: bool = False) -> Dict
Create a dictionary compatible with the edstan models from several 1D arrays.
In general the EdStanModel.sample_from_long() method will be sufficient for preparing data of this format and performing sampling. This function may be of interest if a copy of the prepared data is desired.
- Parameters:
ii – A 1D NumPy array representing the item associated with a response. Must be integers if ‘integerize’ is set to False.
jj – A 1D NumPy array representing the person associated with a response. Must be integers if ‘integerize’ is set to False.
y – A 1D NumPy array representing the scored responses. The lowest value is expected to be zero.
integerize – Whether to convert ‘ii’ and ‘jj’ to index vectors starting at one. This should generally be set to True.
extended – Whether to add additional metadata keys to the output dictionary. This should generally be set to False if called by the user.
- Returns:
A dictionary representing item response data.
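A rough picture of such a dictionary is sketched below. The build_long_data helper is hypothetical, and the key names ("N", "I", "J", "ii", "jj", "y") are assumptions modeled on common Stan item response conventions rather than confirmed output of this function. The sketch takes indices that already start at one.

```python
from typing import Any, Dict

import numpy as np


def build_long_data(ii: np.ndarray, jj: np.ndarray, y: np.ndarray) -> Dict[str, Any]:
    """Bundle one-based item/person indices and scored responses."""
    if not (ii.shape == jj.shape == y.shape and y.ndim == 1):
        raise ValueError("ii, jj, and y must be 1D arrays of equal length")
    if y.min() != 0:
        raise ValueError("the lowest scored response is expected to be zero")
    # Key names below are assumptions, not confirmed edstan output.
    return {
        "N": int(y.size),    # number of responses
        "I": int(ii.max()),  # number of items
        "J": int(jj.max()),  # number of persons
        "ii": ii,
        "jj": jj,
        "y": y,
    }


data = build_long_data(
    ii=np.array([1, 2, 3, 1, 3]),
    jj=np.array([1, 1, 1, 2, 2]),
    y=np.array([1, 0, 1, 0, 1]),
)
print(data["N"], data["I"], data["J"])  # 5 3 2
```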
- edstan.data_from_wide(response_matrix: ndarray[Any, dtype[integer]] | DataFrame, extended: bool = False) -> Dict
Create a dictionary compatible with the edstan models from a response matrix.
In general the EdStanModel.sample_from_wide() method will be sufficient for preparing data of this format and performing sampling. This function may be of interest if a copy of the prepared data is desired.
- Parameters:
response_matrix – A (#persons, #items) array or pandas.DataFrame representing the scored responses. The lowest value is expected to be zero.
extended – Whether to add additional metadata keys to the output dictionary. This should generally be set to False if called by the user.
- Returns:
A dictionary representing item response data.