Basic_Operations

Philip I. Pavlik Jr.

2024-07-01

LKT (Logistic Knowledge Tracing) Framework

To use LKT, one needs to have:

* Terms of the model, each including a component level that can be characterized by a feature that describes change across repetitions. The component levels are typically just some of the column headers, such as the skill, student, or item.
* A sequence of learner event data with, at the barest minimum, a user id column (which must be named “Anon.Student.Id”) and a correctness column (which must be named “Outcome”, with the values “CORRECT” and “INCORRECT”).

It is wise to check the example datasets to see how they are coded. The small sample “samplelkt” has only 4 columns and can be used to create simple models. It illustrates a minimal format for functionality. “largerrawsample” illustrates a typical dataset with more complex components and additional information. The many examples in the Examples webpage on CRAN use this dataset to illustrate a plethora of possible analyses. In the Examples file we also illustrate how to load Carnegie Learning Cognitive Tutor and ASSISTments datasets (both well-known learning systems).

Component Level

The component specifies the subsets of the data (identified by a column header in the dataset) to which the feature applies. We might think of the most basic component as the student. There are other components as well, such as the items and knowledge components (KCs). In the model, the effects of each feature for each component are summed to give the additive effect of multiple features. Interactions between features are permitted.
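The additive combination described above can be sketched as follows. This is a minimal illustration, not LKT code: the function name, the term labels, and the effect values are hypothetical, and the logistic link is assumed from the "logistic" in the framework's name.

```python
import math

def predict_probability(term_effects):
    """Sum the per-term effects (in logit units) and map the sum to a probability."""
    logit = sum(term_effects.values())
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical effects for a single trial, one entry per (component, feature) term.
effects = {
    ("student", "intercept"): -0.4,
    ("KC", "intercept"): 0.3,
    ("KC", "practice"): 0.6,
}

p = predict_probability(effects)
print(round(p, 3))
```

Each term contributes independently to the sum, so adding or removing a term changes the prediction without altering how the other terms are computed.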

Features

Features are the functions for computing the effect of the components’ histories for each student (except for the fixed feature, the constant intercept). Some features have a single term, like exponential decay (expdecafm), which is a transform using the sequence of prior trials and a decay parameter. Other features are inherently interactive, such as base2, which scales the logarithmic effect of practice by multiplying it by a memory decay term. Other features, like base4 and ppe, involve the interaction of at least three inputs.

Most features in this method are dynamic. A “dynamic” feature is one whose effect in the model can change with each trial for a student. Most dynamic features start at a value of 0 and change as a function of the student's history as time passes in a learning system.
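To make the idea of a dynamic feature concrete, here is a minimal sketch of an exponential-decay transform over prior trials. It assumes that each prior trial contributes a weight of decay raised to the number of trials since it occurred; the function name is hypothetical, and the package's own expdecafm computation may differ in detail.

```python
def exp_decay_feature(n_trials, decay):
    """Feature value before each trial: a decay-weighted count of prior trials."""
    values = []
    running = 0.0
    for _ in range(n_trials):
        values.append(running)          # value available when the trial starts
        running = running * decay + 1.0  # age older trials, then add this one
    return values

print(exp_decay_feature(4, 0.5))  # [0.0, 1.0, 1.5, 1.75]
```

The first value is 0 because the student has no history yet, and each later value depends only on trials that came before it.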

Feature Types

The standard feature type (except for intercept, which is always “extended”) is fit with the same coefficient for all levels of the component factor. Features may also be extended with the $ operator, which causes LKT to fit a separate coefficient for each level of the component factor. The most straightforward example of this extension is for KCs: models have typically used a different coefficient for each knowledge component. For example, in AFM, each KC gets its own coefficient to characterize how fast it is learned across opportunities, which LKT notation specifies with the $ operator. If the $ operator is not present, a single coefficient is fit for the feature.
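The difference between a standard and an extended feature can be sketched as below. This is only an illustration of the idea, not of how LKT stores or fits coefficients: the function name, the KC labels, and the coefficient values are all hypothetical.

```python
def feature_effect(feature_value, level, coefficients, extended):
    """Logit contribution of one feature on one trial.

    coefficients is a single number for a standard feature, or a dict with
    one entry per level of the component (e.g. per KC) for an extended one.
    """
    if extended:
        return coefficients[level] * feature_value
    return coefficients * feature_value

prior_opportunities = 3

# Standard: every KC shares the same coefficient.
shared = feature_effect(prior_opportunities, "KC_A", 0.2, extended=False)

# Extended: each KC has its own coefficient (hypothetical values).
per_kc = feature_effect(
    prior_opportunities, "KC_A", {"KC_A": 0.35, "KC_B": 0.1}, extended=True
)

print(shared, per_kc)
```

With the extended form, the same number of prior opportunities produces a different effect for each KC, which is how a per-KC learning rate is expressed.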

Intercept features can also be modified with the @ operator, which produces random intercepts instead of the default fixed intercepts. However, this option can be slow, because fitting the random intercepts relies on a slow R package.

Learner Data Requirements

The LKT model relies on data being in the DataShop format, but only some columns are needed for the models. See the data example below for the minimal format. Rows are assumed to be grouped by user id, so that each user's events appear consecutively.
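A quick check of these requirements might look like the following sketch. It is not part of LKT: the function name is hypothetical, rows are represented as plain dictionaries, and the required column names and outcome values are the ones stated earlier in this document.

```python
REQUIRED_COLUMNS = ("Anon.Student.Id", "Outcome")
VALID_OUTCOMES = {"CORRECT", "INCORRECT"}

def check_minimal_format(rows):
    """Return a list of problems found in a list of row dicts (empty means OK)."""
    problems = []
    finished_users = set()  # users whose block of rows has already ended
    previous_user = None
    for i, row in enumerate(rows):
        missing = [c for c in REQUIRED_COLUMNS if c not in row]
        if missing:
            problems.append(f"row {i}: missing columns {missing}")
            continue
        if row["Outcome"] not in VALID_OUTCOMES:
            problems.append(f"row {i}: unexpected Outcome {row['Outcome']!r}")
        user = row["Anon.Student.Id"]
        if user != previous_user:
            if user in finished_users:
                problems.append(f"row {i}: rows for user {user!r} are not consecutive")
            if previous_user is not None:
                finished_users.add(previous_user)
            previous_user = user
    return problems

good = [
    {"Anon.Student.Id": "s1", "Outcome": "CORRECT"},
    {"Anon.Student.Id": "s1", "Outcome": "INCORRECT"},
    {"Anon.Student.Id": "s2", "Outcome": "CORRECT"},
]
print(check_minimal_format(good))  # []
```

Running a check like this before fitting makes ordering problems visible early, for example when rows from two students have been interleaved.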

Main Function: computeSpacingPredictors

To use computeSpacingPredictors and its dependencies effectively, the input dataset must contain at least the Anon.Student.Id and CF..ansbin. columns. The CF..reltime. and CF..Time. columns are also needed, but they can be generated if a Duration..sec. column is present. Typically, you also need a component column, such as KC or item, for computing the spacings between repetitions (needed for some features that model spacing effects).
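The sketch below shows one way such columns could be derived from the minimal format. It rests on two assumptions that the text above does not spell out: that CF..ansbin. is a 1/0 coding of Outcome, and that CF..reltime. is each student's cumulative practice time before the current trial. The function name is hypothetical, CF..Time. is not covered, and the package's own preparation steps may differ.

```python
def add_derived_columns(rows):
    """Add CF..ansbin. and CF..reltime. to a list of row dicts, in place."""
    elapsed = {}  # assumed: cumulative prior practice time per student, in seconds
    for row in rows:
        student = row["Anon.Student.Id"]
        # Assumed coding: 1 for CORRECT, 0 for anything else.
        row["CF..ansbin."] = 1 if row["Outcome"] == "CORRECT" else 0
        row["CF..reltime."] = elapsed.get(student, 0)
        elapsed[student] = row["CF..reltime."] + row["Duration..sec."]
    return rows

student_id = "Stu_0391448da5eac00f9b6dd455081aa08e"
rows = add_derived_columns([
    {"Anon.Student.Id": student_id, "Duration..sec.": 54, "Outcome": "CORRECT"},
    {"Anon.Student.Id": student_id, "Duration..sec.": 12, "Outcome": "INCORRECT"},
    {"Anon.Student.Id": student_id, "Duration..sec.": 23, "Outcome": "INCORRECT"},
])
print([(r["CF..ansbin."], r["CF..reltime."]) for r in rows])  # [(1, 0), (0, 54), (0, 66)]
```

Because the running total is kept per student, the sketch relies on the same row ordering that the data requirements describe.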

Example Data

The data set for examples is shown below:

Anon.Student.Id                       Duration..sec.  Outcome    KC..Default.
Stu_0391448da5eac00f9b6dd455081aa08e  54              CORRECT    1-3 A norm
Stu_0391448da5eac00f9b6dd455081aa08e  12              INCORRECT  13-3 The u
Stu_0391448da5eac00f9b6dd455081aa08e  23              INCORRECT  14-2 The v
Stu_0391448da5eac00f9b6dd455081aa08e  16              CORRECT    0-3 A dist
Stu_0391448da5eac00f9b6dd455081aa08e  16              INCORRECT  7-3 The me
Stu_0391448da5eac00f9b6dd455081aa08e  25              CORRECT    1-3 A norm
Stu_0391448da5eac00f9b6dd455081aa08e  10              INCORRECT  8-3 The no
Stu_0391448da5eac00f9b6dd455081aa08e  24              INCORRECT  13-3 The u
Stu_0391448da5eac00f9b6dd455081aa08e  23              CORRECT    17-3 When
Stu_0391448da5eac00f9b6dd455081aa08e  9               INCORRECT  4-3 Standa

The LKT paper is under review; please see Pavlik, Eglington, and Harrell-Williams (2021) <arXiv:2005.00869>.