Predicting Instructor's Intervention in MOOC forums

Instructor intervention in student discussion forums is a vital component in Massive Open Online Courses (MOOCs), where personalized interaction is limited.

Ubiquitous computing and easy access to high bandwidth internet have reshaped the modus operandi in distance education towards Massive Open Online Courses (MOOCs).

To the best of our knowledge, the problem of predicting instructor’s intervention in MOOC forums has not been addressed yet.

In this section, we explain our models in detail.

This section describes our experiments.

Appears in 8 sentences as: Logistic Regression (1), Logistic regression (1), logistic regression (6)

In *Predicting Instructor's Intervention in MOOC forums*

- The first uses a logistic regression model that primarily incorporates high level information about threads and posts. [Page 1, “Introduction”]
- 3.2 Logistic Regression (LR) [Page 3, “Intervention Prediction Models”]
- Our first attempt at solving this problem involved training a logistic regression for the binary prediction task which models P(r|t). [Page 3, “Intervention Prediction Models”]
- Our logistic regression model uses the following two types of features: Thread only features and Aggregated post features. [Page 3, “Intervention Prediction Models”]
- p_i and h_i represent the posts of the thread and their latent categories respectively; r represents the instructor’s intervention and φ(t) represent the nonstructural features used by the logistic regression model. [Page 4, “Intervention Prediction Models”]
- The logistic regression model is good at exploiting the thread level features but not the content of individual posts. [Page 4, “Intervention Prediction Models”]
- In addition to the nonstructural features used by the logistic regression model (Sec. [Page 6, “Intervention Prediction Models”]
- We can see that the chain based models, Linear Chain Markov Model (LCMM) and Global Chain Model (GCM), outperform the unstructured models, namely Logistic regression (LR) and Decision Trees (J48). [Page 7, “Empirical Evaluation”]
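
As a rough illustration of the binary prediction task P(r|t) described in the excerpts above, the following is a minimal sketch of logistic regression trained by batch gradient descent. The two features, the toy data, the learning rate, and the function names are all hypothetical and chosen only for illustration; the paper's actual feature set and training procedure are not reproduced here.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, b, features):
    # Estimated P(r = 1 | t) for one thread's feature list.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, features)) + b)

def train_logistic_regression(X, y, lr=0.5, epochs=2000):
    # Batch gradient descent on the average log loss (no regularization).
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, label in zip(X, y):
            err = predict_proba(w, b, features) - label
            for k, xk in enumerate(features):
                grad_w[k] += err * xk
            grad_b += err
        w = [wi - lr * g / len(X) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

# Hypothetical thread-level features (assumed for illustration):
# [scaled number of posts, fraction of posts that contain a question]
X = [[0.1, 0.0], [0.2, 0.1], [0.8, 0.9], [0.9, 0.7], [0.3, 0.2], [0.7, 0.8]]
y = [0, 0, 1, 1, 0, 1]  # 1 = the instructor intervened in the thread

w, b = train_logistic_regression(X, y)
print(predict_proba(w, b, [0.8, 0.9]))  # probability for a long, question-heavy thread
```

A model of this kind only sees one feature list per thread, which matches the excerpt's remark that it exploits thread-level features but not the content of individual posts.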

Appears in 5 sentences as: regression model (5)

- The first uses a logistic regression model that primarily incorporates high level information about threads and posts. [Page 1, “Introduction”]
- Our logistic regression model uses the following two types of features: Thread only features and Aggregated post features. [Page 3, “Intervention Prediction Models”]
- p_i and h_i represent the posts of the thread and their latent categories respectively; r represents the instructor’s intervention and φ(t) represent the nonstructural features used by the logistic regression model. [Page 4, “Intervention Prediction Models”]
- The logistic regression model is good at exploiting the thread level features but not the content of individual posts. [Page 4, “Intervention Prediction Models”]
- In addition to the nonstructural features used by the logistic regression model (Sec. [Page 6, “Intervention Prediction Models”]

Appears in 4 sentences as: Cross Validation (1), Cross validation (1), cross validation (2)

- The values of various parameters were selected using 10-fold Cross Validation on [Page 6, “Empirical Evaluation”]
- 5, plots the 10-fold cross validation performance of the models with increasing values of H for the two datasets. [Page 8, “Empirical Evaluation”]
- Figure 5: Cross validation performances of the two models with increasing number of categories. [Page 8, “Empirical Evaluation”]
- 6 shows 10-fold cross validation F-measure of the positive class for LR when different types of features are excluded from the full set. [Page 8, “Empirical Evaluation”]
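
The 10-fold cross validation mentioned in these excerpts can be sketched as follows. This is a generic illustration, not the paper's experimental code: the helper names `k_fold_indices` and `cross_validate`, and the `train_fn` and `score_fn` callbacks, are hypothetical. The sketch splits items in their given order, so shuffling or stratifying beforehand is left to the caller.

```python
def k_fold_indices(n_items, k=10):
    # Yield (train_idx, test_idx) pairs; every item is held out exactly once.
    indices = list(range(n_items))
    # Spread the remainder so fold sizes differ by at most one.
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

def cross_validate(items, labels, train_fn, score_fn, k=10):
    # Average the held-out score over the k folds.
    scores = []
    for train_idx, test_idx in k_fold_indices(len(items), k):
        model = train_fn([items[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        scores.append(score_fn(model,
                               [items[i] for i in test_idx],
                               [labels[i] for i in test_idx]))
    return sum(scores) / len(scores)

folds = list(k_fold_indices(25, k=10))
print(len(folds))  # number of (train, test) splits
```

To select a parameter such as the number of latent categories H, one would call `cross_validate` once per candidate value and keep the value with the best average score.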

Appears in 3 sentences as: F-measure (3)

- Since the purpose of solving this problem is to identify the threads which should be brought to the notice of the instructors, we measure the performance of our models using F-measure of the positive class. [Page 6, “Empirical Evaluation”]
- F-measure [Page 8, “Empirical Evaluation”]
- 6 shows 10-fold cross validation F-measure of the positive class for LR when different types of features are excluded from the full set. [Page 8, “Empirical Evaluation”]
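
For reference, the F-measure of the positive class can be computed as in the sketch below. The excerpts do not state the weighting, so the balanced setting (beta = 1, usually called F1) is an assumption here, and the function name and toy labels are hypothetical.

```python
def positive_class_f_measure(gold, predicted, beta=1.0):
    # Count outcomes for the positive class (label 1 = instructor intervened).
    tp = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, predicted) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 0)
    if tp == 0:
        return 0.0  # no correct positive predictions
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Toy example: 3 threads with an intervention, 5 without.
gold      = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 0, 0]
print(positive_class_f_measure(gold, predicted))  # precision = recall = 2/3
```

Scoring only the positive class keeps the metric focused on threads that should be brought to the instructor's notice, rather than rewarding a model for correctly ignoring the remaining threads.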

Appears in 3 sentences as: feature vector (3)

- Assuming that p represents posts of thread t, h represents the latent category assignments, r represents the intervention decision; feature vector, φ(p, r, h, t), is extracted for each thread and using the weight vector, w, this model defines a decision function, similar to what is shown in Equation 1. [Page 5, “Intervention Prediction Models”]
- Each row is a category and each column represents a feature vector. [Page 7, “Empirical Evaluation”]
- While the actual size of vocabulary is huge, we use only a small subset of words in our feature vector for this visualization. [Page 7, “Empirical Evaluation”]

Appears in 3 sentences as: iteratively (3)

- The model uses the pseudocode shown in Algorithm 1 to iteratively refine the weight vectors. [Page 5, “Intervention Prediction Models”]
- Exploiting the semi-convexity property (Felzenszwalb et al., 2010), the algorithm works in two steps, each executed iteratively. [Page 6, “Intervention Prediction Models”]
- The algorithm then performs two steps iteratively - first it determines the structural assignments for the negative examples, and then optimizes the fixed objective function using a cutting plane algorithm. [Page 6, “Intervention Prediction Models”]

Appears in 3 sentences as: latent variable (1), latent variables (2)

- p_i, r and φ(t) are observed and h_i are the latent variables. [Page 4, “Intervention Prediction Models”]
- In the first step, it determines the latent variable assignments for positive examples. [Page 6, “Intervention Prediction Models”]
- Once this process converges for negative examples, the algorithm reassigns values to the latent variables for positive examples, and proceeds to the second step. [Page 6, “Intervention Prediction Models”]

Appears in 3 sentences as: objective function (3)

- Similar to the traditional maximum margin based Support Vector Machine (SVM) formulation, our model’s objective function is defined as: [Page 5, “Intervention Prediction Models”]
- Replacing the term f_w(t_j, p_j) with the contents of Equation 1 in the minimization objective above, reveals the key difference from the traditional SVM formulation - the objective function has a maximum term inside the global minimization problem making it non-convex. [Page 6, “Intervention Prediction Models”]
- The algorithm then performs two steps iteratively - first it determines the structural assignments for the negative examples, and then optimizes the fixed objective function using a cutting plane algorithm. [Page 6, “Intervention Prediction Models”]
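
The equations these excerpts refer to are not reproduced on this page. As a hedged sketch only, a generic latent-variable max-margin objective of the kind described (in the style of Felzenszwalb et al., 2010) can be written as below. The regularization constant $\lambda$, the loss symbol $\ell$, the label encoding $r_j \in \{-1, +1\}$, and the exact form of $f_w$ are assumptions for illustration and may differ from the paper's Equation 1.

```latex
% Assumed decision function: the best-scoring latent assignment h
f_w(t_j, p_j) = \max_{h} \; w \cdot \phi(p_j, r, h, t_j)

% Assumed objective: L2 regularizer plus squared hinge loss over threads j
\min_{w} \; \frac{\lambda}{2} \lVert w \rVert^2
  + \sum_{j} \ell\bigl(r_j, f_w(t_j, p_j)\bigr),
\qquad
\ell(r_j, f) = \max\bigl(0,\; 1 - r_j f\bigr)^2
```

Under these assumptions, $f_w$ is a maximum of linear functions of $w$ and hence convex in $w$. The loss is therefore convex for negative examples but not for positive ones, which is the maximum term that makes the overall problem non-convex. Fixing the latent assignments of the positive examples makes $f_w$ linear for them and leaves a convex problem in $w$.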

Appears in 3 sentences as: weight vector (2), weight vectors (1)

- The model uses the pseudocode shown in Algorithm 1 to iteratively refine the weight vectors. [Page 5, “Intervention Prediction Models”]
- Assuming that p represents posts of thread t, h represents the latent category assignments, r represents the intervention decision; feature vector, φ(p, r, h, t), is extracted for each thread and using the weight vector, w, this model defines a decision function, similar to what is shown in Equation 1. [Page 5, “Intervention Prediction Models”]
- w is the weight vector, is the squared hinge loss function and f_w(t_j, p_j) is defined in Equation 1. [Page 6, “Intervention Prediction Models”]
