Returns the top N terms for a topic, ranked by their probability in that topic.

Usage

top_terms(mod, topic_id, n_terms)

Arguments

mod

a fitted STM or LDA model object.

topic_id

numeric ID of the topic whose terms are ranked.

n_terms

the number of terms to return.

Value

A tibble with one row per term and the columns topic, term, and beta, ordered from the highest to the lowest beta.

Examples

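library(stm)

# Fit a 25-topic STM on the poliblog5k sample data shipped with stm.
# max.em.its=2 keeps the example fast, so the model stops before convergence.
mod <- stm(poliblog5k.docs,
           poliblog5k.voc, K=25,
           prevalence=~rating,
           data=poliblog5k.meta,
           max.em.its=2,
           init.type="Spectral")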
#> Beginning Spectral Initialization 
#> 	 Calculating the gram matrix...
#> 	 Finding anchor words...
#>  	.........................
#> 	 Recovering initialization...
#>  	..........................
#> Initialization complete.
#> ....................................................................................................
#> Completed E-Step (1 seconds). 
#> Completed M-Step. 
#> Completing Iteration 1 (approx. per word bound = -7.020) 
#> ....................................................................................................
#> Completed E-Step (1 seconds). 
#> Completed M-Step. 
#> Model Terminated Before Convergence Reached 

# Top 100 terms for topic 1; the tibble prints only its first 10 rows.
top_terms(mod, 1, 100)
#> # A tibble: 100 × 3
#>    topic term       beta
#>    <int> <chr>     <dbl>
#>  1     1 legisl  0.0483 
#>  2     1 bill    0.0303 
#>  3     1 vote    0.0252 
#>  4     1 senat   0.0212 
#>  5     1 hous    0.0125 
#>  6     1 will    0.0115 
#>  7     1 pass    0.00950
#>  8     1 fisa    0.00703
#>  9     1 support 0.00659
#> 10     1 protect 0.00658
#> # ℹ 90 more rows
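
The ranking reported above can be illustrated with a small base R sketch. This is not the package's implementation: rank_terms_sketch is a hypothetical helper, and the toy beta matrix and vocabulary are made up for illustration. It assumes the topic-term probabilities are available as a topics-by-terms matrix and simply sorts one row in decreasing order.

```r
# Hypothetical helper, not part of the package: rank the terms of one topic
# by probability, given a topics x terms probability matrix.
rank_terms_sketch <- function(beta, vocab, topic_id, n_terms) {
  probs <- beta[topic_id, ]                # probabilities for the chosen topic
  ord <- order(probs, decreasing = TRUE)   # term indices, most probable first
  ord <- head(ord, n_terms)                # keep at most n_terms entries
  data.frame(topic = as.integer(topic_id),
             term = vocab[ord],
             beta = probs[ord],
             stringsAsFactors = FALSE)
}

# Toy data (assumed for illustration): 2 topics, 4 terms, each row sums to 1.
vocab <- c("bill", "vote", "senat", "hous")
beta <- rbind(c(0.4, 0.3, 0.2, 0.1),
              c(0.1, 0.2, 0.3, 0.4))

res <- rank_terms_sketch(beta, vocab, topic_id = 2, n_terms = 3)
print(res)
```

For topic 2 of the toy matrix, the sketch ranks hous first, then senat, then vote, mirroring the topic, term, and beta columns shown in the output above.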