Reference: Pfleger, K. On-Line Learning of Undirected Sparse n-grams. Knowledge Systems Laboratory, 2002.
Abstract: Intelligent autonomous agents in complex worlds must be able to identify frequent patterns in the massive data streams of their lives. These patterns serve as general foundations for inferring unseen data and as building blocks for higher-level knowledge representation. We address this need with systems based on n-grams, simple learning models considered state-of-the-art in many sequential domains. Standard n-grams suffer from an exponential number of parameters in the model width, n. We introduce an undirected sparse n-gram model, which stores probability estimates only for some (frequent) n-tuples from the unconditional joint distribution and which outperforms narrower exhaustive n-grams with the same number of parameters. Techniques exist for pruning exhaustive n-grams after training but are inappropriate when on-line learning is required (as when data overwhelms storage capacity). We present an unsupervised, on-line learning algorithm for sparse n-grams which induces both model parameters and structure (which patterns to include) despite the challenging a priori uncertainty of which patterns are frequent. The resulting dynamic pattern inclusion complicates probability estimates and inference, but we present several solutions, including Bayesian and EM approaches. We also discuss multiwidth combinations of sparse n-grams, which are capable of hierarchically representing repeated substructure. They learn on-line with no fixed bound on pattern size, but still using less memory than that taken by the training data. Thus, they learn frequent patterns from data streams incrementally and at multiple size scales, without fitting parameters for all possible patterns and without remembering or iterating over training data.
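The core idea of storing estimates only for frequent n-tuples, learned from a stream, can be illustrated with a minimal sketch. The class name, the fixed pattern budget, and the crude prune-the-rare-half policy are illustrative assumptions, not the paper's actual inclusion criteria or smoothing scheme:

```python
from collections import Counter

class SparseNGramSketch:
    """Toy on-line sparse n-gram: counts only (approximately) frequent
    n-tuples, pruning rare ones whenever the pattern budget is exceeded.
    A stand-in for the paper's algorithm, not a reimplementation of it."""

    def __init__(self, n, max_patterns=1000):
        self.n = n                    # model width
        self.max_patterns = max_patterns
        self.counts = Counter()       # pattern -> observed count
        self.total = 0                # total n-grams seen

    def update(self, stream):
        """Consume symbols one at a time; memory stays bounded by
        max_patterns rather than growing with the training data."""
        window = []
        for symbol in stream:
            window.append(symbol)
            if len(window) > self.n:
                window.pop(0)
            if len(window) == self.n:
                self.counts[tuple(window)] += 1
                self.total += 1
                if len(self.counts) > self.max_patterns:
                    # Crude pruning: drop the least frequent half.
                    # (The paper handles pattern inclusion, and the bias
                    # it introduces, far more carefully.)
                    for pat, _ in self.counts.most_common()[self.max_patterns // 2:]:
                        del self.counts[pat]

    def prob(self, pattern):
        # Unconditional joint estimate; unstored patterns get zero here,
        # whereas the paper redistributes mass to excluded patterns.
        return self.counts.get(pattern, 0) / self.total if self.total else 0.0
```

For example, after `model = SparseNGramSketch(2); model.update("abababac")` the frequent bigram `('a','b')` receives estimate 3/7, while unseen pairs fall back to zero.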
Full paper available as PDF.