Abstract
Instead of using a single PCFG to parse all texts, we present an efficient generative probabilistic model for probabilistic context-free grammars (PCFGs) based on the Bayesian finite mixture model. We assume that there are several PCFGs, all of which share the same CFG but have different rule probabilities. The sentences of each article in the corpus are generated from a common multinomial distribution over these PCFGs. We derive a Markov chain Monte Carlo algorithm for this model. In the experiments, our multi-grammar model outperforms both the single-grammar model and the Inside-Outside algorithm. Copyright © 2015 Springer International Publishing Switzerland.
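The generative story in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy grammar `RULES`, the function names, and the symmetric Dirichlet priors on both the rule probabilities and the per-article mixture weights are assumptions made for illustration, and the sketch covers only forward sampling, not the Markov chain Monte Carlo inference.

```python
import random

# Hypothetical toy CFG shared by every grammar (an assumption, not from the paper):
# each nonterminal maps to its list of possible right-hand sides.
RULES = {
    "S": [("NP", "VP")],
    "NP": [("dogs",), ("cats",)],
    "VP": [("sleep",), ("chase", "NP")],
}


def sample_dirichlet(alpha, rng):
    """Draw from a Dirichlet by normalising independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_rule_probs(rng, concentration=1.0):
    """One PCFG: the shared rules plus its own probabilities per nonterminal."""
    return {
        nt: sample_dirichlet([concentration] * len(rhs_list), rng)
        for nt, rhs_list in RULES.items()
    }


def expand(symbol, probs, rng):
    """Expand a symbol top-down into a list of terminal words."""
    if symbol not in RULES:  # anything without a rule is a terminal
        return [symbol]
    rhs = rng.choices(RULES[symbol], weights=probs[symbol], k=1)[0]
    words = []
    for child in rhs:
        words.extend(expand(child, probs, rng))
    return words


def generate_article(grammars, n_sentences, rng, alpha=1.0):
    """Draw one article: mixture weights first, then a grammar per sentence."""
    # Per-article multinomial over grammars (Dirichlet prior is an assumption).
    theta = sample_dirichlet([alpha] * len(grammars), rng)
    article = []
    for _ in range(n_sentences):
        z = rng.choices(range(len(grammars)), weights=theta, k=1)[0]
        article.append((z, expand("S", grammars[z], rng)))
    return theta, article


rng = random.Random(0)
grammars = [sample_rule_probs(rng) for _ in range(3)]  # same CFG, different probabilities
theta, article = generate_article(grammars, n_sentences=5, rng=rng)
for z, words in article:
    print(z, " ".join(words))
```

Each printed line shows the index of the grammar chosen for a sentence, followed by the sentence itself. All sentences in one article share the same `theta`, which is what ties them to a common distribution over grammars.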
Original language | English |
---|---|
Title of host publication | Computational linguistics and intelligent text processing: 16th International Conference, CICLing 2015, Cairo, Egypt, April 14-20, 2015, proceedings, part I |
Editors | Alexander GELBUKH |
Place of Publication | Cham |
Publisher | Springer |
Pages | 201-212 |
ISBN (Electronic) | 9783319181110 |
ISBN (Print) | 9783319181103 |
Publication status | Published - 2015 |