Active learning has proven to be a successful strategy for the rapid
development of corpora used to train statistical natural
language parsers.
The vast majority of studies in this field have focused on
estimating the informativeness of samples; however, the representativeness of
samples is another important criterion to consider in active learning.
We present a novel metric for estimating the representativeness of sentences, based on a modification of Zipf's Principle of Least Effort. Experiments on the WSJ corpus with a wide-coverage parser show that our method always performs at least as well as, and generally significantly better than, alternative representativeness-based methods.