I would like to create a list of words that are in a single long document. Having a word count included would be helpful, too.
The goal is to build a list of keywords for the items this document covers (it's a list of manuscript descriptions), so that I can eventually map those words to my own (very short) list of descriptors. But before I can map anything, I need to know which words are used throughout the document.
Is there a tool I might use to do this, or some XSLT or Perl script?
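
To make the desired output concrete, here is a rough sketch of the kind of thing I'm after. It is in Python rather than XSLT or Perl, and it assumes the document can be read as plain UTF-8 text (an XML source would need its markup stripped first). The function names `word_counts` and `print_word_list` and the file name `manuscripts.txt` are placeholders made up for this illustration.

```python
import re
from collections import Counter


def word_counts(text):
    """Return a Counter mapping each lowercased word to how often it occurs."""
    # A "word" here is a run of letters, optionally joined by internal
    # apostrophes or hyphens (e.g. "leaf's", "half-title"). Digits are ignored.
    words = re.findall(r"[^\W\d_]+(?:['-][^\W\d_]+)*", text.lower())
    return Counter(words)


def print_word_list(path):
    """Print 'count<TAB>word' lines for the file at path, most frequent first."""
    # Assumption: the file is plain text encoded as UTF-8.
    with open(path, encoding="utf-8") as handle:
        counts = word_counts(handle.read())
    for word, count in counts.most_common():
        print(f"{count}\t{word}")
```

Calling `print_word_list("manuscripts.txt")` would then print one word per line with its count, which is roughly the list I'd want to map to my descriptors.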