The Discovery View in VisibleThread for Docs is designed to let you 'speed read' a document, or quickly grasp its subject matter.
To that end, it uses natural language processing to extract all nouns or entities contained in the document, such as 'Customer' and 'Configuration'.
Unfortunately, Natural Language Processing (NLP) is not an exact science. In short, NLP will misclassify the part of speech (POS, e.g. noun or verb) of
some pieces of text, and the result is a word incorrectly marked as a noun (or another POS).
Some background to explain why this can occur:
NLP uses a combination of rules (based on grammar, known idiomatic phrases, entities etc.) and statistical methods (commonly called Machine Learning). Language, and English in particular, is so complex that there are far more exceptions to the grammar rules than there are rules! And that applies to grammatically complete sentences only!
Grammatically incomplete sentences can almost be treated as a separate problem (and, to a large extent, a different language). The statistical models used to classify words and assign each one a POS are trained on a very large corpus of text, but they will still make some mistakes because the method is statistical.
For example, in the following two sentences, the words 'Shall' and 'Incorporate' may be classified as nouns:
"Shall be delivered on time"
"Incorporate a trial period"
Statistically, a word that starts with an uppercase letter is usually either a proper noun or the first word in a sentence. In these two cases, a verb starts a sentence that has no subject, as often happens in multi-point or bulleted text, so the capitalized verb can be mistaken for a noun.
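To make the capitalization effect concrete, here is a minimal sketch of a toy statistical tagger in Python. It is not the VisibleThread NLP engine; the training data, the tag names and the functions `train` and `tag` are all hypothetical and exist only for illustration. The toy model remembers the most frequent tag for each exact word it has seen, and for unseen words it falls back to a single feature: whether the word starts with an uppercase letter.

```python
from collections import Counter, defaultdict

# Hypothetical, hand-made training data of (word, tag) pairs.
# A real engine is trained on a very large corpus; this is only a toy.
TRAINING = [
    ("The", "DET"), ("Customer", "NOUN"), ("shall", "VERB"), ("deliver", "VERB"),
    ("the", "DET"), ("Configuration", "NOUN"),
    ("Acme", "NOUN"), ("will", "VERB"), ("incorporate", "VERB"),
    ("a", "DET"), ("trial", "NOUN"), ("period", "NOUN"),
]


def train(pairs):
    """Count tags per exact word form and per capitalization 'shape'."""
    by_word = defaultdict(Counter)
    by_shape = defaultdict(Counter)
    for word, tag_name in pairs:
        by_word[word][tag_name] += 1
        by_shape[word[0].isupper()][tag_name] += 1
    return by_word, by_shape


def tag(sentence, by_word, by_shape):
    """Tag each word with its most frequent tag, or fall back to its shape."""
    result = []
    for word in sentence.split():
        if word in by_word:
            counts = by_word[word]
        else:
            # Unseen word form: guess from capitalization alone.
            counts = by_shape[word[0].isupper()]
        result.append((word, counts.most_common(1)[0][0]))
    return result


BY_WORD, BY_SHAPE = train(TRAINING)

# The model has seen 'shall' and 'incorporate' only in lowercase, as verbs.
# The capitalized forms are unseen, and most capitalized training words are nouns.
print(tag("Shall be delivered on time", BY_WORD, BY_SHAPE)[0])   # ('Shall', 'NOUN')
print(tag("Incorporate a trial period", BY_WORD, BY_SHAPE)[0])   # ('Incorporate', 'NOUN')
```

Production taggers use many more features and the surrounding context, so they make this mistake far less often, but the same statistical pull toward 'capitalized means noun' is what produces the occasional error described above.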
If you encounter any examples of words in the Discovery View that you feel may be incorrectly classified as nouns, let us know. Each example we encounter helps us improve our NLP engine and ultimately deliver better results!