By Fern Halper
Different vendors often use different terms to describe this kind of information.
Some vendors talk about facts, themes, topics, and events. It is important to
understand what each vendor offers. For instance, perhaps entity extraction
alone will not be useful to your organization or maybe the text analytics vendor
does not offer sentiment capabilities out of the box.
3. You may need to consider a taxonomy. In common usage, a taxonomy is
a method for organizing information into hierarchical relationships. This is
important in text analytics, especially when you're dealing with specific
vocabularies in certain industries. For instance, you may create a taxonomy
about products and services or about certain kinds of diseases.
The taxonomy can also use synonyms and alternate expressions. For instance, an
expression such as "yearly increase" might refer to "raises." Some vendors
provide baseline taxonomies out of the box, but don't expect them to work
without tuning. Some vendors will tell you that you don't need a taxonomy -- that
they work from existing semantic networks that represent the world or that
they have developed techniques that can get around this. For certain subjects,
you may get away without building a taxonomy, but be prepared to iterate on
what comes out of the tool in order to create your own categories.
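To make the idea of a taxonomy with synonyms concrete, here is a minimal sketch in Python. It is not any vendor's method: the categories, the synonym phrases (other than "yearly increase" and "raises," which come from the example above), and the helper names build_patterns and tag are all hypothetical, and matching is simple whole-phrase lookup rather than real linguistic analysis.

```python
import re

# Hypothetical two-level taxonomy: category -> subcategory -> synonym phrases.
# Only "yearly increase" -> "raises" comes from the article; the rest is
# illustrative filler.
TAXONOMY = {
    "compensation": {
        "raises": ["raise", "yearly increase", "pay increase"],
        "bonuses": ["bonus", "incentive pay"],
    },
    "benefits": {
        "health": ["health insurance", "medical plan"],
    },
}


def build_patterns(taxonomy):
    """Compile one case-insensitive regex per (category, subcategory) path."""
    patterns = {}
    for category, subcategories in taxonomy.items():
        for subcategory, phrases in subcategories.items():
            # Try longer phrases first; allow an optional trailing "s"
            # as a crude way to catch simple plurals such as "raises".
            ordered = sorted(phrases, key=len, reverse=True)
            alternation = "|".join(re.escape(p) for p in ordered)
            patterns[(category, subcategory)] = re.compile(
                rf"\b(?:{alternation})s?\b", re.IGNORECASE
            )
    return patterns


def tag(text, patterns):
    """Return the sorted taxonomy paths whose synonyms appear in the text."""
    return sorted(path for path, pattern in patterns.items() if pattern.search(text))


patterns = build_patterns(TAXONOMY)
comment = "No yearly increase in three years, and the medical plan got worse."
print(tag(comment, patterns))
```

In practice you would expect to iterate on the phrase lists after reviewing what the tool tags and what it misses, which is the same iteration the taxonomy discussion above describes.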
4. You can analyze the data separately or marry it with structured
data. Organizations that use text data will often integrate it with traditional
data sources to analyze it. They view it as simply another form of data.
Analyzing text data without merging it with other data in your systems can also
be quite informative. For instance, analyzing social media data is often done
this way. Some organizations are even creating predictive models with text
data that are just as good as or better than those that use both text and
traditional structured data. It really depends on the kind of data you want to
analyze and what business problems you're trying to solve.
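As a rough illustration of marrying text with structured data, the sketch below derives two simple features from free-text comments and joins them to customer records by ID. Everything in it is an assumption made for the example: the field names, the sample records, the COMPLAINT_TERMS list, and the naive substring matching stand in for whatever a real text analytics tool would produce.

```python
from collections import defaultdict

# Hypothetical structured records, e.g. from a CRM or billing system.
customers = [
    {"customer_id": 1, "plan": "basic", "monthly_spend": 20.0},
    {"customer_id": 2, "plan": "premium", "monthly_spend": 75.0},
    {"customer_id": 3, "plan": "basic", "monthly_spend": 22.0},
]

# Hypothetical unstructured text tied to the same customers.
comments = [
    {"customer_id": 1, "text": "Thinking about whether to cancel, support was slow."},
    {"customer_id": 1, "text": "Still waiting on a refund."},
    {"customer_id": 2, "text": "Love the new dashboard."},
]

# Illustrative term list; a real project would use the tool's own output.
COMPLAINT_TERMS = ("cancel", "refund", "slow")


def text_features(comments):
    """Aggregate per-customer counts using naive substring matching."""
    features = defaultdict(lambda: {"n_comments": 0, "n_complaint_terms": 0})
    for comment in comments:
        row = features[comment["customer_id"]]
        row["n_comments"] += 1
        lowered = comment["text"].lower()
        row["n_complaint_terms"] += sum(term in lowered for term in COMPLAINT_TERMS)
    return features


def merge(customers, features):
    """Left-join text features onto customer records; no comments means zeros."""
    empty = {"n_comments": 0, "n_complaint_terms": 0}
    return [{**c, **features.get(c["customer_id"], empty)} for c in customers]


merged = merge(customers, text_features(comments))
for row in merged:
    print(row)
```

The merged rows can then feed the same reports or predictive models as any other structured data. Analyzing the text features on their own, without the join, is equally valid when the business question calls for it.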
5. A different mindset is required for analyzing text data. Text analytics
does not have the same level of accuracy as some statistical techniques. It is
best to think of it as being directionally correct, so it is important to go into the
analysis with that perspective. The level of analytical skill required depends
on the problem you're trying to solve. Generally, understanding natural
language processing is not a prerequisite for text analytics, although some
training on the text analytics tool will be necessary.
Learn More
Interested in text analytics? Want to try it out for yourself? Consider attending
some of the hands-on workshops at the TDWI Analytics Experience, July 26-31,
2015, in Boston, or read the TDWI Checklist Reports "Eight Steps for Using
Analytics to Gain Value from Text and Unstructured Content" and "How to Gain
Insight from Text."