Evaluating the Raw Data
Dealing with Accuracy and Reliability.
You can never assume that all data is accurate or that all sources are equally
reliable. If you make that assumption and fail to probe both accuracy and
reliability, you may be generating CI out of
a concoction of very good data, marginally correct data, bad data, and even
disinformation.
Even if you find that you can draw some sort of conclusion from
that mishmash, you risk drawing the wrong conclusion. However, once you have a
sense of the relative accuracy of the individual pieces of data, you can then
begin to analyze all your data properly.
There are three basic elements in evaluating the accuracy of raw
data:
- Identify the actual source of the data so you can evaluate
  the reliability of the source.
- Estimate the data's accuracy so you can classify the
  data.
- Eliminate false confirmations.
In CI, reliability refers to the
believability of the source of your data. You often can estimate how much you
can believe any data coming from a particular source, based on that source's
past performance. On the other hand, accuracy relates to
the correctness of the particular piece of data you have. There, you are
estimating how correct the data is, based on factors such as whether it is
confirmed by data from a reliable source as well as the reliability of the
original source of the data.
Reliability of the Source. To evaluate the
probable accuracy of any raw data, you must have at least a sense of its
ultimate source. That helps you figure out why the data was produced, collected,
and even released.
You should assume that all data is produced and released to
advance some particular purpose or to be read by a particular audience. You must
not ignore the origins of data, either. Data is only as good as its source. Keep
in mind the following informal rule: Unless you establish otherwise, assume that every place from which you get data has its own
point of view, which permeates any data from that source.
Estimate the Data's Accuracy. Once you have
assessed the reliability of the data source you are looking at, you next
estimate the accuracy of the data itself. This can involve formally classifying the data you get as
to relative degrees of accuracy or doing so on an informal basis, based on past
experience. As you collect more data and involve others in collecting and
analyzing it, you might find that you need some systematic way of labeling
individual pieces of raw data to identify both their likely accuracy and the
probable credibility of their sources. [6]
For example, information you get from the current distributors for
a target firm may or may not be reliable. That data's reliability depends on the
attitude of the particular distributors toward their supplier, their view of the
use to which the information they are giving to you may eventually be put, and
their access to current data of the sort you are getting from them. Similarly, data from a competitor's
suppliers, even those with whom your firm also does business, may be influenced
by conflicting factors. These influences include the desire to please you,
reluctance to discuss other customers, or a lack of perception on the part of
the individual providing you with the data.
The past track record, if available, of both individual and
institutional sources of data is generally a good basis on which to estimate
current reliability. A supplemental test is whether
your source could plausibly have had the specific data you obtained from it.
That is, ask yourself whether, under the conditions facing the specific source,
that source could have actually obtained the specific data within the
limitations of time, access, and financing that it faced.
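One way to picture the systematic labeling described above is a small record that carries both ratings side by side. The sketch below is only an illustration: the letter and number scales, the field names, and the sample item are all hypothetical and are not a scheme taken from the text.

```python
from dataclasses import dataclass

# Hypothetical rating scales, invented for this illustration.
RELIABILITY = {
    "A": "source has a good track record",
    "B": "source is usually reliable",
    "C": "source is unreliable or untested",
}
ACCURACY = {
    1: "confirmed by an independent, reliable source",
    2: "plausible but unconfirmed",
    3: "doubtful",
}


@dataclass
class DataItem:
    claim: str
    source: str
    reliability: str         # key of RELIABILITY (rates the source)
    accuracy: int            # key of ACCURACY (rates this piece of data)
    source_had_access: bool  # could the source plausibly have obtained this?

    def label(self) -> str:
        """Return a compact label such as 'B2', with '?' if access is doubtful."""
        if self.reliability not in RELIABILITY or self.accuracy not in ACCURACY:
            raise ValueError("unknown rating")
        flag = "" if self.source_had_access else "?"
        return f"{self.reliability}{self.accuracy}{flag}"


item = DataItem("Target firm is adding a second shift", "distributor", "B", 2, True)
print(item.label())  # B2
```

Keeping the two ratings separate reflects the distinction drawn earlier: reliability describes the source, while accuracy describes the individual piece of data.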
Eliminate False Confirmations. Confirming
data from one source with data from another lets you assess the original data's
accuracy. In the long run, it also helps you to be able to assess the
reliability of its source. A false confirmation is a situation in which one
source appears to confirm data obtained from another source. In fact, however,
there may be no real confirmation, because the first source obtained its data
from the second source or they both received it from the same third source.
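A false confirmation can be pictured as two reports that trace back to one origin. The following is a minimal sketch, assuming you have been able to record where each source obtained its data; the source names and the `obtained_from` mapping are hypothetical.

```python
def ultimate_origin(source, obtained_from):
    """Follow the 'obtained from' chain until a source with no recorded upstream."""
    seen = set()
    # The 'seen' set stops the loop if two sources cite each other.
    while source in obtained_from and source not in seen:
        seen.add(source)
        source = obtained_from[source]
    return source


def independent_origins(sources, obtained_from):
    """Return the set of distinct ultimate origins behind a group of reports."""
    return {ultimate_origin(s, obtained_from) for s in sources}


# Hypothetical example: two of the three reports repeat the same press release.
obtained_from = {
    "trade_magazine": "press_release",
    "analyst_note": "press_release",
}
reports = ["trade_magazine", "analyst_note", "former_employee"]

origins = independent_origins(reports, obtained_from)
print(sorted(origins))  # ['former_employee', 'press_release']
```

Here three reports reduce to two independent origins, so the magazine and the analyst note do not confirm each other.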
Pseudo-Precision. Be careful that you
understand exactly what a data source has said and has not said. Many businesses
and industries use a private language. Sometimes, this provides real clarity and
precision for insiders. In other cases, it only serves to keep outsiders,
including you, from understanding what is actually going on. The jargon may even
be intended to create desired impressions in targeted groups.
Assess the Consistency of the Data. Merely
because you have consistent data does not mean that you can immediately draw a
conclusion based on it. When your research seems to provide consistent
estimates, it can mean one of several things:
- The data and your conclusions really are valid.
- No one ever questions this particular "revealed truth."
- All the data has a common source, so there is no real
  confirmation, merely a false confirmation.
Look at the data in question and analyze it, keeping all these
possibilities in mind. Make sure you know why data is consistent before you rely
on it.
In evaluating consistency or dealing with possible
inconsistencies, one of the easiest mistakes to make is to treat similar-sounding
terms as equivalent when they are actually used to mean widely
different things. You can avoid this by paying careful attention to definitions
and terminology.
Anomalies. An anomaly is a piece of
data that does not fit. It is usually an indication that your working assumptions
are wrong or that an unknown factor is affecting the results you have found.
Always review anomalies and try to figure out why they occurred. Something out
of the ordinary should not be automatically rejected as an aberration or even a
mistake.
If you spot a possible anomaly, of course you should first ensure
that it is not actually a mistake in the way the data was presented or
collected, such as transposed numbers or a misquotation. If it is not a mistake
in that sense, look for other data to indicate that this is something that is
true or could be true in the future.
What you are doing is actually attacking your assumptions by using
the anomaly to test them. The existence of an anomaly may indicate that your
basic assumptions about what is true or possible are not correct.
Keeping alert for anomalies has another benefit. Specifically, by
doing so, you help prevent yourself from falling into a common trap for those
involved in handling intelligence: the predisposition to subconsciously reject a
deviation from a known trend or situation until a new trend or situation has
been conclusively established.
Business Disinformation. Because CI involves
converting your data into a cohesive picture, you must beware of business
disinformation, more now than ever. [7] Business disinformation is something that looks like
information but is not. For CI purposes, business disinformation is defined as
incomplete or inaccurate information designed to mislead others about the originator's
intentions or abilities. Business disinformation is, however, not the same as puffing, a generally accepted form of advertising
"overstatement," which falls short of fraud. Business disinformation can be
created intentionally, to mislead competitors and others with
erroneous or exaggerated information. It can also be generated simply by concealing
relevant information. In each case, the business disinformation is aimed at
establishing false value judgments, creating erroneous impressions, diverting
attention from defects or problems, or hiding facts. This is just one way of
looking at business disinformation, based on its content (or lack of content). It
can also, though rarely, arise by accident.
Recognizing that business disinformation really exists and trying
to decide whether a competitor is using it can be complex. There are four possibilities:
- If you don't consider whether a key piece of data represents
  business disinformation and, in fact, it does, this failure can be destructive. Moreover, you may not recognize its
  destructive effect until it is too late to counteract it.
- If you look for the business disinformation, you may not
  spot it even if it is present. In that case, your intelligence analysis could be
  affected by the business disinformation, but in a direction and to a degree you
  cannot predict.
- You may find what you think is business disinformation when
  it is not really there. In that case, you simply become more suspicious about
  the credibility you assign to what is really accurate data and more reluctant to
  rely on it without further confirmation.
- You may be correct in spotting the business disinformation.
  In that case, handling it properly allows you to avoid its damaging effects on
  your CI analysis.
If you have identified data that appears to be business
disinformation, you should handle it as follows:
- Is the reason for your concern the source of the data or the
  nature of the data itself? If your concern is due to a questionable data source,
  you should look for other sources to verify the data, avoiding those that might
  produce a false confirmation. If your concern arises from the nature of the
  data itself, you should seek confirmation or contradiction from all sources,
  including the original source.
- Seek alternative sources of data to confirm, or discredit,
  the possible business disinformation. Be very sensitive to the danger of false
  confirmation here.
- If you are not sure whether the data is business
  disinformation, try to estimate the likelihood of its accuracy and then
  explicitly assign a probability of accuracy to it. This may allow you to use the
  data, even while there is a question about its validity.
- Analyze why the potential business disinformation was
  created or allowed to continue. If you cannot see a reason why the source would
  have created it or permitted it to exist, it may not be business disinformation.
  On the other hand, if you can determine why it may have been created or allowed
  to continue, you may not only have identified it as business disinformation, but
  you may now understand what the source was trying to accomplish.
- If there remains any question about critical, nonconfirmable
  data, it is generally better to treat it as business disinformation.
- Don't overreact. Be sensitive to the distinction between a
  good image and business disinformation. Remember, the success of any business
  disinformation initiative requires that you, the target, be willing to be
  deceived; that is, you are looking at your competitor through your
  preconceptions rather than considering alternatives.
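Two of the handling steps above, assigning an explicit probability of accuracy and treating doubtful critical data as disinformation, can be combined into a simple triage rule. This is only a sketch under stated assumptions: the function name, its three inputs, and the wording of the outcomes are hypothetical, and in practice the probability is a judgment you supply, not something the code computes.

```python
def triage(probability_of_accuracy: float,
           critical: bool,
           confirmed_independently: bool) -> str:
    """Suggest how to treat a piece of suspected business disinformation."""
    if not 0.0 <= probability_of_accuracy <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if confirmed_independently:
        # Confirmed by a source that does not share an origin with the first.
        return "use"
    if critical:
        # Critical data that cannot be confirmed is treated as disinformation.
        return "treat as disinformation"
    # Noncritical, unconfirmed data stays usable, with its doubt made explicit.
    return f"use with stated probability {probability_of_accuracy:.2f}"


print(triage(0.6, critical=False, confirmed_independently=False))
# use with stated probability 0.60
print(triage(0.9, critical=True, confirmed_independently=False))
# treat as disinformation
```

Writing the probability into the result keeps the question about validity visible to whoever later relies on the data.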