Data Mining May Not Work As Anti-Terrorist Tool: Report

While ‘data mining can help reveal patterns and relationships, it does not tell the user the value or significance of these patterns.’

By Cliff Montgomery – July 27th, 2007

A Congressional Research Service Report updated on June 5th, 2007 examines the limitations of data mining as a terrorist-catching tool.

We quote from this report below:

“Data mining has become one of the key features of many homeland security initiatives. Often used as a means for detecting fraud, assessing risk, and product retailing, data mining involves the use of data analysis tools to discover previously unknown, valid patterns and relationships in large data sets.

“Consequently, data mining consists of more than collecting and managing data; it also includes analysis and prediction.

“In the context of homeland security, data mining [is being used as] a potential means to identify terrorist activities, such as money transfers and communications, and to identify and track individual terrorists themselves, such as through travel and immigration records.

“While data mining represents a significant advance in the type of analytical tools currently available, there are limitations to its capability.

“One limitation is that although data mining can help reveal patterns and relationships, it does not tell the user the value or significance of these patterns. These types of determinations must be made by the user.

“A second limitation is that while data mining can identify connections between behaviors and/or variables, it does not necessarily identify a causal relationship. Successful data mining still requires skilled technical and analytical specialists who can structure the analysis and interpret the output.

“In the public sector, data mining applications initially were used as a means to detect fraud and waste, but have grown to also be used for purposes such as measuring and improving program performance.
However, some of the homeland security data mining applications represent a significant expansion in the quantity and scope of data to be analyzed.

“Some efforts that have attracted a higher level of congressional interest include the Terrorism Information Awareness (TIA) project (now discontinued) and the Computer-Assisted Passenger Pre-screening System II (CAPPS II) project (now canceled and replaced by Secure Flight).

“Other initiatives that have been the subject of congressional interest include the Multi-State Anti-Terrorism Information Exchange (MATRIX), the Able Danger program, the Automated Targeting System (ATS), and data collection and analysis projects being conducted by the National Security Agency (NSA).

“As with other aspects of data mining, while technological capabilities are important, there are other implementation and oversight issues that can influence the success of a project’s outcome.

“One issue is data quality, which refers to the accuracy and completeness of the data being analyzed.

“A second issue is the inter-operability of the data mining software and databases being used by different agencies.

“A third issue is mission creep, or the use of data for purposes other than for which the data were originally collected.

“A fourth issue is privacy.

“Questions that may be considered include the degree to which government agencies should use and mix commercial data with government data, whether data sources are being used for purposes other than those for which they were originally designed, and possible application of the Privacy Act to these initiatives. It is anticipated that congressional oversight of data mining projects will grow as data mining efforts continue to evolve.

“While data mining products can be very powerful tools, they are not self-sufficient applications.

“[For instance], the validity of the patterns discovered is dependent on how they compare to ‘real world’ circumstances.
[…] While possibly re-affirming a particular profile, [a certain discovery] does not necessarily mean that the application will identify a suspect whose behavior significantly deviates from the original model.

“Another limitation of data mining is that while it can identify connections between behaviors and/or variables, it does not necessarily identify a causal relationship.

“Beyond these specific limitations, some researchers suggest that the circumstances surrounding our knowledge of terrorism make data mining an ill-suited tool for identifying (predicting) potential terrorists before an activity occurs.

“Successful ‘predictive data mining’ requires a significant number of known instances of a particular behavior in order to develop valid predictive models. For example, data mining used to predict types of consumer behavior (i.e., the likelihood of someone shopping at a particular store, the potential of a credit card usage being fraudulent) may be based on as many as millions of previous instances of the same particular behavior.

“Moreover, such a robust data set can still lead to ‘false positives’ [errors].

“In contrast…a CATO Institute report suggests that the relatively small number of terrorist incidents or attempts each year are too few and individually unique ‘to enable the creation of valid predictive models.’”
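
The report's point about false positives can be made concrete with some simple arithmetic. The Python sketch below is only an illustration: every figure in it (the population size, the number of real targets, the detection rate and the false-alarm rate) is a hypothetical value chosen for this example and does not come from the CRS or CATO reports, and the function name `screening_outcomes` is likewise invented here.

```python
def screening_outcomes(population, true_targets, sensitivity, false_positive_rate):
    """Return (true positives, false positives, precision) for a screening model.

    sensitivity: share of real targets the model correctly flags.
    false_positive_rate: share of everyone else the model flags in error.
    """
    non_targets = population - true_targets
    true_pos = true_targets * sensitivity
    false_pos = non_targets * false_positive_rate
    # Precision: of all the people flagged, what share are real targets?
    precision = true_pos / (true_pos + false_pos)
    return true_pos, false_pos, precision


# All inputs are hypothetical, picked only to show how the arithmetic works.
tp, fp, precision = screening_outcomes(
    population=300_000_000,     # assumed number of people screened
    true_targets=1_000,         # assumed number of real targets
    sensitivity=0.99,           # assumed: model catches 99% of real targets
    false_positive_rate=0.01,   # assumed: model wrongly flags 1% of everyone else
)

print(f"true positives:  {tp:,.0f}")
print(f"false positives: {fp:,.0f}")
print(f"share of flagged people who are real targets: {precision:.4%}")
```

Under these assumed figures, a model that is right 99 percent of the time would still flag roughly three million people in error while correctly flagging 990 real targets, so well under one in a thousand flagged people would be an actual target. Different inputs give different totals, but the pattern holds whenever the behavior being searched for is very rare.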
