Predatory Data Science
Data Science isn't the only technology that can be employed to prey on vulnerable populations, but it can take such predation to the next level. More importantly, it can create predatory situations without the explicit knowledge of its operators.
In classical marketing, the archetypical vulnerable population is children. But in online advertising, where data science is often employed, determining the age of the audience is even more difficult (if it is even attempted) than in pre-Internet forms of media. For example, television commercials are more permissive during "late night" television, but on the Internet it's always daytime somewhere in the world.
But really, anyone, even normal adults, could be considered vulnerable depending upon their situation at the time. As MathBabe (Dr. Cathy O'Neil) pointed out last month, payday lenders targeted web surfers desperate for money with usurious interest rates. It's not clear what, if any, online advertising channels were used, let alone whether any machine learning or statistics were used. But even the most straightforward of PPC keyword monitoring and landing page direction would target the vulnerable with personalized messages, giving predatory lenders even more of an upper hand in persuasion.
And that's just money. Consider an opaque-box form of machine learning, such as a neural network, used by a gun manufacturer, that ends up targeting the suicidal by keying off words that tend to indicate depression. It doesn't even have to be guns -- it could be razor blades or pharmaceuticals.
Things can go awry like this because technology is dehumanizing. Technology can remove human-to-human contact, and one of the consequences is that ethics goes out the window.
10,000 year history of dehumanization
Data science is far from the first dehumanizing technology.
Although money was invented 10,000 years ago, it's amazing, even just reading the Little House on the Prairie series to my children, how infrequently money was used even 150 years ago. Now, the Ingalls family did live as a frontier family by choice, but it's still amazing how rarely money appears in the stories, and how things that are "boughten" are extra-special (with the assumption that most things are made by hand).
In its original and most basic form, money was a way to store up human labor. If someone didn't want to grow their own food, or didn't have the time, they could pay someone and food would magically appear. Money dilates time, so the buyer may have forgotten the hot summer under which the grower and seller toiled. Money severs the human connection between the consumer and the producer.
Transportation -- What Happens in Vegas Stays in Vegas
As transportation evolved, it allowed greater and greater degrees of anonymity, with the automobile allowing not only quick and independent travel to where one may be anonymous, but shelter as well. The musical The Music Man epitomizes using transportation to conduct unethical marketing -- the traveling salesman hitting the next unsuspecting town on the train line to ply his snake oil.
Communication transports voices more efficiently than transportation moves bodies, such that a mere voice (and now a mere e-mail or text message) can command armies of labor if enough money is stored up. If construction labor is involved, for example, no concern may be given to the safety and welfare of the construction workers.
Automation magnifies all of the above. The late Douglas Engelbart, who invented networked mouse-driven hypertext in the 1960s, noted that "...after a certain degree of quantitative change, you almost invariably go into qualitative change." The Supreme Court said something similar this past summer when it stated, "Cell phones differ in both a quantitative and a qualitative sense from other objects that might be kept on an arrestee's person," contrasting a phone search with looking at papers an arrestee might happen to be holding.
Data Science is a power magnification tool like automation, but taken to the next level. Data Science allows an e-mail to be well-honed for persuasive purposes, for example. Those with more powerful computers and more powerful algorithms can create an asymmetric situation, especially when coupled with the technologies above that avoid face-to-face interaction (though Google Glass may be an exception even to that).
Active vs. Passive Predation
The dehumanization described above is mere passive predation. Passive predation is extremely easy to fall into, such as by:
- ...using technology and forgetting the human element
- ...using opaque-box machine learning (such as neural networks). Perhaps ethical use of machine learning in situations of any sensitivity demands glass box self-explaining machine learning techniques such as decision trees.
- ...ignoring historic socially charged issues such as race.
- Underlining that active intent is not required for predation to be present, the U.S. Department of Justice brings lawsuits against consumer lending agencies based on statistical analysis and does not need to prove intent.
- Even if glass box machine learning is used, ethical decisions must be made as to how and whether to implement the results from machine learning. E.g. at Strata in February 2013, Jim Adler presented what he described as very disturbing results: that a person having dark skin color and having a traffic ticket is an almost-assured indicator of that person also committing a felony in the future (yes, he titled his presentation "thoughtcrime").
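To make the point about statistical analysis without proof of intent concrete, here is a minimal sketch of how an operator might audit a model's decisions for uneven outcomes across groups. It is not the Department of Justice's method. The function names, the made-up `decisions` data, and the 0.8 review threshold are all assumptions chosen for illustration, and the threshold is not a legal standard.

```python
from collections import defaultdict


def approval_rates(records):
    """Return the approval rate per group from (group, approved) pairs."""
    totals = defaultdict(int)
    approvals = defaultdict(int)
    for group, approved in records:
        totals[group] += 1
        if approved:
            approvals[group] += 1
    return {group: approvals[group] / totals[group] for group in totals}


def disparity_ratio(rates):
    """Ratio of the lowest group rate to the highest; 1.0 means parity."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody was approved, so no group is favored
    return min(rates.values()) / highest


# Hypothetical, made-up decisions produced by some model (illustration only).
decisions = (
    [("group_a", True)] * 80 + [("group_a", False)] * 20
    + [("group_b", True)] * 50 + [("group_b", False)] * 50
)

REVIEW_THRESHOLD = 0.8  # assumed cutoff for illustration, not a legal standard

rates = approval_rates(decisions)
ratio = disparity_ratio(rates)
needs_review = ratio < REVIEW_THRESHOLD

print(rates)
print(ratio, needs_review)
```

A check like this says nothing about why the outcomes differ. It only surfaces a disparity that a human then has to look at, which is the ethical decision the list above says cannot be delegated to the algorithm.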
Active predation, by contrast, is deliberate. As with any tool, data science can be intentionally used for evil. A fraud spammer using machine learning to improve "conversion rates" is an obvious example.
Slightly less obvious is an otherwise legitimate vendor selling otherwise legitimate products or services, but at significantly above-market prices, using machine learning to target less informed, less intellectually capable, less search-engine skilled, or just less savvy potential customers.
Positive Data Science
As with any technology, data science can be used for good or evil. The invention of the Bessemer Process to mass produce steel brought a proliferation of firearms, but it also brought surgical steel.
Although the Data Science Association Code of Conduct proscribes using Data Science to commit fraud or deceit, perhaps it should be amended to also specify explicitly "passive" forms of fraud, deceit and predation as described above.