What if Earth’s simplest organisms were also the smartest? What if we could learn new computational methods from them? Biomimicry is the adoption of a biological innovation to solve a human problem. Neural networks are an example of this as applied to artificial intelligence. Bacteria are keystone organisms that can do virtually anything. As someone greatly interested in tech innovation and a trained biologist, I find this area really exciting! I’m launching a literature review in which I look at research with bacteria that could inform the way we pursue artificial intelligence.
Our Tiny Cousins
Bacteria are the simplest organisms on the planet. They form the lowest branch on the shrub of life. Ancestral bacterial cells probably were not the first life on the planet, but they surely are an early lineage. They are single-celled and lack organelles (complex internal structures), but they are ubiquitous: found at the bottom of the oceans, in the hottest thermal pools, in the coldest corners of Antarctica, inside nuclear reactors, and more.
Bacteria have special powers that make them well suited for this type of study. There is tremendous diversity of species, meaning a variety of genes to work with. They replicate fast. Many species can trade genes and acquire genes from their environment (packed into circular molecules called plasmids), and these capabilities can be used to add genes artificially. For their many properties, bacteria are already used industrially (e.g., consuming wastes) and medically (e.g., detecting carcinogens with the Ames test).
This series of posts will explore current research in biomimicry and bacteriology as they relate to computers and AI. Posts will go up intermittently, about once every two weeks.
Have you ever seen a robot or animation that was pretty lifelike, but not quite there? Did it frighten or repulse you? If so, you found yourself in the uncanny valley.
The uncanny valley describes the phenomenon whereby we seem to dislike humanoids, either rendered or three dimensional, that are close to lifelike but fall just short. Our affinity for these objects follows a curve. When an object has no humanlike traits, it fails to charm us. As humanlike attributes are added to the device, perhaps cameras and a speaker in the shape of eyes and a mouth, we are endeared by it. Closely approximating human form, however, becomes a little much, such that we are not merely uninterested but actively put off by the machine. This is the uncanny valley. Leaping the valley to fully lifelike features (as in an actual person), we are again attracted. Mathur and Reichling recently tested this empirically.
Enter Ivan Ivanovich, one of the top five most upsetting things I’ve ever seen. He falls well within the uncanny valley, down to having eyelashes. Ivan is a mannequin that the Soviet space program used to test Vostok spacecraft systems, including the ejection system; he also carried experiments inside him on two test spaceflights. I imagine a terrified technician walking into a dark room, turning on the lights, and finding Ivan sitting there, eyes fixed on him. His uncanny qualities are certainly reasonable: to understand how spaceflight might affect the human body, the test subject must be as close an approximation as possible. When he parachuted to earth, the peasants who found him believed he was a downed American spy plane pilot (Francis Gary Powers, a U-2 pilot, was shot down and captured shortly before) and attempted to take him prisoner.
The uncanny valley seems to be a paradox: how is it that we are uncomfortable with more humanlike animations or robots? Many camps have weighed in on this: hypotheses have roots in esthetics, psychology, and biology.
Having a background in biology, I’m inclined toward the evolutionary explanation of pathogen avoidance. Humans are primed to steer clear of anything that we associate with disease. It’s no accident that we are repelled by the sight and smell of vomit; rather, it’s the product of natural selection. Somewhere in our evolutionary history, some of our ancestors were exposed to vomit; those who kept their distance were more likely to avoid contracting a (possibly deadly) infection, passing the trait on to their progeny, while those who were not repelled by the vomit were less lucky. To that end, when we see something in a person that doesn’t look right, that resembles a sign of infectious disease, we recoil. I believe that the uncanny valley represents our recoiling from figures that are just human enough to trick our instincts into believing they’re real, but diseased. (As an aside, this evolutionary argument is not a justification for shunning or ignoring the humanity of those who have, or appear to have, an illness.) Further, that discomfort with figures in the uncanny valley is an automatic reaction lends credibility to it being an instinctual rather than cerebral reaction, in much the same way that pulling one’s hand away from something hot requires no thought about the temperature or consequences of touching the object. Tybur and Lieberman wrote an excellent piece examining the function of disgust.
Where Are We Headed
What does the visual uncanny valley portend? Its very existence raises the question of whether it is a manifestation of a larger phenomenon. In other words, does the uncanny valley extend beyond just the visual realm and into the cognitive? Would a computer’s “thoughts” be off-putting to us if they were a close, but not perfect, analog of human cognition? If so, this could be a major stumbling block to widespread AI adoption. After all, a robot can be designed away from human appearance, but insofar as AI aims to mimic human thought, there may be no way around it. It all might be moot, however. It seems a younger generation, digital natives, are less bothered by the uncanny valley. Perhaps bridging the uncanny valley is all a matter of familiarity.
Imagine asking, “Hey Alexa, how do I remove a mustard stain?” This tells Amazon that a consumer is washing clothing right now and might be interested in stain removal products (and might even like mustard). Alexa search data offers Amazon more insight into consumer behavior than ever.
Let’s take a step back to Amazon’s earlier foray into extending a feeler into users’ lives: the Dash button. Sure, it made restocking easy, with the ability to place the button at the point of use, perhaps the washing machine for the Tide button. The characteristics of the products associated with the Dash buttons are important. They are usually bundles of individually used items (e.g., laundry detergent pods), at a low unit cost, used over several weeks. At first it seemed that the buttons were purely a matter of convenience, but when Amazon rolled out the Trojan Dash button something leapt out. The products tied to the buttons have another similarity in that they often have a pattern to their use: same time of day, same time of the month, etc. With a point of use button, which the consumer would reasonably push immediately upon using the last, or one of the last, products in the container, Amazon has an opportunity to peer into these use cycles with incredible specificity. The Dash button is all about the data. (The Dash button is for sale, currently $4.99, but conveys an equal credit toward purchases of the tied products, making it effectively free; on the internet, when the product is free, you are the product.)
Back to the condoms. Upon seeing these buttons on offer, I realized that their use patterns, either alone or in concert with other product purchase trends, could tell the company how often, when, and where consumers engage in sexual activity. Amazon may use this data to infer changes in relationship status and/or a decision to have kids. Foolproof? No, but it furnishes more insight. Similarly, maybe Amazon uses laundry detergent data to estimate household size. We can’t know, but it stands to reason that Amazon might.
The Amazon Advantage
Amazon’s Alexa and Google’s Assistant are competing toe to toe, with Assistant winning search interactions and Alexa winning purchase transactions. Both companies are search engines that put products before consumers, with different methods of doing so. Google must understand its consumers so that it can deliver their eyeballs efficiently to advertiser content. Amazon, likewise, must know its customers’ likes so as to serve them the most desirable product listings. Because Amazon sells directly to consumers, it is well situated to monetize Alexa interactions, both directly through purchase transactions and indirectly through insights gleaned from search queries. Thinking back to the hypothetical mustard stain query, Amazon might respond to a request to order stain remover by ranking one ideally suited for mustard at the top of the list. Google could learn the same facts from our hypothetical search through Assistant, but must take additional steps to generate revenue for the company.
We can be sure that Amazon will continue to exploit the troves of data it gathers through Alexa interactions in ways we haven’t imagined.
We can debate whether a computer has artificial intelligence, but this raises the larger question of the meaning of intelligence. This article is hardly the place to review the theories behind intelligence; you’d be reading forever. I like defining intelligence as the ability to solve complex problems with creativity by gathering information, developing knowledge, and executing ideas. Researchers posit a number of areas of intelligence; without going into all of the proposed intelligence types, examples include linguistic, artistic, and numeric, among many more. This raises the interesting question of whether one can be intelligent if he or she excels in some categories but lags in others. Psychologist Charles Spearman’s research in the early 1900s identified g-factor as an underlying general intelligence, a high-level concept driving performance on discrete measures. G-factor manifests as the correlation in performance on the discrete intelligence measures; intelligence in one area suggests intelligence in other areas. As an aside, Spearman, having used tens of intelligence metrics, developed factor analysis, whereby several variables are examined to determine whether they move together and thus may be under the control of some other (perhaps unmeasured) driver.
We run into a problem when considering artificial intelligence in the context of different forms of intelligence. Computers are clearly capable on a mathematics ability axis when one considers how numeric intelligence is measured (i.e., solving math problems); however, they fall short with art (screenplays written by computers are more comedy than drama!). Perhaps we need a method of arriving at a computer’s g-factor, if artificial intelligence can even be described with a g-factor.
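To make "variables that move together" concrete, here is a minimal Python sketch. The test scores are invented for illustration and are not Spearman's data, and `pearson` is a helper defined here. It computes only pairwise correlations, which is the raw material for factor analysis rather than factor analysis itself.

```python
from itertools import combinations
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for five people on three tests (invented numbers).
scores = {
    "verbal":  [62, 70, 75, 81, 90],
    "numeric": [58, 72, 71, 85, 88],
    "spatial": [60, 65, 78, 80, 93],
}

# Correlation for every pair of tests.
correlations = {
    (a, b): pearson(scores[a], scores[b])
    for a, b in combinations(scores, 2)
}

for (a, b), r in correlations.items():
    print(f"{a} vs {b}: r = {r:.2f}")
```

In this made-up data every pair of tests is positively correlated, which is the kind of pattern Spearman attributed to a shared underlying factor.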
Defining Artificial Intelligence
Given the complexity of defining intelligence, what can we say of artificial intelligence? I propose that rather than defining artificial intelligence as binary, with a system either having artificial intelligence or not, we should consider a system as having intelligence on continua along multiple axes.
Under such a paradigm, a computer employed to solve Ito calculus problems such as predicted rocket flight trajectories might score very highly on numeric ability but poorly on self awareness. Self aware robots, likewise, may perform well on inter- and intrapersonal intelligence, but poorly on mathematical intelligence. To measure these systems’ intelligence requires a global review of their skills. Maybe this is accomplished by scoring each metric (how many is yet to be determined) and taking an average. Or maybe it requires accepting that there are too many facets of artificial intelligence to reduce it to a single value.
This is more than an academic exercise. Because artificial intelligence is of great interest to consumers, researchers, product designers, healthcare, industry, government and military, and more, we must have a uniform definition, scoring system, and vocabulary to communicate it.
Google Plus Photos is an excellent service for storing, editing, and sharing pics taken with your phone. Unlimited free storage for compressed files, adequate for most smartphone cameras, along with instant upload and in-app editing and sharing, makes using it a no-brainer. If you use a DSLR or otherwise wish to store super sized files, you can dip into your free storage or purchase more. (I’ve never noticed quality problems with my photos, and I allow my files to be compressed so as to qualify for free, unlimited storage.)
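As a rough sketch of the "score each metric and average" idea, consider the Python below. The axes, the 0-to-1 scale, and every number are assumptions invented for illustration; no established scoring system is implied.

```python
from statistics import mean

# Hypothetical 0-to-1 scores on three axes; axes and values are invented.
systems = {
    "trajectory_solver": {"numeric": 0.95, "linguistic": 0.10, "self_awareness": 0.05},
    "social_robot":      {"numeric": 0.20, "linguistic": 0.45, "self_awareness": 0.45},
}

def average_score(profile):
    """Collapse a multi-axis profile into one number by simple averaging."""
    return mean(profile.values())

for name, profile in systems.items():
    print(f"{name}: {profile} -> average {average_score(profile):.2f}")
```

Both made-up profiles average to about 0.37 even though they share almost nothing, which hints at why a single value may hide too much.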
Downsampling is the process through which a large image is compressed. It works by taking several very small pieces of the image and combining them. Imagine a checkerboard where, in full resolution, each cell is rendered either black or white. In downsampling several squares will be combined to yield fewer, larger blocks of some intermediate shade. Through this process, the file shrinks in size as it is called upon to store fewer pieces of information. The cost is blurred lines and muted colors.
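The checkerboard example can be sketched in a few lines of Python. This assumes the simplest possible scheme, averaging each square block of pixels, and image dimensions that divide evenly by the factor; real photo services may well use more sophisticated methods. `downsample` is a name chosen here.

```python
def downsample(image, factor):
    """Average each factor-by-factor block of a grayscale image into one pixel.

    Assumes the image height and width are multiples of factor.
    """
    height, width = len(image), len(image[0])
    result = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            block = [
                image[y][x]
                for y in range(top, top + factor)
                for x in range(left, left + factor)
            ]
            row.append(sum(block) / len(block))
        result.append(row)
    return result

# 4x4 checkerboard: 0 is black, 255 is white.
checkerboard = [[255 * ((x + y) % 2) for x in range(4)] for y in range(4)]
small = downsample(checkerboard, 2)
print(small)  # [[127.5, 127.5], [127.5, 127.5]]
```

Sixteen values shrink to four, and the crisp black and white pattern becomes a uniform mid-gray: smaller, but blurred and muted.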
In a crime drama, an investigator may easily ‘enhance’ a pixelated license plate image, for example, to yield crisp numbers. This makes for a great show, but in reality, it’s more likely that the human eye interprets the license plate number from a larger picture. Whereas downsampling takes fewer ‘samples’ of an image so as to represent it in fewer pixels, upsampling (interpolation) is the process of going from a low quality image to a higher quality rendering.
Humans can (somewhat) follow the lines of the image, block by block, to fill in the missing curves and sharpen colors in the mind. Asking a computer to do so is a taller order.
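To see why, here is a minimal Python sketch of the most naive form of upsampling, nearest-neighbor, which simply repeats each pixel. The function name and the sample image are chosen here for illustration.

```python
def upsample_nearest(image, factor):
    """Enlarge a grayscale image by repeating each pixel factor times in each direction."""
    result = []
    for row in image:
        wide_row = [pixel for pixel in row for _ in range(factor)]
        # Repeat the widened row; copy it so the output rows are independent lists.
        result.extend(list(wide_row) for _ in range(factor))
    return result

# A 2x2 all-gray image, as you would get from averaging a 4x4 checkerboard.
small = [[127.5, 127.5], [127.5, 127.5]]
large = upsample_nearest(small, 2)
for row in large:
    print(row)
```

The result has sixteen pixels again, but all of them are the same gray. The pixel count comes back; the lost pattern does not, unless something supplies knowledge of what the original probably looked like.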
Computers lack the human intuition to say that a fuzzy figure is a ‘3’ or an ‘8’ in a grainy picture. But what if computers could be trained to recognize the patterns that result from downsampling various shapes? Then could they backfill the missing detail to sharpen up those compressed pictures? Enter machine learning.
Google is training its brain to recognize just such patterns so that it can fill in detail missing from compressed images. Its process is RAISR, or Rapid and Accurate Image Super-Resolution. Pairs of images, one high resolution and one low resolution, are used to train Google’s computers. The computers search for a function that will, pixel by pixel, convert the low resolution image back to (or close to) the original high resolution image. After training, when their computers see a low resolution photo, they hash it. In hashing, each piece of information is combined through a mathematical operation to come up with a value, the hash value, that can be compared against the hash values of other, known images which were computed similarly. From this comparison, Google’s computers ascertain which function is required to convert the particular image (or perhaps piece of image) back to high resolution.
We can imagine a schema where a low resolution image is downloaded to a user device and hashed locally on the phone or other device. The device could then send the hash value back to the Google mother ship, retrieve the required formulas, and implement them locally, generating a very high quality picture. Google says the process will be something along these lines, cutting file size by 75%.
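The "hash, then look up a function" step might look something like the Python sketch below. It is a toy under stated assumptions, not Google's implementation: the hash here uses only the direction of brightness change in a 3x3 patch, the bucket count is arbitrary, and the filters are identity placeholders standing in for whatever training would produce. All names are hypothetical.

```python
from math import atan2, pi

NUM_BUCKETS = 4  # assumed bucket count, chosen only for illustration

def patch_hash(patch):
    """Map a 3x3 patch to a bucket index from its overall edge orientation."""
    gx = sum(row[2] - row[0] for row in patch)              # left-to-right change
    gy = sum(patch[2][c] - patch[0][c] for c in range(3))   # top-to-bottom change
    angle = atan2(gy, gx) % pi                               # orientation in [0, pi)
    return int(angle / pi * NUM_BUCKETS) % NUM_BUCKETS

# Placeholder filters: a real system would learn one filter per bucket
# from pairs of low and high resolution images.
IDENTITY = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
FILTERS = {bucket: IDENTITY for bucket in range(NUM_BUCKETS)}

def enhance_pixel(patch):
    """Pick the filter for this patch's bucket and apply it as a weighted sum."""
    kernel = FILTERS[patch_hash(patch)]
    return sum(patch[r][c] * kernel[r][c] for r in range(3) for c in range(3))

vertical_edge = [[0, 0, 255]] * 3
horizontal_edge = [[0, 0, 0], [0, 0, 0], [255, 255, 255]]
print(patch_hash(vertical_edge), patch_hash(horizontal_edge))
```

The two edge patches land in different buckets, so each could be sharpened by a filter suited to its orientation. The design point is that the expensive learning happens once, while enhancing a new image needs only a cheap hash and a table lookup.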
The Next Step
What could Google have in mind with this technology? Clearly they are deploying it to allow full resolution Google Photos image downloads with a lower data burden. But is there anything else? Perhaps they see it used more universally with Chrome, whereby any picture on the web is compressed, downloaded, and then upsampled, making webpages load faster. Or perhaps they will pair it with their unlimited photo storage option, allowing users to store a ‘pseudo’ high resolution photo that exists in the ether as a compressed file, but appears on the screen as full size.
American consumers are beginning to earnestly adopt smart home technology. CNET reports that market researcher Juniper believes that this year (2017) will see $83 billion in revenue from smart home devices and services. With the US population at 319 million (unclear if the estimates are for US or worldwide sales), that is about $260/person! Annual business, they forecast, will hit $195 billion by 2021.
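The arithmetic behind these figures is easy to check in Python. The per-person number assumes, as noted, that all the revenue is US revenue; the growth rate assumes steady compound growth from 2017 to 2021, which is my simplification and not something the report states.

```python
revenue_2017 = 83e9     # dollars, estimate for 2017
revenue_2021 = 195e9    # dollars, forecast for 2021
us_population = 319e6   # people

# Revenue per person if every dollar were spent in the US.
per_person = revenue_2017 / us_population

# Constant yearly growth rate that would turn the 2017 figure into the 2021 one.
years = 2021 - 2017
annual_growth = (revenue_2021 / revenue_2017) ** (1 / years) - 1

print(f"Per person: ${per_person:.0f}")
print(f"Implied annual growth: {annual_growth:.1%}")
```

This works out to roughly $260 per person and an implied growth rate of a little under 24% per year.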
The figures include consumer and business to business sales, but the report holds that smart appliance and home automation will be among the biggest winners.
Google, Amazon, Apple, and Samsung are expected to gain ground in the sector. They have both a base of existing devices and access to cloud services. This is true especially of Amazon with its well received Echo with smart assistant Alexa and its lucrative cloud business line.
The ability of the companies to engage third parties marketing complementary goods and services will, in my mind, be pivotal in the firms’ success or failure. Moreover, whether the companies can build technologies that integrate seamlessly into our lives will prove essential to success. For example, a refrigerator that automatically prompts me to buy milk when my carton runs low adds more value than a refrigerator that will tell me how much milk I have only when asked.
The surging demand for smart home technology will induce more players to bring innovative solutions to market in the coming years. Who will succeed and who will fall by the wayside? Time will tell.
Disengagement frequency is a key measure of self driving car success, tracked by the state of California. A disengagement occurs when a self driving car cedes control to its human shepherd; self driving car manufacturers (with vehicles on the road in California) report total mileage operated and the number of disengagements. Dividing the mileage by the number of disengagements gives the mean miles between disengagements, where higher is better.
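The calculation is a single division, sketched here in Python. The manufacturer names and figures are hypothetical placeholders, not numbers from the California reports.

```python
def miles_between_disengagements(total_miles, disengagements):
    """Mean miles driven per disengagement; higher is better."""
    if disengagements == 0:
        return float("inf")  # no disengagements reported over these miles
    return total_miles / disengagements

# Hypothetical report figures: (total miles, number of disengagements).
reports = {
    "Maker A": (100_000, 20),
    "Maker B": (5_000, 125),
}

for maker, (miles, count) in reports.items():
    print(f"{maker}: {miles_between_disengagements(miles, count):,.0f} miles")
```

With these made-up inputs, Maker A averages 5,000 miles between disengagements and Maker B averages 40.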
Waymo vehicle reliability, as reported by Forbes, blows the other manufacturers out of the water.
Between September 2014 and December 2015, Waymo cars made it, on average, 5318 miles between disengagements while Delphi cars averaged 42 miles, and Nissan only 14 miles. Waymo reliability over that period improved from the prior period by a factor of almost seven.
At a high level, disengagement frequency doesn’t differentiate by cause or severity. For example, a human operator override due to the self driving car being overly cautious would be lumped in with an override due to the car failing to react to a hazard. California DMV data shows that the most common disengagement incurred by Waymo vehicles, occurring 35% of the time, was due to a “perception discrepancy” defined as “a situation in which the [self driving car]’s sensors are not correctly perceiving an object (e.g., perceiving overhanging branches as an obstacle)”.
On the Learning Curve
I have long wondered how a self driving car will respond to stop signs. Specifically, when arriving at a fixed stop sign, the driver is free to proceed once she has verified that she has the right of way and it is safe to enter the intersection. However, when she approaches a stop sign held by a flagger, she is to stop as long as the stop sign is displayed, regardless of whether there are oncoming cars. A self driving car must both perceive the stop sign and understand the context in which it is displayed.
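The two cases can be written as a toy decision rule in Python. This sketch assumes the hard part, perceiving the sign and classifying its context, has already been done, and it encodes only the distinction described above. The names are hypothetical and it is nowhere near a real driving policy.

```python
from enum import Enum

class StopSignKind(Enum):
    FIXED = "fixed"      # permanent sign at an intersection
    FLAGGER = "flagger"  # sign held by a person directing traffic

def may_proceed(kind, sign_displayed, has_right_of_way, intersection_clear):
    """Decide whether a vehicle that has already stopped may move on."""
    if kind is StopSignKind.FLAGGER:
        # A held sign means wait for as long as it is shown,
        # even if the road ahead looks empty.
        return not sign_displayed
    # A fixed sign never goes away; proceed once it is safe to do so.
    return has_right_of_way and intersection_clear

# Same clear road, same visible sign, different answers.
print(may_proceed(StopSignKind.FIXED, True, True, True))    # True
print(may_proceed(StopSignKind.FLAGGER, True, True, True))  # False
```

The inputs to the two calls are identical except for the kind of sign, which is exactly the context a self driving car has to infer on its own.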
We can infer a steep learning curve for self driving cars and their developers. Relatively poor disengagement performance by Delphi and Nissan isn’t necessarily a death knell. To the extent that they can learn from their disengagements, they may be able to recover.
How do Waymo vehicles compare to human drivers? This is a tough one. Keep in mind that disengagements include scenarios that could lead to a collision. People do not necessarily notice (or if they do, they may deny) their near misses, whereas Waymo vehicles and their drivers do. It’s natural, and potentially dangerous, to overestimate one’s driving skills; when asked who in a room is an above average driver, most hands would go up. Human drivers do not have bona fide co-drivers (backseat drivers, maybe) who could take over in an unsafe situation in the same way that Waymo cars do. All told, we could piece together collision and traffic violation data to get an estimate of human driver capability, but this would no doubt be an overstatement.