Artificial intelligence is making it possible to mine Google Street View images for insights about the economy, politics and human behavior, just as text has been mined for years.
By Steve Lohr
What vehicle is most strongly associated with Republican voting districts? Extended-cab pickup trucks. For Democratic districts? Sedans.
Those conclusions may not be particularly surprising. After all, market researchers and political analysts have studied such things for decades.
But what is surprising is how researchers working on an ambitious project based at Stanford University reached those conclusions: by analyzing 50 million images and location data from Google Street View, the street-scene feature of the online giant’s mapping service.
For the first time, helped by recent advances in artificial intelligence, researchers are able to analyze large quantities of images, pulling out data that can be sorted and mined to predict things like income, political leanings and buying habits. In the Stanford study, computers collected details about cars, including makes and models, in the millions of images they processed.
“All of a sudden we can do the same kind of analysis on images that we have been able to do on text,” said Erez Lieberman Aiden, a computer scientist who heads a genomic research center at the Baylor College of Medicine. He provided advice on one aspect of the Stanford project.
For computers, as for humans, reading and observation are two distinct ways to understand the world, Lieberman Aiden said. In that sense, he said, “computers don’t have one hand tied behind their backs anymore.”
Text has been easier for AI to handle because words are built from a small set of discrete characters (26 letters, in the case of English). That makes text much closer to the natural language of computers than the freehand chaos of imagery. But image recognition technology, much of it developed by major technology companies, has improved greatly in recent years.
The Stanford project gives a glimpse of the potential. By pulling the vehicles’ makes, models and years from the images, and then linking that information with other data sources, the project was able to predict factors like pollution and voting patterns at the neighborhood level.
“This kind of social analysis using image data is a new tool to draw insights,” said Timnit Gebru, who led the Stanford research effort. The research has been published in stages, the most recent in late November in the Proceedings of the National Academy of Sciences.
In the end, the car-image project involved 50 million images of street scenes gathered from Google Street View. In them, 22 million cars were identified, classified into more than 2,600 categories by make and model, and located in more than 3,000 ZIP codes and 39,000 voting districts.
But first, a database curated by humans had to train the AI software to understand the images.
The researchers recruited hundreds of people to pick out and classify cars in a sample of millions of pictures. Some of the online contractors did simple tasks like identifying the cars in images. Others were car experts who knew nuances like the subtle difference in the taillights on the 2007 and 2008 Honda Accords.
“Collecting and labeling a large data set is the most painful thing you can do in our field,” said Gebru, who received her Ph.D. from Stanford in September and now works for Microsoft Research.
But without experiencing that data-wrangling work, she added, “you don’t understand what is impeding progress in AI in the real world.”
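As a rough illustration of how such a human-labeled collection could be used, here is a minimal sketch of fine-tuning an off-the-shelf image classifier on labeled car crops. The folder layout, class names and training settings are assumptions made for illustration, not details from the Stanford project.

```python
# Minimal sketch: fine-tune a pretrained CNN on human-labeled car images.
# The folder layout ("labeled_cars/<make_model_year>/*.jpg"), the class
# names and the hyperparameters are illustrative assumptions only.
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Each subfolder of "labeled_cars" holds crops for one car category,
# e.g. "honda_accord_2008"; ImageFolder turns folder names into labels.
dataset = datasets.ImageFolder("labeled_cars", transform=transform)
loader = DataLoader(dataset, batch_size=64, shuffle=True, num_workers=4)

# Start from ImageNet weights and replace the final layer so it predicts
# one of the car categories defined by the human annotators.
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, len(dataset.classes))

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(5):
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```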
Once the car-image engine was built, its speed and predictive accuracy were impressive. It successfully classified the cars in the 50 million images in two weeks. That task would take a human expert, spending 10 seconds per image, more than 15 years.
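That 15-year figure follows from simple arithmetic, sketched below.

```python
# Back-of-the-envelope check of the comparison above:
# 50 million images at 10 seconds of expert attention each.
seconds = 50_000_000 * 10                 # 500 million seconds
years = seconds / (60 * 60 * 24 * 365)    # convert seconds to years
print(round(years, 1))                    # roughly 15.9 years
```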
Identifying so many car images in such detail was a technical feat. But it was linking that new data set to public collections of socioeconomic and environmental information, and then tweaking the software to spot patterns and correlations, that makes the Stanford project part of what computer scientists see as the broader application of image data.
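To give a sense of what that linking step involves, here is a minimal sketch that joins a hypothetical table of image-derived car features, aggregated by ZIP code, to public census figures and measures a simple correlation. The file names and column names are placeholders, not the project’s actual data.

```python
# Minimal sketch of the linking step: join image-derived car features,
# aggregated by ZIP code, to public socioeconomic data and look for a
# correlation. File names and columns are hypothetical placeholders.
import pandas as pd

# One row per ZIP code: e.g. share of pickup trucks, average vehicle price,
# as estimated from the Street View classifications.
cars = pd.read_csv("car_features_by_zip.csv", dtype={"zip": str})

# One row per ZIP code from a public source such as the American Community
# Survey: e.g. median household income, vote share.
census = pd.read_csv("acs_by_zip.csv", dtype={"zip": str})

merged = cars.merge(census, on="zip", how="inner")

# A simple Pearson correlation between a car attribute and an outcome.
r = merged["pickup_share"].corr(merged["republican_vote_share"])
print(f"correlation between pickup share and GOP vote share: {r:.2f}")
```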
“There has been an explosion of computer vision research, but so far the societal impact has been largely absent,” said Serge Belongie, a computer scientist at Cornell Tech. “Being able to identify what is in a photo is not science that advances our understanding of the world.”
The Stanford car project generated a host of intriguing connections, if not startling revelations. In the most recent paper, and one published earlier in the year by the Association for the Advancement of Artificial Intelligence, these were among the predictive correlations:
— The system was able to accurately predict income, race, education and voting patterns at the ZIP code and precinct level in cities across the country.
— Judging by car attributes (including miles-per-gallon ratings), the greenest city in America is Burlington, Vermont, while Casper, Wyoming, has the largest per-capita carbon footprint.
— Chicago is the city with the highest level of income segregation, with large clusters of expensive and cheap cars in different neighborhoods; Jacksonville, Florida, is the least segregated by income.
— New York is the city with the most expensive cars. El Paso, Texas, has the highest percentage of Hummers. San Francisco has the highest percentage of foreign cars.
Other researchers have used Google Street View data for visual clues about factors that influence urban development, ethnic shifts in local communities and public health. But the Stanford project appears to have used the most Street View images in the most detailed analysis so far.
The significance of the project, experts say, lies in its proof of concept: that new information can be gleaned from visual data with artificial intelligence software and plenty of human help.
The role of such research, they say, will be mainly to supplement traditional information sources like the government’s American Community Survey, the household survey conducted by the Census Bureau.
This kind of research, if it expands, will raise issues of data access and privacy. The Stanford project made predictions only about neighborhoods, not about individuals. But privacy concerns about Street View pictures have been raised in Germany and elsewhere. Google says it handles research requests for access to large amounts of its image data on a case-by-case basis.
Onboard cameras in cars are just beginning to proliferate, as auto companies seek to develop self-driving cars. Will some of the vast amounts of image data they collect be available for research or kept proprietary?
Kenneth Wachter, a professor of demography at the University of California, Berkeley, said image-based studies could be a big help now that public response rates to sample surveys are declining. An AI-assisted visual census, he said, could fill gaps in current data and provide more timely insights than the traditional census, conducted every 10 years, on hot topics in public policy like “the geography and evolution of disadvantage and opportunity.”
To Nikhil Naik, a computer scientist and research fellow at Harvard who has used Street View images in the study of urban environments, the Stanford project points toward the future of image-fueled research.
“For the first time in history, we have the technology to extract insights from very large amounts of visual data,” Naik said. “But while the technology is exciting, computer scientists need to work closely with social scientists and others to make sure it’s useful.”