Exploring Nature Through Imageomics with Professor Tanya Berger-Wolf

By Erica Joo and Qining Wang

We recently spoke with Professor Tanya Berger-Wolf, a pioneer who is leading a team to establish the new field of imageomics. She is a computational ecologist and the director and co-founder of the nonprofit organization “Wild Me.” Berger-Wolf is also the Director of the Translational Data Analytics Institute (TDAI) and a Professor of Computer Science and Engineering; Electrical and Computer Engineering; and Evolution, Ecology, and Organismal Biology at The Ohio State University.

Tanya Berger-Wolf

Observation is fundamental to any biological research. The development of optics technology, such as the inventions of the microscope and the telescope, allowed biologists to observe the world at different scales, from animals living in jungles of millions of acres to DNA in animal cells of several micrometers.

However, as Prof. Berger-Wolf pointed out, those inventions only serve to “augment our ability to look” or “look at more things more carefully.” We are still making observations and searching for patterns with our own eyes, from which arises the caveat: We are not so good at finding patterns when things appear to be random, or when patterns are rare, sparse, subtle, or complex. We can’t answer, for example, whether the stripe patterns of mother zebras are similar to their babies’. The patterns appear to be too similar and too random at the same time to our eyes because human brains did not evolve to “take [the stripe patterns] holistically and quantify them in any meaningful way.”

And that’s where imageomics comes in. Imageomics takes its name and its approach from genomics, a field in which researchers study the biology of an organism or a species through its genetic information. In a similar vein, imageomics aims to understand nature through biological information extracted from images.

Computers are the perfect information extractors, because they “perceive” the world differently. Computers can quantify images down to pixels and find patterns that humans do not, or cannot, comprehend. Berger-Wolf pointed out that imageomics, as a “whole new field of science,” allows scientists to answer biological questions that weren’t answerable before because it provides scientists with a new way of observing nature.

The complementary vision of computers is especially prominent in the studies of biological traits, according to Berger-Wolf. Biological traits are the interplay between genes and the environment. They can be physical characteristics such as “beak colors, stripe patterns, fin curvatures, the curves of the belly or the back.” They can also be behavioral characteristics such as possums playing dead or pollen feeding in birds. Being able to observe traits “is the foundation of our understanding of how these traits are inherited and the understanding of genetics,” insights into animal behavior, and ecological and evolutionary theories.

In order for biologists to propose new evolutionary hypotheses to explain biological traits, it is crucial to “make these traits computable.” Starting from a project funded by the National Science Foundation, Berger-Wolf founded Wild Me. This nonprofit organization has an ongoing initiative, Wildbook, that collects images containing animals from numerous sources, including camera traps, drones, and even tourists’ social media posts on YouTube, Instagram, and Flickr.

Those source images serve as a starting point for a branch of research in imageomics, which will allow researchers to develop open software and artificial intelligence for the research community. Those tools would allow biologists to discern biological traits that are too similar or too subtle for their eyes, such as animal coat patterns or species that look alike yet are genomically different. Computer vision would allow scientists to find out whether traits are inheritable or shared by multiple species. Based on those new insights, biologists could then formulate new evolutionary hypotheses and start asking even more interesting questions, to which only imageomics can provide the answers.

Berger-Wolf jokes that she has “multiple research personality,” with a passion for bringing her diverse backgrounds together. Helping to found the new Imageomics Institute allowed her interests to converge. Participating in both worlds, natural and technical, allows her to see “the better way” of working and to increase effectiveness.

She commented that starting conversations between fields increases “mutual respect and understanding of each other’s questions and where we can come together.” Berger-Wolf sums up her career by describing her work as “creating tools that expand our ability to look at more things more carefully and even be able to ask questions that people have never been able to ask before.”

Berger-Wolf is currently working on several projects. One looks at animal coat patterns and correlates them with genetics and heritability, to explain why some traits are inheritable and others are not. Because humans cannot pay attention to every detail, imageomics allows us to understand these patterns at a deeper level. In another project, she is working on species-level traits of butterflies that mimic other species. Computer algorithms can identify what is similar and different in their appearances, down to the small details. Computers can extract complex information, and people can start asking different questions using information normally beyond the scope of human perception.

Berger-Wolf’s recent award for the new Imageomics Institute under the NSF Harnessing the Data Revolution program is extending this work and bringing it to a wider audience. The images to be used as sources come from existing research projects, citizen scientists, organizations like iNaturalist, eBird, and Wild Me, as well as the digitization of the natural history museum collections through the iDigBio project.

There are various opportunities for students at any level and researchers from all over the world to participate in the field of imageomics. Berger-Wolf emphasized that the goal is to have people understand what imageomics is and how it’s significant so that it can be accessible to all.

“It’s not just an opportunity to advance science, but also to engage people in science,” she explains. Her team is made up of researchers and students who share the goal of building a community around imageomics. Direct community engagement, outreach events, and conferences are effective ways to inform people about imageomics and how people can change the way traits are seen.

“We have incredible privilege to do science. To spend time answering scientific questions that are interesting to us while the public is paying us to do so. It’s important to tell the science to the public, communicate why, and what science brings to the world.”

Get Involved

New community-building activities facilitated by the Midwest Big Data Innovation Hub are continuing throughout 2022. Contact the Hub if you’re interested in participating, or are aware of other people or projects we should profile here. The MBDH has a variety of ways to get involved with our community and activities.

The Midwest Big Data Innovation Hub is an NSF-funded partnership of the University of Illinois at Urbana-Champaign, Iowa State University, Indiana University, the University of Michigan, the University of Minnesota, and the University of North Dakota, and is focused on developing collaborations in the 12-state Midwest region. Learn more about the national NSF Big Data Hubs community.

How Do Scientists Help AI Cope with a Messy Physical World?

By Qining Wang

When we see a stop sign at an intersection, we won’t mistake it for a yield sign. Our eyes recognize the white “STOP” letters printed on the red octagon. It doesn’t matter if the sign is under sunlight or streetlight. It doesn’t matter if a tree branch gets in the way or someone puts graffiti and stickers on the sign. In other words, our eyes can perceive objects under different physical conditions.

A stop sign. Photo by Anwaar Ali via Unsplash.

However, identifying road signs accurately is a very different, and often more difficult, task for artificial intelligence (AI). Even though, according to Alan Turing, AIs are systems that can “think like humans,” they can still fall short of mimicking the human mind, depending on how they acquire their intelligence.

One of the potential hurdles is correctly interpreting variations in the physical environment. An input that exploits this limitation and leads the AI to a wrong answer is commonly referred to as an “adversarial example.”

What Are Adversarial Examples?

Currently, the most common method to train an AI application is machine learning, a type of AI process that helps AI systems learn and improve from experience. Machine learning is like the driving class an AI needs to take before it can hit the road. Yet machine-learning-trained AIs are not immune to adversarial examples.

Circling back to reading the stop sign, an adversarial example could be the stop sign turning into a slightly darker shade of red at night. The machine-learning model captures these tiny color differences that human eyes cannot discern and might interpret the signs as something else. Another adversarial example could be a spam detector that fails to filter a spam email formatted like a normal email.
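The darker-red stop sign can be sketched with a toy classifier. This is only an illustration under stated assumptions: the weights, the bias, and the colour values below are invented, and a real sign detector would be a deep network working on whole images rather than a single linear score on one average colour. The sketch shows how a change of at most 0.03 per colour channel, far too small to notice by eye, can flip the predicted label.

```python
# Toy linear "sign classifier": a score above 0 means "stop", otherwise "yield".
# WEIGHTS and BIAS are hypothetical values chosen for illustration only.
WEIGHTS = [0.9, -0.4, -0.5]   # weights for the (red, green, blue) channels
BIAS = -0.25


def score(pixel):
    """Weighted sum of the colour channels plus the bias."""
    return sum(w * x for w, x in zip(WEIGHTS, pixel)) + BIAS


def classify(pixel):
    return "stop" if score(pixel) > 0 else "yield"


def perturb(pixel, epsilon):
    """Shift every channel by at most epsilon in the direction that lowers the score.

    For a linear model the gradient of the score with respect to the input is
    just WEIGHTS, so stepping against the sign of each weight lowers the score
    as quickly as possible for a given per-channel budget.
    """
    shifted = [x - epsilon * (1 if w > 0 else -1) for w, x in zip(WEIGHTS, pixel)]
    return [min(1.0, max(0.0, x)) for x in shifted]   # keep values in [0, 1]


clean = [0.60, 0.30, 0.30]            # made-up average colour of a stop sign
adversarial = perturb(clean, 0.03)    # slightly less red, slightly more green/blue

print(classify(clean), classify(adversarial))
```

With these made-up numbers the clean colour is labelled "stop" and the perturbed colour is labelled "yield", even though no channel moved by more than 0.03.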

Just as individual human minds can be unpredictable, it is difficult to pinpoint exactly how and why a machine-learning model makes a certain prediction. Nor is it a simple task to develop a machine-learning model that comprehends the messiness of the physical world. To improve the safety of self-driving cars and the quality of spam filters, data scientists are continuously tackling the vulnerabilities in the machine-learning processes that help AI applications “see” and “read” better.

What Are Humans Doing to Correct AI’s Mistakes?

To defend against adversarial examples, the most straightforward mechanism is to let machine-learning models analyze existing adversarial examples. For example, to help the AI of a self-driving car to recognize stop signs under different physical circumstances, we could expose the machine-learning model that controls the AI to pictures of stop signs under different lightings or at various distances and angles.
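The idea of exposing a model to the same sign under different lighting can be sketched as follows. This is a minimal illustration, not how a production system is trained: the colour values are invented, the "model" is a simple nearest-neighbour lookup, and `with_lighting_variants` is a hypothetical helper that only simulates dimmer lighting by scaling brightness.

```python
import math


def nearest_label(training_set, features):
    """1-nearest-neighbour: return the label of the closest training example."""
    closest = min(training_set, key=lambda example: math.dist(example[0], features))
    return closest[1]


def with_lighting_variants(training_set, factors=(0.5, 0.75, 1.0)):
    """Return one copy of every example per brightness factor (1.0 = unchanged)."""
    return [
        ([channel * factor for channel in features], label)
        for features, label in training_set
        for factor in factors
    ]


# Made-up average (red, green, blue) colours of two signs in daylight.
clean_only = [([0.8, 0.1, 0.1], "stop"), ([0.5, 0.3, 0.1], "yield")]
augmented = with_lighting_variants(clean_only)

dim_stop_sign = [0.42, 0.06, 0.05]    # a stop sign seen in poor light

before = nearest_label(clean_only, dim_stop_sign)
after = nearest_label(augmented, dim_stop_sign)
print(before, after)
```

With these numbers the model trained only on daylight examples labels the dim sign "yield", while the model that has also seen dimmed copies labels it "stop". Real pipelines vary many more conditions, such as angle, distance, and occlusion.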

Google’s reCAPTCHA service is an example of such a defense. As an online safety measure, users need to click on images of traffic lights or road signs from a selection of pictures to prove that they are humans. What users might not be aware of is that they are also teaching the machine-learning model what different objects look like under different circumstances at the same time.

Alternatively, data scientists can improve an AI by training it on simulated adversarial examples during the machine-learning process. One way is to implement a Generative Adversarial Network (GAN).

GANs consist of two components: a generator and a discriminator. The generator “translates” a “real” input image from the training set (clean example) into an almost indistinguishable “fake” output image (adversarial example) by introducing random variations to the image. This “fake” image is then fed to the discriminator, where the discriminator tries to tell the modified and unmodified images apart.

The generator and the discriminator are inherently in competition: The generator strives to “fool” the discriminator, while the discriminator attempts to see through all its tricks. This cycle of fooling and being fooled repeats. Both become better at their own designated tasks over time. The cycle continues until the generator outcompetes the discriminator, creating adversarial examples that are indistinguishable to the discriminator. In the end, the generator is kept to defend against different types of real-life adversarial attacks.
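One round of this competition can be sketched on plain numbers instead of images. The sketch makes several simplifying assumptions that are not in the text above: the generator follows the common noise-input formulation (it adds a learned offset to random noise) rather than translating a real image, the discriminator is a one-variable logistic model, and the class names, learning rate, and data are all invented for illustration.

```python
import math
import random


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


class Discriminator:
    """Logistic model: estimated probability that a number is 'real'."""

    def __init__(self):
        self.w, self.b = 0.0, 0.0

    def prob_real(self, x):
        return sigmoid(self.w * x + self.b)

    def loss(self, real, fake):
        # Binary cross-entropy: real samples are labelled 1, fake samples 0.
        terms = [-math.log(self.prob_real(x)) for x in real]
        terms += [-math.log(1.0 - self.prob_real(x)) for x in fake]
        return sum(terms) / len(terms)

    def step(self, real, fake, lr):
        # Gradient of the loss above with respect to w and b.
        grads = [(self.prob_real(x) - 1.0, x) for x in real]
        grads += [(self.prob_real(x), x) for x in fake]
        self.w -= lr * sum(g * x for g, x in grads) / len(grads)
        self.b -= lr * sum(g for g, _ in grads) / len(grads)


class Generator:
    """Turns noise into a sample by adding a learned offset."""

    def __init__(self):
        self.offset = 0.0

    def sample(self, noise):
        return [self.offset + z for z in noise]

    def loss(self, disc, noise):
        # Low when the discriminator believes the generated samples are real.
        fake = self.sample(noise)
        return sum(-math.log(disc.prob_real(x)) for x in fake) / len(fake)

    def step(self, disc, noise, lr):
        fake = self.sample(noise)
        grad = sum((disc.prob_real(x) - 1.0) * disc.w for x in fake) / len(fake)
        self.offset -= lr * grad


rng = random.Random(0)
disc, gen = Discriminator(), Generator()
real = [rng.gauss(4.0, 0.5) for _ in range(64)]    # stand-in for clean examples
noise = [rng.gauss(0.0, 0.5) for _ in range(64)]

# One round: the discriminator improves first, then the generator responds.
d_before = disc.loss(real, gen.sample(noise))
disc.step(real, gen.sample(noise), lr=0.05)
d_after = disc.loss(real, gen.sample(noise))

g_before = gen.loss(disc, noise)
gen.step(disc, noise, lr=0.05)
g_after = gen.loss(disc, noise)
```

In this single round the discriminator gets slightly better at separating the two groups, and the generator then nudges its output toward what the discriminator considers real. Actual GAN training repeats such rounds many times on neural networks, and it is known to be unstable, so this sketch makes no claim about how the repeated cycle ends.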

AI Risks and Responses

GANs can be valuable tools to tackle adversarial examples in machine learning, but they can also serve malicious purposes. For instance, one other common application of GANs is face generation. This so-called “deepfake” makes it virtually impossible for humans to tell a real face from a GAN-generated face. Deepfakes could result in devastating consequences, such as corporate scams, social media manipulation, identity theft, or disinformation attacks, to name a few.

This shows how, as our physical lives become more and more entangled with our digital presence, we can never neglect the other side of the coin while enjoying the benefits brought to us by technological breakthroughs. Understanding both would serve as a starting point for practicing responsible AI principles and creating policies that enforce data ethics.

Tackling vulnerabilities in machine learning matters, and so does protecting ourselves and the community from the damage that those technologies could cause.

Learn More and Get Involved

Curious whether you can tell a real human face from a GAN-generated face? Check out this website. And keep an eye out for the Smart & Resilient Communities priority area of MBDH, if you wish to learn more about how data scientists use novel data science research to benefit communities in the Midwest. There are also several NSF-funded AI Institutes in the Midwest that are engaged in related research and education.

Contact the Midwest Big Data Innovation Hub if you’re aware of other people or projects we should profile here, or to participate in any of our community-led Priority Areas. The MBDH has a variety of ways to get involved with our community and activities.