Archive for Programming – Page 4

Generative AI: 5 essential reads about the new era of creativity, job anxiety, misinformation, bias and plagiarism

By Eric Smalley, The Conversation 

The light and dark sides of AI have been in the public spotlight for many years. Think facial recognition, algorithms making loan and sentencing recommendations, and medical image analysis. But the impressive – and sometimes scary – capabilities of ChatGPT, DALL-E 2 and other conversational and image-conjuring artificial intelligence programs feel like a turning point.

The key change has been the emergence within the last year of powerful generative AI, software that not only learns from vast amounts of data but also produces things – convincingly written documents, engaging conversation, photorealistic images and clones of celebrity voices.

Generative AI has been around for nearly a decade, as long-standing worries about deepfake videos can attest. Now, though, the AI models have become so large and have digested such vast swaths of the internet that people have become unsure of what AI means for the future of knowledge work, the nature of creativity and the origins and truthfulness of content on the internet.

Here are five articles from our archives that take the measure of this new generation of artificial intelligence.

1. Generative AI and work

A panel of five AI experts discussed the implications of generative AI for artists and knowledge workers. It’s not simply a matter of whether the technology will replace you or make you more productive.

University of Tennessee computer scientist Lynne Parker wrote that while there are significant benefits to generative AI, like making creativity and knowledge work more accessible, the new tools also have downsides. Specifically, they could lead to an erosion of skills like writing, and they raise issues of intellectual property protections given that the models are trained on human creations.

University of Colorado Boulder computer scientist Daniel Acuña has found the tools to be useful in his own creative endeavors but is concerned about inaccuracy, bias and plagiarism.

University of Michigan computer scientist Kentaro Toyama wrote that human skill is likely to become costly and extraneous in some fields. “If history is any guide, it’s almost certain that advances in AI will cause more jobs to vanish, that creative-class people with human-only skills will become richer but fewer in number, and that those who own creative technology will become the new mega-rich.”

Florida International University computer scientist Mark Finlayson wrote that some jobs are likely to disappear, but that new skills in working with these AI tools are likely to become valued. By analogy, he noted that the rise of word processing software largely eliminated the need for typists but allowed nearly anyone with access to a computer to produce typeset documents and led to a new class of skills to list on a resume.

University of Colorado Anschutz biomedical informatics researcher Casey Greene wrote that just as Google led people to develop skills in finding information on the internet, AI language models will lead people to develop skills to get the best output from the tools. “As with many technological advances, how people interact with the world will change in the era of widely accessible AI models. The question is whether society will use this moment to advance equity or exacerbate disparities.”

2. Conjuring images from words

Generative AI can seem like magic. It’s hard to imagine how image-generating AIs can take a few words of text and produce an image that matches the words.

Hany Farid, a University of California, Berkeley computer scientist who specializes in image forensics, explained the process. The software is trained on a massive set of images, each of which includes a short text description.

“The model progressively corrupts each image until only visual noise remains, and then trains a neural network to reverse this corruption. Repeating this process hundreds of millions of times, the model learns how to convert pure noise into a coherent image from any caption,” he wrote.
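
To see the shape of the training loop Farid describes, here is a minimal, hypothetical Python sketch using PyTorch. The tiny network, random "images" and simple noise schedule are placeholders, and the conditioning on a text caption that real systems use is omitted; it illustrates the idea, not the code behind DALL-E 2 or any other product.

```python
# Minimal sketch of diffusion-style training: corrupt images with noise,
# then train a network to predict (and thus reverse) the corruption.
# Everything here is a placeholder for illustration only.
import torch
import torch.nn as nn

class TinyDenoiser(nn.Module):
    """Stand-in for the large U-Net used by real diffusion models."""
    def __init__(self, channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, channels, 3, padding=1),
        )

    def forward(self, noisy_images):
        return self.net(noisy_images)  # predict the noise that was added

model = TinyDenoiser()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

for step in range(1_000):                     # real systems train for vastly more steps
    images = torch.rand(8, 3, 64, 64)         # placeholder for real training images
    noise_level = torch.rand(8, 1, 1, 1)      # how strongly to corrupt each image
    noise = torch.randn_like(images)
    noisy = (1 - noise_level) * images + noise_level * noise  # progressively corrupted image
    predicted_noise = model(noisy)
    loss = nn.functional.mse_loss(predicted_noise, noise)     # learn to undo the corruption
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```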

3. Marking the machine

Many of the images produced by generative AI are difficult to distinguish from photographs, and AI-generated video is rapidly improving. This raises the stakes for combating fraud and misinformation. Fake videos of corporate executives could be used to manipulate stock prices, and fake videos of political leaders could be used to spread dangerous misinformation.

Farid explained how it’s possible to produce AI-generated photos and video that contain watermarks verifying that they are synthetic. The trick is to produce digital watermarks that can’t be altered or removed. “These watermarks can be baked into the generative AI systems by watermarking all the training data, after which the generated content will contain the same watermark,” he wrote.

4. Flood of ideas

For all the legitimate concern about the downsides of generative AI, the tools are proving to be useful for some artists, designers and writers. People in creative fields can use the image generators to quickly sketch out ideas, including unexpected off-the-wall material.

AI as an idea generator for designers.

Rochester Institute of Technology industrial designer and professor Juan Noguera and his students use tools like DALL-E or Midjourney to produce thousands of images from abstract ideas – a sort of sketchbook on steroids.

“Enter any sentence – no matter how crazy – and you’ll receive a set of unique images generated just for you. Want to design a teapot? Here, have 1,000 of them,” he wrote. “While only a small subset of them may be usable as a teapot, they provide a seed of inspiration that the designer can nurture and refine into a finished product.”

5. Shortchanging the creative process

However, using AI to produce finished artworks is another matter, according to Nir Eisikovits and Alec Stubbs, philosophers at the Applied Ethics Center at University of Massachusetts Boston. They note that the process of making art is more than just coming up with ideas.

The hands-on process of producing something, iterating the process and making refinements – often in the moment in response to audience reactions – are indispensable aspects of creating art, they wrote.

“It is the work of making something real and working through its details that carries value, not simply that moment of imagining it,” they wrote. “Artistic works are lauded not merely for the finished product, but for the struggle, the playful interaction and the skillful engagement with the artistic task, all of which carry the artist from the moment of inception to the end result.”

Editor’s note: This story is a roundup of articles from The Conversation’s archives.

About the Author:

Eric Smalley, Science + Technology Editor, The Conversation

This article is republished from The Conversation under a Creative Commons license. Read the original article.

 

What are passkeys? A cybersecurity researcher explains how you can use your phone to make passwords a thing of the past

By Sayonnha Mandal, University of Nebraska Omaha 

Passwords could soon become passé.

Effective passwords are cumbersome, all the more so when reinforced by two-factor authentication. But the need for authentication and secure access to websites is as great as ever. Enter passkeys.

Passkeys are digital credentials stored on your phone or computer. They are analogous to physical keys. You access your passkey by signing in to your device using a personal identification number (PIN), swipe pattern or biometrics like fingerprint or face recognition. You set your online accounts to trust your phone or computer. To break into your accounts, a hacker would need to physically possess your device and have the means to sign in to it.

As a cybersecurity researcher, I believe that passkeys not only provide faster, easier and more secure sign-ins, they minimize human error in password security and authorization steps. You don’t need to remember passwords for every account and don’t need to use two-factor authentication.

How passkeys work

Passkeys are generated via public-key cryptography. They use a public-private key pair to ensure a mathematically protected private relationship between users’ devices and the online accounts being accessed. It would be nearly impossible for a hacker to guess the passkey – hence the need to physically possess the device the passkey is accessed from.

Passkeys consist of a long private key – a long string of encrypted characters – created for a specific device. Websites cannot access the value of the passkey. Rather, the passkey verifies that a website possesses the corresponding public key. You can use the passkey from one device to access a website using another device. For example, you can use your laptop to access a website using the passkey on your phone by authorizing the login from your phone. And if you lose your phone, the passkey can be stored securely in the cloud with the phone’s other data, which can be restored to a new phone.
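
To make the public-private key relationship concrete, here is a minimal sketch using the Python cryptography package. It compresses the idea into a single challenge-response exchange; real passkeys follow the FIDO2/WebAuthn standards, and the variable names and flow here are simplifying assumptions.

```python
# Sketch of the passkey idea: the device keeps the private key, the website
# stores only the public key and checks a signature over a random challenge.
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.exceptions import InvalidSignature

# Enrollment: the device generates a key pair and shares only the public key.
device_private_key = ec.generate_private_key(ec.SECP256R1())
website_public_key = device_private_key.public_key()

# Sign-in: the website sends a random challenge; the device signs it after the
# user unlocks the device with a PIN, swipe pattern or biometric.
challenge = os.urandom(32)
signature = device_private_key.sign(challenge, ec.ECDSA(hashes.SHA256()))

# The website verifies the signature against the stored public key.
try:
    website_public_key.verify(signature, challenge, ec.ECDSA(hashes.SHA256()))
    print("Login approved: the device proved it holds the private key.")
except InvalidSignature:
    print("Login rejected.")
```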

Passkeys explained in 76 seconds.

Why passkeys matter

Passwords can be guessed, phished or otherwise stolen. Security experts advise users to make their passwords longer, mixing letters, numbers and special symbols. A good password should not be a dictionary word or common phrase, should avoid consecutive letters or numbers, and yet should still be memorable. Users should not share passwords with anyone and, last but not least, should change them at least every six months for all devices and accounts. Using a password manager to remember and update strong passwords helps, but it can still be a nuisance.

Even if you follow all of the best practices to keep your passwords safe, there is no guarantee of airtight security. Hackers are continuously developing and using software exploits, hardware tools and ever-advancing algorithms to break these defenses. Cybersecurity experts and malicious hackers are locked in an arms race.

Passkeys remove the onus from the user to create, remember and guard all their passwords. Apple, Google and Microsoft are supporting passkeys and encourage users to use them instead of passwords. As a result, passkeys are likely to soon overtake passwords and password managers in the cybersecurity battlefield.

However, it will take time for websites to add support for passkeys, so passwords aren’t going to go extinct overnight. IT managers still recommend that people use a password manager like 1Password or Bitwarden. And even Apple, which is encouraging the adoption of passkeys, has its own password manager.

About the Author:

Sayonnha Mandal, Lecturer in Interdisciplinary Informatics, University of Nebraska Omaha

This article is republished from The Conversation under a Creative Commons license. Read the original article.

Don’t bet with ChatGPT – study shows language AIs often make irrational decisions

By Mayank Kejriwal, University of Southern California 

The past few years have seen an explosion of progress in large language model artificial intelligence systems that can do things like write poetry, conduct humanlike conversations and pass medical school exams. This progress has yielded models like ChatGPT that could have major social and economic ramifications ranging from job displacements and increased misinformation to massive productivity boosts.

Despite their impressive abilities, large language models don’t actually think. They tend to make elementary mistakes and even make things up. However, because they generate fluent language, people tend to respond to them as though they do think. This has led researchers to study the models’ “cognitive” abilities and biases, work that has grown in importance now that large language models are widely accessible.

This line of research dates back to early large language models such as Google’s BERT, which is integrated into the company’s search engine; the field has accordingly been dubbed BERTology. This research has already revealed a lot about what such models can do and where they go wrong.

For instance, cleverly designed experiments have shown that many language models have trouble dealing with negation – for example, a question phrased as “what is not” – and doing simple calculations. They can be overly confident in their answers, even when wrong. Like other modern machine learning algorithms, they have trouble explaining themselves when asked why they answered a certain way.

People make irrational decisions, too, but humans have emotions and cognitive shortcuts as excuses.

Words and thoughts

Inspired by the growing body of research in BERTology and related fields like cognitive science, my student Zhisheng Tang and I set out to answer a seemingly simple question about large language models: Are they rational?

Although the word rational is often used as a synonym for sane or reasonable in everyday English, it has a specific meaning in the field of decision-making. A decision-making system – whether an individual human or a complex entity like an organization – is rational if, given a set of choices, it chooses to maximize expected gain.

The qualifier “expected” is important because it indicates that decisions are made under conditions of significant uncertainty. If I toss a fair coin, I know that it will come up heads half of the time on average. However, I can’t make a prediction about the outcome of any given coin toss. This is why casinos are able to afford the occasional big payout: Even narrow house odds yield enormous profits on average.
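
As a concrete illustration of expected gain, here is a short Python sketch of a $1 bet on red in American roulette (18 red, 18 black and 2 green pockets out of 38). The per-bet edge is small, which is exactly how narrow house odds turn into large casino profits over millions of bets.

```python
# Expected gain: weight each payoff by its probability and sum.
def expected_gain(outcomes):
    """outcomes: iterable of (probability, payoff) pairs."""
    return sum(p * payoff for p, payoff in outcomes)

# $1 bet on red in American roulette: win $1 with probability 18/38, lose $1 otherwise.
bet_on_red = [(18 / 38, +1.0), (20 / 38, -1.0)]

per_spin = expected_gain(bet_on_red)
print(f"Expected gain per $1 bet: ${per_spin:.4f}")                        # about -$0.0526
print(f"House's expected take over 10 million such bets: ${-per_spin * 10_000_000:,.0f}")
```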

On the surface, it seems odd to assume that a model designed to make accurate predictions about words and sentences without actually understanding their meanings can understand expected gain. But there is an enormous body of research showing that language and cognition are intertwined. An excellent example is seminal research done by scientists Edward Sapir and Benjamin Lee Whorf in the early 20th century. Their work suggested that one’s native language and vocabulary can shape the way a person thinks.

The extent to which this is true is controversial, but there is supporting anthropological evidence from the study of Native American cultures. For instance, speakers of Zuñi, a language of the Zuñi people of the American Southwest that does not have separate words for orange and yellow, are not able to distinguish between these colors as effectively as speakers of languages that do have separate words for them.

Making a bet

So are language models rational? Can they understand expected gain? We conducted a detailed set of experiments to show that, in their original form, models like BERT behave randomly when presented with betlike choices. This is the case even when we give the models a trick question like: If you toss a coin and it comes up heads, you win a diamond; if it comes up tails, you lose a car. Which would you take? The correct answer is heads, but the AI models chose tails about half the time.

ChatGPT is not clear on the concept of gains and losses.
ChatGPT dialogue by Mayank Kejriwal, CC BY-ND

Intriguingly, we found that the model can be taught to make relatively rational decisions using only a small set of example questions and answers. At first blush, this would seem to suggest that the models can indeed do more than just “play” with language. Further experiments, however, showed that the situation is actually much more complex. For instance, when we used cards or dice instead of coins to frame our bet questions, we found that performance dropped significantly, by over 25%, although it stayed above random selection.

So the idea that the model can be taught general principles of rational decision-making remains unresolved, at best. More recent case studies that we conducted using ChatGPT confirm that decision-making remains a nontrivial and unsolved problem even for much bigger and more advanced large language models.

Getting the decision right

This line of study is important because rational decision-making under conditions of uncertainty is critical to building systems that understand costs and benefits. By balancing expected costs and benefits, an intelligent system might have been able to do better than humans at planning around the supply chain disruptions the world experienced during the COVID-19 pandemic, managing inventory or serving as a financial adviser.

Our work ultimately shows that if large language models are used for these kinds of purposes, humans need to guide, review and edit their work. And until researchers figure out how to endow large language models with a general sense of rationality, the models should be treated with caution, especially in applications requiring high-stakes decision-making.

About the Author:

Mayank Kejriwal, Research Assistant Professor of Industrial & Systems Engineering, University of Southern California

This article is republished from The Conversation under a Creative Commons license. Read the original article.

Regulating AI: 3 experts explain why it’s difficult to do and important to get right

By S. Shyam Sundar, Penn State; Cason Schmit, Texas A&M University, and John Villasenor, University of California, Los Angeles 

From fake photos of Donald Trump being arrested by New York City police officers to a chatbot describing a very-much-alive computer scientist as having died tragically, the ability of the new generation of generative artificial intelligence systems to create convincing but fictional text and images is setting off alarms about fraud and misinformation on steroids. Indeed, a group of artificial intelligence researchers and industry figures urged the industry on March 29, 2023, to pause further training of the latest AI technologies or, barring that, for governments to “impose a moratorium.”

These technologies – image generators like DALL-E, Midjourney and Stable Diffusion, and text generators like Bard, ChatGPT, Chinchilla and LLaMA – are now available to millions of people and don’t require technical knowledge to use.

Given the potential for widespread harm as technology companies roll out these AI systems and test them on the public, policymakers are faced with the task of determining whether and how to regulate the emerging technology. The Conversation asked three experts on technology policy to explain why regulating AI is such a challenge – and why it’s so important to get it right.

To jump ahead to each response, here’s a list:


Human foibles and a moving target
Combining “soft” and “hard” approaches
Four key questions to ask


 

Human foibles and a moving target

S. Shyam Sundar, Professor of Media Effects & Director, Center for Socially Responsible AI, Penn State

The reason to regulate AI is not because the technology is out of control, but because human imagination is out of proportion. Gushing media coverage has fueled irrational beliefs about AI’s abilities and consciousness. Such beliefs build on “automation bias” or the tendency to let your guard down when machines are performing a task. An example is reduced vigilance among pilots when their aircraft is flying on autopilot.

Numerous studies in my lab have shown that when a machine, rather than a human, is identified as a source of interaction, it triggers a mental shortcut in the minds of users that we call a “machine heuristic.” This shortcut is the belief that machines are accurate, objective, unbiased, infallible and so on. It clouds the user’s judgment and results in the user overly trusting machines. However, simply disabusing people of AI’s infallibility is not sufficient, because humans are known to unconsciously assume competence even when the technology doesn’t warrant it.

Research has also shown that people treat computers as social beings when the machines show even the slightest hint of humanness, such as the use of conversational language. In these cases, people apply social rules of human interaction, such as politeness and reciprocity. So, when computers seem sentient, people tend to trust them, blindly. Regulation is needed to ensure that AI products deserve this trust and don’t exploit it.

AI poses a unique challenge because, unlike in traditional engineering systems, designers cannot be sure how AI systems will behave. When a traditional automobile was shipped out of the factory, engineers knew exactly how it would function. But with self-driving cars, the engineers can never be sure how they will perform in novel situations.

Lately, thousands of people around the world have been marveling at what large generative AI models like GPT-4 and DALL-E 2 produce in response to their prompts. None of the engineers involved in developing these AI models could tell you exactly what the models will produce. To complicate matters, such models change and evolve with more and more interaction.

All this means there is plenty of potential for misfires. Therefore, a lot depends on how AI systems are deployed and what provisions for recourse are in place when human sensibilities or welfare are hurt. AI is more of an infrastructure, like a freeway. You can design it to shape human behaviors in the collective, but you will need mechanisms for tackling abuses, such as speeding, and unpredictable occurrences, like accidents.

AI developers will also need to be inordinately creative in envisioning ways that the system might behave and try to anticipate potential violations of social standards and responsibilities. This means there is a need for regulatory or governance frameworks that rely on periodic audits and policing of AI’s outcomes and products, though I believe that these frameworks should also recognize that the systems’ designers cannot always be held accountable for mishaps.

Artificial intelligence researcher Joanna Bryson describes how professional organizations can play a role in regulating AI.

 

Combining ‘soft’ and ‘hard’ approaches

Cason Schmit, Assistant Professor of Public Health, Texas A&M University

Regulating AI is tricky. To regulate AI well, you must first define AI and understand anticipated AI risks and benefits.

Legally defining AI is important to identify what is subject to the law. But AI technologies are still evolving, so it is hard to pin down a stable legal definition.

Understanding the risks and benefits of AI is also important. Good regulations should maximize public benefits while minimizing risks. However, AI applications are still emerging, so it is difficult to know or predict what future risks or benefits might be. These kinds of unknowns make emerging technologies like AI extremely difficult to regulate with traditional laws and regulations.

Lawmakers are often too slow to adapt to the rapidly changing technological environment. Some new laws are obsolete by the time they are enacted or even introduced. Without new laws, regulators have to use old laws to address new problems. Sometimes this leads to legal barriers for social benefits or legal loopholes for harmful conduct.

“Soft laws” are the alternative to traditional “hard law” approaches of legislation intended to prevent specific violations. In the soft law approach, a private organization sets rules or standards for industry members. These can change more rapidly than traditional lawmaking. This makes soft laws promising for emerging technologies because they can adapt quickly to new applications and risks. However, soft laws can mean soft enforcement.

Megan Doerr, Jennifer Wagner and I propose a third way: Copyleft AI with Trusted Enforcement (CAITE). This approach combines two very different concepts in intellectual property — copyleft licensing and patent trolls.

Copyleft licensing allows for content to be used, reused or modified easily under the terms of a license – for example, open-source software. The CAITE model uses copyleft licenses to require AI users to follow specific ethical guidelines, such as transparent assessments of the impact of bias.

In our model, these licenses also transfer the legal right to enforce license violations to a trusted third party. This creates an enforcement entity that exists solely to enforce ethical AI standards and can be funded in part by fines from unethical conduct. This entity is like a patent troll in that it is private rather than governmental and it supports itself by enforcing the legal intellectual property rights that it collects from others. In this case, rather than enforcement for profit, the entity enforces the ethical guidelines defined in the licenses – a “troll for good.”

This model is flexible and adaptable to meet the needs of a changing AI environment. It also enables substantial enforcement options like a traditional government regulator. In this way, it combines the best elements of hard and soft law approaches to meet the unique challenges of AI.

Though generative AI has been grabbing headlines of late, other types of AI have been posing challenges for regulators for years, particularly in the area of data privacy.

 

Four key questions to ask

John Villasenor, Professor of Electrical Engineering, Law, Public Policy, and Management, University of California, Los Angeles

The extraordinary recent advances in large language model-based generative AI are spurring calls to create new AI-specific regulation. Here are four key questions to ask as that dialogue progresses:

1) Is new AI-specific regulation necessary? Many of the potentially problematic outcomes from AI systems are already addressed by existing frameworks. If an AI algorithm used by a bank to evaluate loan applications leads to racially discriminatory loan decisions, that would violate the Fair Housing Act. If the AI software in a driverless car causes an accident, products liability law provides a framework for pursuing remedies.

2) What are the risks of regulating a rapidly changing technology based on a snapshot in time? A classic example of this is the Stored Communications Act, which was enacted in 1986 to address then-novel digital communication technologies like email. In enacting the SCA, Congress provided substantially less privacy protection for emails more than 180 days old.

The logic was that limited storage space meant that people were constantly cleaning out their inboxes by deleting older messages to make room for new ones. As a result, messages stored for more than 180 days were deemed less important from a privacy standpoint. It’s not clear that this logic ever made sense, and it certainly doesn’t make sense in the 2020s, when the majority of our emails and other stored digital communications are older than six months.

A common rejoinder to concerns about regulating technology based on a single snapshot in time is this: If a law or regulation becomes outdated, update it. But this is easier said than done. Most people agree that the SCA became outdated decades ago. But because Congress hasn’t been able to agree on specifically how to revise the 180-day provision, it’s still on the books over a third of a century after its enactment.

3) What are the potential unintended consequences? The Allow States and Victims to Fight Online Sex Trafficking Act of 2017 was a law passed in 2018 that revised Section 230 of the Communications Decency Act with the goal of combating sex trafficking. While there’s little evidence that it has reduced sex trafficking, it has had a hugely problematic impact on a different group of people: sex workers who used to rely on the websites knocked offline by FOSTA-SESTA to exchange information about dangerous clients. This example shows the importance of taking a broad look at the potential effects of proposed regulations.

4) What are the economic and geopolitical implications? If regulators in the United States act to intentionally slow the progress in AI, that will simply push investment and innovation — and the resulting job creation — elsewhere. While emerging AI raises many concerns, it also promises to bring enormous benefits in areas including education, medicine, manufacturing, transportation safety, agriculture, weather forecasting, access to legal services and more.

I believe AI regulations drafted with the above four questions in mind will be more likely to successfully address the potential harms of AI while also ensuring access to its benefits.

About the Author:

S. Shyam Sundar, James P. Jimirro Professor of Media Effects, Co-Director, Media Effects Research Laboratory, & Director, Center for Socially Responsible AI, Penn State; Cason Schmit, Assistant Professor of Public Health, Texas A&M University, and John Villasenor, Professor of Electrical Engineering, Law, Public Policy, and Management, University of California, Los Angeles

This article is republished from The Conversation under a Creative Commons license. Read the original article.

 

Scientists are using machine learning to forecast bird migration and identify birds in flight by their calls

By Miguel Jimenez, Colorado State University

With chatbots like ChatGPT making a splash, machine learning is playing an increasingly prominent role in our lives. For many of us, it’s been a mixed bag. We rejoice when our Spotify For You playlist finds us a new jam, but groan as we scroll through a slew of targeted ads on our Instagram feeds.

Machine learning is also changing many fields that may seem surprising. One example is my discipline, ornithology – the study of birds. It isn’t just solving some of the biggest challenges associated with studying bird migration; more broadly, machine learning is expanding the ways in which people engage with birds. As spring migration picks up, here’s a look at how machine learning is influencing ways to research birds and, ultimately, to protect them.

Sandhill cranes flying above the Platte River in Nebraska.
shannonpatrick17/Flickr, CC BY

The challenge of conserving migratory birds

Most birds in the Western Hemisphere migrate twice a year, flying over entire continents between their breeding and nonbreeding grounds. While these journeys are awe-inspiring, they expose birds to many hazards en route, including extreme weather, food shortages and light pollution that can attract birds and cause them to collide with buildings.

Our ability to protect migratory birds is only as good as the science that tells us where they go. And that science has come a long way.

People in Alaska, Washington state and Mexico explain what migratory birds mean to them.

In 1920, the U.S. Geological Survey launched the Bird Banding Laboratory, spearheading an effort to put bands with unique markers on birds, then recapture the birds in new places to figure out where they traveled. Today researchers can deploy a variety of lightweight tracking tags on birds to discover their migration routes. These tools have uncovered the spatial patterns of where and when birds of many species migrate.

However, tracking birds has limitations. For one thing, over 4 billion birds migrate across the continent every year. Even with increasingly affordable equipment, the number of birds that we track is a drop in the bucket. And even within a species, migratory behavior may vary across sexes or populations.

Further, tracking data tells us where birds have been, but it doesn’t necessarily tell us where they’re going. Migration is dynamic, and the climates and landscapes that birds fly through are constantly changing. That means it’s crucial to be able to predict their movements.

Using machine learning to forecast migration

This is where machine learning comes in. Machine learning is a subfield of artificial intelligence that gives computers the ability to learn tasks or associations without explicitly being programmed. We use it to train algorithms that tackle various tasks, from forecasting weather to predicting March Madness upsets.

But applying machine learning requires data – and the more data the better. Luckily, scientists have inadvertently compiled decades of data on migrating birds through the Next Generation Weather Radar system. This network, known as NEXRAD, is used to measure weather dynamics and help predict future weather events, but it also picks up signals from birds as they fly through the atmosphere.

A NEXRAD radar at an operation center in Norman, Okla.
Andrew J. Oldaker/Wikipedia, CC BY-SA

BirdCast is a collaborative project of Colorado State University, the Cornell Lab of Ornithology and the University of Massachusetts that seeks to leverage that data to quantify bird migration. Machine learning is central to its operations. Researchers have known since the 1940s that birds show up on weather radar, but to make that data useful, we need to remove nonavian clutter and identify which scans contain bird movement.

This process would be painstaking by hand – but by training algorithms to identify bird activity, we have automated this process and unlocked decades of migration data. And machine learning allows the BirdCast team to take things further: By training an algorithm to learn what atmospheric conditions are associated with migration, we can use predicted conditions to produce forecasts of migration across the continental U.S.
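
The following is a hypothetical sketch, not BirdCast’s actual pipeline: a scikit-learn regressor that maps atmospheric conditions to radar-derived migration intensity and is then applied to predicted weather. The features and data are invented placeholders.

```python
# Hypothetical forecasting sketch: learn migration intensity from weather,
# then predict it from forecast weather. Data here is random placeholder data.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Rows are nights at a radar station; columns are illustrative features such as
# temperature, wind speed, tailwind component and day of year.
weather_features = rng.normal(size=(500, 4))
migration_intensity = rng.random(500)   # stand-in for radar-derived bird counts

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(weather_features, migration_intensity)

# Forecast step: feed in *predicted* atmospheric conditions for upcoming nights.
forecast_weather = rng.normal(size=(3, 4))
print(model.predict(forecast_weather))  # predicted migration intensity per night
```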

BirdCast began broadcasting these forecasts in 2018 and has become a popular tool in the birding community. Many users may recognize that radar data helps produce these forecasts, but fewer realize that it’s a product of machine learning.

BirdCast provides summaries of radar-based measurements of nocturnal bird migration for the continental U.S., including estimates of numbers of birds migrating and their directions, speeds and altitudes.

Currently these forecasts can’t tell us what species are in the air, but that could be changing. Last year, researchers at the Cornell Lab of Ornithology published an automated system that uses machine learning to detect and identify nocturnal flight calls. These are species-specific calls that birds make while migrating. Integrating this approach with BirdCast could give us a more complete picture of migration.

These advancements exemplify how effective machine learning can be when guided by expertise in the field where it is being applied. As a doctoral student, I joined Colorado State University’s Aeroecology Lab with a strong ornithology background but no machine learning experience. Conversely, Ali Khalighifar, a postdoctoral researcher in our lab, has a background in machine learning but has never taken an ornithology class.

Together, we are working to enhance the models that make BirdCast run, often leaning on each other’s insights to move the project forward. Our collaboration typifies the convergence that allows us to use machine learning effectively.

A tool for public engagement

Machine learning is also helping scientists engage the public in conservation. For example, forecasts produced by the BirdCast team are often used to inform Lights Out campaigns.

These initiatives seek to reduce artificial light from cities, which attracts migrating birds and increases their chances of colliding with human-built structures, such as buildings and communication towers. Lights Out campaigns can mobilize people to help protect birds at the flip of a switch.

As another example, the Merlin bird identification app seeks to create technology that makes birding easier for everyone. In 2021, the Merlin staff released a feature that automates song and call identification, allowing users to identify what they’re hearing in real time, like an ornithological version of Shazam.

This feature has opened the door for millions of people to engage with their natural spaces in a new way. Machine learning is a big part of what made it possible.

“Sound ID is our biggest success in terms of replicating the magical experience of going birding with a skilled naturalist,” Grant Van Horn, a staff researcher at the Cornell Lab of Ornithology who helped develop the algorithm behind this feature, told me.

Taking flight

Opportunities for applying machine learning in ornithology will only increase. As billions of birds migrate over North America to their breeding grounds this spring, people will engage with these flights in new ways, thanks to projects like BirdCast and Merlin. But that engagement is reciprocal: The data that birders collect will open new opportunities for applying machine learning.

Computers can’t do this work themselves. “Any successful machine learning project has a huge human component to it. That is the reason these projects are succeeding,” Van Horn said to me.

About the Author:

Miguel Jimenez, Ph.D. student in Ecology, Colorado State University

This article is republished from The Conversation under a Creative Commons license. Read the original article.

Watermarking ChatGPT, DALL-E and other generative AIs could help protect against fraud and misinformation

By Hany Farid, University of California, Berkeley 

Shortly after rumors leaked of former President Donald Trump’s impending indictment, images purporting to show his arrest appeared online. These images looked like news photos, but they were fake. They were created by a generative artificial intelligence system.

Generative AI, in the form of image generators like DALL-E, Midjourney and Stable Diffusion, and text generators like Bard, ChatGPT, Chinchilla and LLaMA, has exploded in the public sphere. By combining clever machine-learning algorithms with billions of pieces of human-generated content, these systems can do anything from create an eerily realistic image from a caption, synthesize a speech in President Joe Biden’s voice, replace one person’s likeness with another in a video, or write a coherent 800-word op-ed from a title prompt.

Even in these early days, generative AI is capable of creating highly realistic content. My colleague Sophie Nightingale and I found that the average person is unable to reliably distinguish an image of a real person from an AI-generated person. Although audio and video have not yet fully passed through the uncanny valley – images or models of people that are unsettling because they are close to but not quite realistic – they are likely to soon. When this happens, and it is all but guaranteed to, it will become increasingly easier to distort reality.

In this new world, it will be a snap to generate a video of a CEO saying her company’s profits are down 20%, which could lead to billions in market-share loss, or to generate a video of a world leader threatening military action, which could trigger a geopolitical crisis, or to insert the likeness of anyone into a sexually explicit video.

The technology to make fake videos of real people is becoming increasingly available.

Advances in generative AI will soon mean that fake but visually convincing content will proliferate online, leading to an even messier information ecosystem. A secondary consequence is that detractors will be able to easily dismiss as fake actual video evidence of everything from police violence and human rights violations to a world leader burning top-secret documents.

As society stares down the barrel of what is almost certainly just the beginning of these advances in generative AI, there are reasonable and technologically feasible interventions that can be used to help mitigate these abuses. As a computer scientist who specializes in image forensics, I believe that a key method is watermarking.

Watermarks

There is a long history of marking documents and other items to prove their authenticity, indicate ownership and counter counterfeiting. Today, Getty Images, a massive image archive, adds a visible watermark to all digital images in its catalog. This allows customers to freely browse images while protecting Getty’s assets.

Imperceptible digital watermarks are also used for digital rights management. A watermark can be added to a digital image by, for example, tweaking every 10th image pixel so that its color (typically a number in the range 0 to 255) is even-valued. Because this pixel tweaking is so minor, the watermark is imperceptible. And, because this periodic pattern is unlikely to occur naturally, and can easily be verified, it can be used to verify an image’s provenance.
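
Here is a minimal NumPy sketch of the toy scheme just described: force every 10th pixel value to be even, then check for that pattern. Real imperceptible watermarks are far more sophisticated and robust; this only illustrates the idea.

```python
# Toy watermark: make every 10th pixel value even, then verify the pattern.
import numpy as np

def embed_watermark(image: np.ndarray) -> np.ndarray:
    """Return a copy of a uint8 image (values 0-255) with every 10th value made even."""
    marked = image.copy()
    flat = marked.reshape(-1)     # view onto the copy
    flat[::10] &= 0xFE            # clear the lowest bit, forcing an even value
    return marked

def verify_watermark(image: np.ndarray) -> bool:
    """True if every 10th pixel value is even, i.e., the watermark pattern is present."""
    flat = image.reshape(-1)
    return bool(np.all(flat[::10] % 2 == 0))

image = np.random.randint(0, 256, size=(256, 256, 3), dtype=np.uint8)
watermarked = embed_watermark(image)
print(verify_watermark(image), verify_watermark(watermarked))   # almost surely False, then True
```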

Even medium-resolution images contain millions of pixels, which means that additional information can be embedded into the watermark, including a unique identifier that encodes the generating software and a unique user ID. This same type of imperceptible watermark can be applied to audio and video.

The ideal watermark is one that is imperceptible and also resilient to simple manipulations like cropping, resizing, color adjustment and converting digital formats. Although the pixel color watermark example is not resilient because the color values can be changed, many watermarking strategies have been proposed that are robust – though not impervious – to attempts to remove them.

Watermarking and AI

These watermarks can be baked into the generative AI systems by watermarking all the training data, after which the generated content will contain the same watermark. This baked-in watermark is attractive because it means that generative AI tools can be open-sourced – as the image generator Stable Diffusion is – without concerns that a watermarking process could be removed from the image generator’s software. Stable Diffusion has a watermarking function, but because it’s open source, anyone can simply remove that part of the code.

OpenAI is experimenting with a system to watermark ChatGPT’s creations. Characters in a paragraph cannot, of course, be tweaked like a pixel value, so text watermarking takes on a different form.

Text-based generative AI is based on producing the next most-reasonable word in a sentence. For example, starting with the sentence fragment “an AI system can…,” ChatGPT will predict that the next word should be “learn,” “predict” or “understand.” Associated with each of these words is a probability corresponding to the likelihood of each word appearing next in the sentence. ChatGPT learned these probabilities from the large body of text it was trained on.

Generated text can be watermarked by secretly tagging a subset of words and then biasing the selection of a word to be a synonymous tagged word. For example, the tagged word “comprehend” can be used instead of “understand.” By periodically biasing word selection in this way, a body of text is watermarked based on a particular distribution of tagged words. This approach won’t work for short tweets but is generally effective with text of 800 or more words depending on the specific watermark details.
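
Below is a hypothetical sketch of that word-tagging idea. The tagged words, synonym pairs, probabilities and bias rate are invented for illustration and are not drawn from any deployed watermarking system.

```python
# Sketch of text watermarking by biasing word choice toward secretly tagged synonyms.
import random

TAGGED_WORDS = {"comprehend", "utilize", "assist"}                    # the secret tag list
SYNONYMS = {"understand": "comprehend", "use": "utilize", "help": "assist"}

def pick_next_word(candidates, bias=0.8):
    """candidates: list of (word, probability) pairs from the language model."""
    words, probs = zip(*candidates)
    choice = random.choices(words, weights=probs, k=1)[0]
    # Periodically swap in a tagged synonym to embed the watermark.
    if choice in SYNONYMS and random.random() < bias:
        choice = SYNONYMS[choice]
    return choice

def watermark_score(text):
    """Fraction of words from the tag list; unusually high values suggest watermarked text."""
    words = text.lower().split()
    return sum(word in TAGGED_WORDS for word in words) / max(len(words), 1)

print(pick_next_word([("understand", 0.5), ("learn", 0.3), ("predict", 0.2)]))
```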

Generative AI systems can, and I believe should, watermark all their content, allowing for easier downstream identification and, if necessary, intervention. If the industry won’t do this voluntarily, lawmakers could pass regulation to enforce this rule. Unscrupulous people will, of course, not comply with these standards. But, if the major online gatekeepers – Apple and Google app stores, Amazon, Google, Microsoft cloud services and GitHub – enforce these rules by banning noncompliant software, the harm will be significantly reduced.

Signing authentic content

Tackling the problem from the other end, a similar approach could be adopted to authenticate original audiovisual recordings at the point of capture. A specialized camera app could cryptographically sign the recorded content as it’s recorded. There is no way to tamper with this signature without leaving evidence of the attempt. The signature is then stored on a centralized list of trusted signatures.
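
Here is a minimal sketch of that signing step, using an Ed25519 key from the Python cryptography package. The camera-app workflow and the registry of trusted signatures are simplified assumptions, not the C2PA specification itself.

```python
# Sketch of signing audiovisual content at the point of capture.
from cryptography.hazmat.primitives.asymmetric import ed25519
from cryptography.exceptions import InvalidSignature

camera_key = ed25519.Ed25519PrivateKey.generate()    # provisioned inside the camera app

def sign_capture(recording_bytes: bytes) -> bytes:
    """Sign the raw recording as it is captured."""
    return camera_key.sign(recording_bytes)

def verify_capture(recording_bytes: bytes, signature: bytes, public_key) -> bool:
    """Any later change to the bytes invalidates the signature."""
    try:
        public_key.verify(signature, recording_bytes)
        return True
    except InvalidSignature:
        return False

video = b"...raw audiovisual bytes..."
sig = sign_capture(video)
print(verify_capture(video, sig, camera_key.public_key()))             # True
print(verify_capture(video + b"edit", sig, camera_key.public_key()))   # False
```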

Although not applicable to text, audiovisual content can then be verified as human-generated. The Coalition for Content Provenance and Authenticity (C2PA), a collaborative effort to create a standard for authenticating media, recently released an open specification to support this approach. With major institutions including Adobe, Microsoft, Intel, BBC and many others joining this effort, the C2PA is well positioned to produce effective and widely deployed authentication technology.

The combined signing and watermarking of human-generated and AI-generated content will not prevent all forms of abuse, but it will provide some measure of protection. Any safeguards will have to be continually adapted and refined as adversaries find novel ways to weaponize the latest technologies.

In the same way that society has been fighting a decadeslong battle against other cyber threats like spam, malware and phishing, we should prepare ourselves for an equally protracted battle to defend against various forms of abuse perpetrated using generative AI.

About the Author:

Hany Farid, Professor of Computer Science, University of California, Berkeley

This article is republished from The Conversation under a Creative Commons license. Read the original article.

How to use free satellite data to monitor natural disasters and environmental changes

By Qiusheng Wu, University of Tennessee 

If you want to track changes in the Amazon rainforest, see the full expanse of a hurricane or figure out where people need help after a disaster, it’s much easier to do with the view from a satellite orbiting a few hundred miles above Earth.

Over 8,000 satellites are orbiting Earth today, capturing images like this, of the Louisiana coast.
NASA Earth Observatory

Traditionally, access to satellite data has been limited to researchers and professionals with expertise in remote sensing and image processing. However, the increasing availability of open-access data from government satellites such as Landsat and Sentinel, and of free cloud-computing resources such as Amazon Web Services, Google Earth Engine and Microsoft Planetary Computer, has made it possible for just about anyone to gain insight into environmental changes underway.

I work with geospatial big data as a professor. Here’s a quick tour of where you can find satellite images, plus some free, fairly simple tools that anyone can use to create time-lapse animations from satellite images.

For example, state and urban planners – or people considering a new home – can watch over time how rivers have moved, construction crept into wildland areas or a coastline eroded.

Landsat time-lapse animations show the river dynamics in Pucallpa, Peru.
Qiusheng Wu, NASA Landsat
A Landsat time-lapse shows the shoreline retreat in the Parc Natural del Delta, Spain.
Qiusheng Wu, NASA Landsat

Environmental groups can monitor deforestation, the effects of climate change on ecosystems, and how other human activities like irrigation are shrinking bodies of water like Central Asia’s Aral Sea. And disaster managers, aid groups, scientists and anyone interested can monitor natural disasters such as volcanic eruptions and wildfires.

GOES images show the decline of the crucial Colorado River reservoir Lake Mead since the 1980s and the growth of neighboring Las Vegas.
Qiusheng Wu, NOAA GOES
A GOES satellite time-lapse shows the Hunga Tonga volcanic eruption on Jan. 15, 2022.
Qiusheng Wu, NOAA GOES

Putting Landsat and Sentinel to work

There are over 8,000 satellites orbiting the Earth today. You can see a live map of them at keeptrack.space.

Some transmit and receive radio signals for communications. Others provide global positioning system (GPS) services for navigation. The ones we’re interested in are Earth observation satellites, which collect images of the Earth, day and night.

Landsat: The longest-running Earth satellite mission, Landsat, has been collecting imagery of the Earth since 1972. The latest satellite in the series, Landsat 9, was launched by NASA in September 2021.

In general, Landsat satellite data has a spatial resolution of about 100 feet (about 30 meters). If you think of pixels on a zoomed-in photo, each pixel would be 100 feet by 100 feet. Landsat has a temporal resolution of 16 days, meaning the same location on Earth is imaged approximately once every 16 days. With both Landsat 8 and 9 in orbit, we can get global coverage of the Earth once every eight days. That makes comparisons easier.

Landsat data has been freely available to the public since 2008. During the Pakistan flood of 2022, scientists used Landsat data and free cloud-computing resources to determine the flood extent and estimate the total flooded area.

Landsat satellite images show a side-by-side comparison of southern Pakistan in August 2021, one year before the floods (left), and August 2022 (right); the floodwaters covered about a third of the country.
Qiusheng Wu, NASA Landsat

Sentinel: Sentinel Earth observation satellites were launched by the European Space Agency (ESA) as part of the Copernicus program. Sentinel-2 satellites have been collecting optical imagery of the Earth since 2015 at a spatial resolution of 10 meters (33 feet) and a temporal resolution of 10 days.

GOES: The images you’ll see most often in U.S. weather forecasting come from NOAA’s Geostationary Operational Environmental Satellites, or GOES. They orbit above the equator at the same speed Earth rotates, so they can provide continuous monitoring of Earth’s atmosphere and surface, giving detailed information on weather, climate, and other environmental conditions. GOES-16 and GOES-17 can image the Earth at a spatial resolution of about 1.2 miles (2 kilometers) and a temporal resolution of five to 10 minutes.

A GOES satellite shows an atmospheric river arriving on the West Coast in 2021.
Qiusheng Wu, GOES

How to create your own visualizations

In the past, creating a Landsat time-lapse animation of a specific area required extensive data processing skills and several hours or even days of work. However, nowadays, free and user-friendly programs are available to enable anyone to create animations with just a few clicks in an internet browser.

For instance, I created an interactive web app for my students that anyone can use to generate time-lapse animations quickly. The user zooms in on the map to find an area of interest, then draws a rectangle around the area to save it as a GeoJSON file – a file that contains the geographic coordinates of the chosen region. Then the user uploads the GeoJSON file to the web app, chooses the satellite to view from and the dates and submits it. It takes the app about 60 seconds to then produce a time-lapse animation.
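
For readers who haven’t seen one, here is a minimal sketch of the kind of GeoJSON file described above: a rectangle around an area of interest, written as longitude/latitude corners. The coordinates, roughly around Lake Mead, are only an example.

```python
# Build a rectangular area of interest and save it as a GeoJSON file.
import json

west, south, east, north = -114.9, 36.0, -114.2, 36.6   # bounding box in degrees (example values)

region = {
    "type": "Feature",
    "properties": {},
    "geometry": {
        "type": "Polygon",
        # GeoJSON polygons list [longitude, latitude] pairs and close the ring
        # by repeating the first corner.
        "coordinates": [[
            [west, south], [east, south], [east, north], [west, north], [west, south],
        ]],
    },
}

with open("area_of_interest.geojson", "w") as f:
    json.dump(region, f, indent=2)
```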

How to create satellite time-lapse animations.

There are several other useful tools for easily creating satellite animations, including Snazzy-EE-TS-GIF, an Earth Engine app for creating Landsat animations, and Planetary Computer Explorer, for interactively searching and visualizing satellite imagery.

About the Author:

Qiusheng Wu, Assistant Professor of Geography and Sustainability, University of Tennessee

This article is republished from The Conversation under a Creative Commons license. Read the original article.

ChatGPT is great – you’re just using it wrong

By Jonathan May, University of Southern California 

It doesn’t take much to get ChatGPT to make a factual mistake. My son is doing a report on U.S. presidents, so I figured I’d help him out by looking up a few biographies. I tried asking for a list of books about Abraham Lincoln and it did a pretty good job:

A reasonable list of books about Lincoln.
Screen capture by Jonathan May., CC BY-ND

Number 4 isn’t right. Garry Wills famously wrote “Lincoln at Gettysburg,” and Lincoln himself wrote the Emancipation Proclamation, of course, but it’s not a bad start. Then I tried something harder, asking instead about the much more obscure William Henry Harrison, and it gamely provided a list, nearly all of which was wrong.

Books about Harrison, fewer than half of which are correct.
Screen capture by Jonathan May., CC BY-ND

Numbers 4 and 5 are correct; the rest don’t exist or are not authored by those people. I repeated the exact same exercise and got slightly different results:

More books about Harrison, still mostly nonexistent.
Screen capture by Jonathan May., CC BY-ND

This time numbers 2 and 3 are correct and the other three are not actual books or not written by those authors. Number 4, “William Henry Harrison: His Life and Times” is a real book, but it’s by James A. Green, not by Robert Remini, a well-known historian of the Jacksonian age.

I called out the error and ChatGPT eagerly corrected itself and then confidently told me the book was in fact written by Gail Collins (who wrote a different Harrison biography), and then went on to say more about the book and about her. I finally revealed the truth and the machine was happy to run with my correction. Then I lied absurdly, saying during their first hundred days presidents have to write a biography of some former president, and ChatGPT called me out on it. I then lied subtly, incorrectly attributing authorship of the Harrison biography to historian and writer Paul C. Nagel, and it bought my lie.

When I asked ChatGPT if it was sure I was not lying, it claimed that it’s just an “AI language model” and doesn’t have the ability to verify accuracy. However, it modified that claim by saying “I can only provide information based on the training data I have been provided, and it appears that the book ‘William Henry Harrison: His Life and Times’ was written by Paul C. Nagel and published in 1977.”

This is not true.

Words, not facts

It may seem from this interaction that ChatGPT was given a library of facts, including incorrect claims about authors and books. After all, ChatGPT’s maker, OpenAI, claims it trained the chatbot on “vast amounts of data from the internet written by humans.”

However, it was almost certainly not given the names of a bunch of made-up books about one of the most mediocre presidents. In a way, though, this false information is indeed based on its training data.

As a computer scientist, I often field complaints that reveal a common misconception about large language models like ChatGPT and its older brethren GPT3 and GPT2: that they are some kind of “super Googles,” or digital versions of a reference librarian, looking up answers to questions from some infinitely large library of facts, or smooshing together pastiches of stories and characters. They don’t do any of that – at least, they were not explicitly designed to.

Sounds good

A language model like ChatGPT, which is more formally known as a “generative pretrained transformer” (that’s what the G, P and T stand for), takes in the current conversation, forms a probability for all of the words in its vocabulary given that conversation, and then chooses one of them as the likely next word. Then it does that again, and again, and again, until it stops.

So it doesn’t have facts, per se. It just knows what word should come next. Put another way, ChatGPT doesn’t try to write sentences that are true. But it does try to write sentences that are plausible.
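
Here is a toy Python sketch of that loop. The stand-in probability function and tiny vocabulary are invented; a real model computes these probabilities with a large neural network over tens of thousands of tokens.

```python
# Toy autoregressive generation: get next-word probabilities, sample one word,
# append it, and repeat.
import random

def next_word_probabilities(context):
    """Stand-in for the transformer: returns {word: probability} for the given context."""
    if context.endswith("an AI system can"):
        return {"learn": 0.5, "predict": 0.3, "understand": 0.2}
    return {"and": 0.4, "generate": 0.3, "text": 0.2, "plausibly": 0.1}

def generate(prompt, max_words=10):
    text = prompt
    for _ in range(max_words):
        probs = next_word_probabilities(text)
        words, weights = zip(*probs.items())
        text += " " + random.choices(words, weights=weights, k=1)[0]
    return text

print(generate("an AI system can"))
```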

When I talk privately to colleagues about ChatGPT, they often point out how many factually untrue statements it produces and dismiss it. To me, the idea that ChatGPT is a flawed data retrieval system is beside the point. People have been using Google for the past two and a half decades, after all. There’s a pretty good fact-finding service out there already.

In fact, the only way I was able to verify whether all those presidential book titles were accurate was by Googling and then verifying the results. My life would not be that much better if I got those facts in conversation, instead of the way I have been getting them for almost half of my life, by retrieving documents and then doing a critical analysis to see if I can trust the contents.

Improv partner

On the other hand, if I can talk to a bot that will give me plausible responses to things I say, it would be useful in situations where factual accuracy isn’t all that important. A few years ago a student and I tried to create an “improv bot,” one that would respond to whatever you said with a “yes, and” to keep the conversation going. We showed, in a paper, that our bot was better at “yes, and-ing” than other bots at the time, but in AI, two years is ancient history.

I tried out a dialogue with ChatGPT – a science fiction space explorer scenario – that is not unlike what you’d find in a typical improv class. ChatGPT is way better at “yes, and-ing” than what we did, but it didn’t really heighten the drama at all. I felt as if I was doing all the heavy lifting.

After a few tweaks I got it to be a little more involved, and at the end of the day I felt that it was a pretty good exercise for someone who hasn’t done much improv since graduating from college over 20 years ago.

A space exploration improv scene the author generated with ChatGPT.
Screen capture by Jonathan May, CC BY-ND

Sure, I wouldn’t want ChatGPT to appear on “Whose Line Is It Anyway?” and this is not a great “Star Trek” plot (though it’s still less problematic than “Code of Honor”), but how many times have you sat down to write something from scratch and found yourself terrified by the empty page in front of you? Starting with a bad first draft can break through writer’s block and get the creative juices flowing, and ChatGPT and large language models like it seem like the right tools to aid in these exercises.

And for a machine that is designed to produce strings of words that sound as good as possible in response to the words you give it – and not to provide you with information – that seems like the right use for the tool.

About the Author:

Jonathan May, Research Associate Professor of Computer Science, University of Southern California

This article is republished from The Conversation under a Creative Commons license. Read the original article.

Limits to computing: A computer scientist explains why even in the age of AI, some problems are just too difficult

By Jie Wang, UMass Lowell 

Empowered by artificial intelligence technologies, computers today can engage in convincing conversations with people, compose songs, create paintings, play chess and Go, and diagnose diseases, to name just a few examples of their technological prowess.

These successes could be taken to indicate that computation has no limits. To see if that’s the case, it’s important to understand what makes a computer powerful.

There are two aspects to a computer’s power: the number of operations its hardware can execute per second and the efficiency of the algorithms it runs. The hardware speed is limited by the laws of physics. Algorithms – basically sets of instructions – are written by humans and translated into a sequence of operations that computer hardware can execute. Even if a computer’s speed could reach the physical limit, computational hurdles remain due to the limits of algorithms.

These hurdles include problems that are impossible for computers to solve and problems that are theoretically solvable but in practice are beyond the capabilities of even the most powerful versions of today’s computers imaginable. Mathematicians and computer scientists attempt to determine whether a problem is solvable by trying it out on an imaginary machine.

An imaginary computing machine

The modern notion of an algorithm, known as a Turing machine, was formulated in 1936 by British mathematician Alan Turing. It’s an imaginary device that imitates how arithmetic calculations are carried out with a pencil on paper. The Turing machine is the template all computers today are based on.

To accommodate computations that would need more paper if done manually, the supply of imaginary paper in a Turing machine is assumed to be unlimited. This is equivalent to an imaginary limitless ribbon, or “tape,” of squares, each of which is either blank or contains one symbol.

The machine is controlled by a finite set of rules and starts on an initial sequence of symbols on the tape. The operations the machine can carry out are moving to a neighboring square, erasing a symbol and writing a symbol on a blank square. The machine computes by carrying out a sequence of these operations. When the machine finishes, or “halts,” the symbols remaining on the tape are the output or result.
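For readers who want to see those rules in action, here is a minimal Python sketch of the tape-and-rules mechanics just described. The rule table is an invented example, not anything from Turing’s paper: it flips 0s and 1s and then halts when it reaches a blank square. Each rule writes a symbol and then moves, a common compact way of expressing the same basic operations.

def run_turing_machine(tape, rules, state="start", blank=" ", max_steps=1000):
    # rules maps (state, symbol) -> (symbol to write, move of -1, 0 or +1, next state)
    cells = dict(enumerate(tape))  # sparse tape: position -> symbol
    head = 0
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = cells.get(head, blank)
        new_symbol, move, state = rules[(state, symbol)]
        cells[head] = new_symbol
        head += move
    # whatever remains on the tape is the output
    return "".join(cells[i] for i in sorted(cells)).strip()

# Rule table for a machine that inverts a binary string, then halts at the first blank.
invert = {
    ("start", "0"): ("1", +1, "start"),
    ("start", "1"): ("0", +1, "start"),
    ("start", " "): (" ", 0, "halt"),
}

print(run_turing_machine("010011", invert))  # prints 101100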

What is a Turing machine?

Computing is often about decisions with yes or no answers. By analogy, a medical test (type of problem) checks if a patient’s specimen (an instance of the problem) has a certain disease indicator (yes or no answer). The instance, represented in a Turing machine in digital form, is the initial sequence of symbols.

A problem is considered “solvable” if a Turing machine can be designed that halts for every instance, whether positive or negative, and correctly determines which answer the instance yields.

Not every problem can be solved

Many problems are solvable using a Turing machine and therefore can be solved on a computer, while many others are not. For example, the domino problem, a variation of the tiling problem formulated by Chinese American mathematician Hao Wang in 1961, is not solvable.

The task is to use a set of dominoes to cover an entire grid while, following the rules of most dominoes games, matching the number of pips on the ends of abutting dominoes. It turns out that there is no algorithm that can start with a set of dominoes and determine whether or not the set will completely cover the grid.

Keeping it reasonable

A number of solvable problems can be solved by algorithms that halt in a reasonable amount of time, meaning the number of steps they take grows no faster than some polynomial in the size of the input. These “polynomial-time algorithms” are efficient algorithms, meaning it’s practical to use computers to solve instances of these problems.
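For a rough sense of what counts as reasonable, the short Python snippet below compares a polynomial number of steps (n cubed) with an exponential number (2 to the power n), assuming a hypothetical machine that performs a billion steps per second. The figures are back-of-the-envelope, not tied to any particular algorithm.

# Compare polynomial growth (n**3) with exponential growth (2**n),
# assuming a machine that performs a billion steps per second.
for n in (10, 30, 60):
    poly, expo = n ** 3, 2 ** n
    print(f"n={n}: n^3 = {poly:,} steps, 2^n = {expo:,} steps "
          f"(about {expo / 1e9:.2e} seconds for the exponential case)")

At n = 60, the polynomial count is still a fraction of a second of work, while the exponential count runs to decades.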

Thousands of other solvable problems are not known to have polynomial-time algorithms, despite ongoing intensive efforts to find such algorithms. These include the Traveling Salesman Problem.

The Traveling Salesman Problem asks whether a set of points with some points directly connected, called a graph, has a path that starts from any point, goes through every other point exactly once and comes back to the original point. Imagine that a salesman wants to find a route that passes all households in a neighborhood exactly once and returns to the starting point.

The Traveling Salesman Problem quickly gets out of hand when you get beyond a few destinations.
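That blowup is easy to see in code. The sketch below, in Python, does the exhaustive search the article describes: it tries every ordering of the points and checks whether consecutive points, including the return leg, are directly connected. The four-point graph is invented for illustration; the trouble is that the number of orderings grows as (n - 1) factorial, so with just 20 points there are already about 1.2 x 10^17 candidate tours.

from itertools import permutations

# Brute-force search over every possible tour of the graph.
# The tiny example graph is invented for illustration.
def has_round_trip(points, edges):
    connected = {frozenset(edge) for edge in edges}
    start = points[0]
    for order in permutations(points[1:]):  # (n - 1)! candidate tours
        tour = (start,) + order + (start,)
        if all(frozenset(pair) in connected for pair in zip(tour, tour[1:])):
            return True
    return False

points = ["A", "B", "C", "D"]
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
print(has_round_trip(points, edges))  # True: A-B-C-D-A visits every point once and returns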

These problems, called NP-complete, were independently formulated and shown to exist in the early 1970s by two computer scientists, American Canadian Stephen Cook and Ukrainian American Leonid Levin. Cook, whose work came first, was awarded the 1982 Turing Award, the highest honor in computer science, for this work.

The cost of knowing exactly

The best-known algorithms for NP-complete problems essentially search for a solution among all possible answers. The Traveling Salesman Problem on a graph of a few hundred points would take years to run on a supercomputer. Such algorithms are inefficient, meaning there are no known mathematical shortcuts.

Practical algorithms that address these problems in the real world can only offer approximations, though the approximations are improving. Whether there are efficient polynomial-time algorithms that can solve NP-complete problems is among the seven Millennium Prize Problems posted by the Clay Mathematics Institute at the turn of the 21st century, each carrying a prize of US$1 million.

Beyond Turing

Could there be a new form of computation beyond Turing’s framework? In 1982, American physicist Richard Feynman, a Nobel laureate, put forward the idea of computation based on quantum mechanics.

What is a quantum computer?

In 1995, Peter Shor, an American applied mathematician, presented a quantum algorithm to factor integers in polynomial time. Mathematicians believe that, in Turing’s framework, no polynomial-time algorithm can solve this problem. Factoring an integer means finding a smaller integer greater than 1 that can divide the integer. For example, the integer 688,826,081 is divisible by a smaller integer 25,253, because 688,826,081 = 25,253 x 27,277.
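The article’s example can be checked with a classical trial-division sketch in Python. It finds the factors instantly here because the number is small; the same brute-force approach becomes hopeless for integers hundreds of digits long, which is the gap Shor’s quantum algorithm, a very different and far more involved procedure, is meant to close.

# Classical trial division: test divisors up to the square root of n.
# Instant for a 9-digit number, hopeless for the huge integers used in cryptography.
def smallest_factor(n):
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # no divisor found, so n is prime

n = 688_826_081
p = smallest_factor(n)
print(p, n // p, p * (n // p) == n)  # 25253 27277 True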

A major algorithm called the RSA algorithm, widely used in securing network communications, is based on the computational difficulty of factoring large integers. Shor’s result suggests that quantum computing, should it become a reality, will change the landscape of cybersecurity.

Can a full-fledged quantum computer be built to factor integers and solve other problems? Some scientists believe it can be. Several groups of scientists around the world are working to build one, and some have already built small-scale quantum computers.

Nevertheless, as with every novel technology invented before it, quantum computation is almost certain to run into issues that would impose new limits.

About the Author:

Jie Wang, Professor of Computer Science, UMass Lowell

This article is republished from The Conversation under a Creative Commons license. Read the original article.

Ada Lovelace’s skills with language, music and needlepoint contributed to her pioneering work in computing

By Corinna Schlombs, Rochester Institute of Technology 

Ada Lovelace, known as the first computer programmer, was born on Dec. 10, 1815, more than a century before digital electronic computers were developed.

Lovelace has been hailed as a model for girls in science, technology, engineering and math (STEM). A dozen biographies for young audiences were published for the 200th anniversary of her birth in 2015. And in 2018, The New York Times added her obituary as one of the first in its series of “missing obituaries” of women, published at the rise of the #MeToo movement.

Ada King, Countess of Lovelace, was more than just another mathematician.
Watercolor portrait of Ada King, Countess of Lovelace by Alfred Edward Chalon via Wikimedia

But Lovelace – properly Ada King, Countess of Lovelace after her marriage – drew on many different fields for her innovative work, including languages, music and needlecraft, in addition to mathematical logic. Recognizing that her well-rounded education enabled her to accomplish work that was well ahead of her time, she can be a model for all students, not just girls.

Lovelace was the daughter of the scandal-ridden romantic poet George Gordon Byron, aka Lord Byron, and his highly educated and strictly religious wife Anne Isabella Noel Byron, known as Lady Byron. Lovelace’s parents separated shortly after her birth. At a time when women were not allowed to own property and had few legal rights, her mother managed to secure custody of her daughter.

Growing up in a privileged aristocratic family, Lovelace was educated by home tutors, as was common for girls like her. She received lessons in French and Italian, music, and suitable handicrafts such as embroidery. Less common for a girl in her time, she also studied math. Lovelace continued to work with math tutors into her adult life, and she eventually corresponded with mathematician and logician Augustus De Morgan at London University about symbolic logic.

A rare photograph of Ada Lovelace.
Daguerreotype by Antoine Claudet via Wikimedia

Lovelace’s algorithm

Lovelace drew on all of these lessons when she wrote her computer program – in reality, it was a set of instructions for a mechanical calculator that had been built only in parts.

The computer in question was the Analytical Engine designed by mathematician, philosopher and inventor Charles Babbage. Lovelace had met Babbage when she was introduced to London society. The two bonded over their shared love of mathematics and fascination with mechanical calculation. By the early 1840s, Babbage had won and lost government funding for a mathematical calculator, had fallen out with the skilled craftsman building the precision parts for his machine and was close to giving up on his project. At this point, Lovelace stepped in as an advocate.

To make Babbage’s calculator known to a British audience, Lovelace proposed to translate into English an article that described the Analytical Engine. The article was written in French by the Italian mathematician Luigi Menabrea and published in a Swiss journal. Scholars believe that Babbage encouraged her to add notes of her own.

In the early 19th century, Ada Lovelace envisioned the possibilities of computing.

In her notes, which ended up twice as long as the original article, Lovelace drew on different areas of her education. Lovelace began by describing how to code instructions onto cards with punched holes, like those used for the Jacquard weaving loom, a device patented in 1804 that used punch cards to automate weaving patterns in fabric.

Having learned embroidery herself, Lovelace was familiar with the repetitive patterns used for handicrafts. Similarly repetitive steps were needed for mathematical calculations. To avoid duplicating cards for repetitive steps, Lovelace used loops, nested loops and conditional testing in her program instructions.

The notes included instructions on how to calculate Bernoulli numbers, which Lovelace knew from her training to be important in the study of mathematics. Her program showed that the Analytical Engine was capable of performing original calculations that had not yet been performed manually. At the same time, Lovelace noted that the machine could only follow instructions and not “originate anything.”
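What follows is not Lovelace’s actual table of operations for the Analytical Engine, just a short modern Python sketch of the same kind of looping calculation, using a standard recurrence for the Bernoulli numbers (in the modern sign convention, which may differ from the numbering Lovelace used in her notes). The nested loops echo the repetitive structure her program had to encode on punched cards.

from fractions import Fraction
from math import comb

# Standard recurrence: B_0 = 1 and, for m >= 1,
# B_m = -1/(m + 1) * sum over j < m of C(m + 1, j) * B_j.
def bernoulli_numbers(count):
    numbers = [Fraction(1)]  # B_0 = 1
    for m in range(1, count):  # outer loop: one pass per new number
        total = Fraction(0)
        for j in range(m):  # nested loop over the numbers already computed
            total += comb(m + 1, j) * numbers[j]
        numbers.append(-total / (m + 1))
    return numbers

for i, b in enumerate(bernoulli_numbers(9)):
    print(f"B_{i} = {b}")  # B_2 = 1/6, B_4 = -1/30, B_6 = 1/42, ...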

Ada Lovelace created this chart for the individual program steps to calculate Bernoulli numbers.
Courtesy of Linda Hall Library of Science, Engineering & Technology, CC BY-ND

Finally, Lovelace recognized that the numbers manipulated by the Analytical Engine could be seen as other types of symbols, such as musical notes. An accomplished singer and pianist, Lovelace was familiar with musical notation symbols representing aspects of musical performance such as pitch and duration, and she had manipulated logical symbols in her correspondence with De Morgan. It was not a large step for her to realize that the Analytical Engine could process symbols — not just crunch numbers — and even compose music.

A well-rounded thinker

Inventing computer programming was not the first time Lovelace brought her knowledge from different areas to bear on a new subject. For example, as a young girl, she was fascinated with flying machines. Bringing together biology, mechanics and poetry, she asked her mother for anatomical books to study the function of bird wings. She built and experimented with wings, and in her letters, she metaphorically expressed her longing for her mother in the language of flying.

Despite her talents in logic and math, Lovelace didn’t pursue a scientific career. She was independently wealthy and never earned money from her scientific pursuits. This was common, however, at a time when freedom – including financial independence – was equated with the capability to impartially conduct scientific experiments. In addition, Lovelace devoted just over a year to her only publication, the translation of and notes on Menabrea’s paper about the Analytical Engine. Otherwise, in her life cut short by cancer at age 36, she vacillated between math, music, her mother’s demands, care for her own three children, and eventually a passion for gambling. Lovelace thus may not be an obvious model as a female scientist for girls today.

However, I find Lovelace’s way of drawing on her well-rounded education to solve difficult problems inspirational. True, she lived in an age before scientific specialization. Even Babbage was a polymath who worked in mathematical calculation and mechanical innovation. He also published a treatise on industrial manufacturing and another on religious questions of creationism.

But Lovelace applied knowledge from what we today think of as disparate fields in the sciences, arts and the humanities. A well-rounded thinker, she created solutions that were well ahead of her time.

About the Author:

Corinna Schlombs, Associate Professor of History, Rochester Institute of Technology

This article is republished from The Conversation under a Creative Commons license. Read the original article.