Tuesday, October 29, 2024

Why the Hype Cycle Does Not Quite Work for Generative AI

Gartner Hype Cycle Graphic

(Not) The Same As It Ever Was

Many years (!) ago, I wrote here about why the virtual world Second Life had followed, fairly well, the Gartner Group's well-known "hype cycle" for new technology. Follow the link here to read about the history of the concept.

I'm reminded by Wagner James Au of New World Notes that other virtual worlds such as Fortnite and Minecraft have thrived, and indeed have seen some good use cases in education.

At the same time, I don't hear of anyone at Capital One's hive using virtual worlds. We don't see links from Office 365 or Google Workspace to a virtual world. I have a sense that AI is a different animal for these worker bees and their daily applications, as well as for my own students.

I do know a freelance graphic designer who lost her primary client to AI work; the client now hands its graphics to AI tools run by semi-skilled staff in-house. My friend has moved on to ramp up her greeting-card business. That may color my skepticism that AI is a "fad," as Second Life clearly was.

SL and Gartner

I joined the virtual world in 2007, and yes, I was foolishly an enthusiast. Since I believe in being forthcoming about personal mistakes, I recall telling my students investigating this Strange Land that "you will all have avatars in a few years, for work!" I let slip the same to some colleagues, one of whom said, "you never met a technology you didn't love!"

Cringe. 

I'll own being horribly wrong. Prognostication is a dicey thing! A silly image I cobbled together years ago shows the evolution of Second Life avatars (namely, my own, Iggy). At the same time, my update to the old image points toward a new direction in desktop and mobile computing that we all felt, a decade ago, to be far over the horizon.

Various Second Life Versions of the Author plus an AI writer in the year 2024

For many reasons, including the difficulty of building simulations, a clunky interface, heavy system requirements, and even poor management by Linden Lab, Second Life began a descent from what Gartner calls a "Peak of Inflated Expectations" in 2007.

By 2012 Second Life was well into a "Trough of Disillusionment." As early as 2008 I sensed that the media party was over. A year later, I cited four factors working against academic use in a post here: lack of rewards and incentives for faculty; the time needed to master the interface and content; Linden Lab's vacillating leadership and policies; and the presence of what I and my students called "the creepy treehouse" of adult users doing sometimes naughty things.

In the for-profit sector, businesses that set up virtual storefronts in 2007 soon fled; a professor of Marketing I helped get started found SL fascinating, but after interviewing in-world merchants, she put it plainly: for brick-and-mortar firms, there's little return on investment in Second Life.

SL continues to be a large virtual world, with better graphics, promises of a new physics engine, and a dedicated base of users, but it did not become a daily experience on the job or at school. The Hype Cycle proved accurate. This has happened before: stand-alone creative hypertexts such as Michael Joyce's afternoon also map onto the cycle, though they never regained the popularity they enjoyed in the 1990s or even a steady base of users. Creative hypertexts appear to have faded utterly.

When a colleague in our campus digital pedagogy cohort claimed that generative AI would follow the same path traced by Gartner, I bristled because this time I feel that Gartner's model does not fit what I am seeing, daily, among students.

I could be wrong, should a series of issues afflict AI. So let's compare the Second Life experience to that of generative AI:

The Browser

SL was never browser-based, even as a plugin, unlike the easy-to-master Unity 3D interface; AIs are already built into tools we use daily or can be queried from any Web browser.

Client v. Server Side

SL has high client-side requirements, whereas AI puts the burden on server farms; these use enormous amounts of energy that my students and colleagues have only recently started to notice, with all the attendant dangers for sustainability.

The Interface 

SL has a difficult interface, employing menus worthy of an entire operating system and arcane coding options, whereas Gen AI's query features resemble (for now) those of a familiar search engine but with natural-language input. 

Platform Independence

SL and AI both work on multiple operating systems, but SL has never worked well with mobile devices; AI is baked into many apps now, with some Google searches automatically invoking Gemini results and Apple promising Apple Intelligence in its next phones.

Creepy Treehouse

SL has adult content that troubled my students; the guardrails we increasingly see with AI tend to keep users (sometimes clumsily or incompletely) from making napalm or pornography.

Investment Sources

Investment in SL comes from individual users at this point; AI attracts venture capital, user subscriptions, and big-tech investment (Apple, Microsoft, Meta, Google).

Source of Growth in Academia

Faculty and campus-technologist curiosity spurred interest in SL; student use of AI erupted spontaneously in the fall of 2022 and remains very strong.

A Shaky Prognosis

To be fair to Second Life, its system requirements have remained largely steady over the last decade. In my recent experience, I can run SL and a third-party client called Firestorm on my MacBook using no more than my phone's hotspot to connect to the world. That's astounding progress.

Save energy use, I don't foresee any of these differences posing a stumbling block for AI or leading to a Trough of Disillusionment. We have a climate crisis; the last thing we can afford as a civilization is to ramp up emissions of carbon dioxide.

Yet if Moore's Law continues apace, we'll see AI reap the rewards of lower power use as the processors in the servers that run it grow more powerful and more efficient. Any disillusionment is likely to be temporary, and its "trough" quite shallow.

For these reasons, I don't see the Hype Cycle applying to Generative AI in the same way it has to several earlier technologies. I may be wrong, but for now Gartner's model does not fit it well.

Image Sources: Wikimedia Commons for the Hype Cycle.

Friday, October 25, 2024

Wendell Berry's "Tiny No" to Computers (and Artificial Intelligence)

Wendell Berry with solar panels

I've had a very productive conversation with my colleague Thomas, who is a "hard no" person when it comes to AI in classes. We agree that reasoned Humanist dissent should be seated at the table, even as Humanists such as I invite AI to take a seat.

Here I employ a metaphor used in Ethan Mollick's book Co-Intelligence: Living and Working With AI. I fear that Mollick may miss some valid reasons, from a Humanist perspective, for being very wary of what AI may do to our minds and our reading and writing habits. In a year with a candidate who walks and talks like an authoritarian, perhaps of a fascist inclination, much rides on how new technologies will influence civic discourse in the coming years.

Thomas holds ideas similar to those espoused in the article "ChatGPT Does Not Have to Ruin College," online at The Atlantic. I like many of those reasons for resisting the hype cycle around AI, but I also turn to a much older set of caveats, voiced back in 1988 at the dawn of the personal-computing age.

Wendell Berry's "Why I Am Not Going to Buy a Computer" caused a stir when it ran in Harper's, and it still rankles some of us who have found, say, blogging really good for one's writing muscles. Berry still holds that his "tiny no" was the right answer to make.

While I disagree broadly with Berry on computing, I do find one aspect of his refusal very compelling. In his essay he lists nine criteria for adopting a new tool:

  1. The new tool should be cheaper than the one it replaces. 
  2. It should be at least as small in scale as the one it replaces. 
  3. It should do work that is clearly and demonstrably better than the one it replaces.
  4. It should use less energy than the one it replaces. 
  5. If possible, it should use some form of solar energy, such as that of the body. 
  6. It should be repairable by a person of ordinary intelligence, provided that he or she has the necessary tools. 
  7. It should be purchasable and repairable as near to home as possible. 
  8. It should come from a small, privately-owned shop or store that will take it back for maintenance and repair. 
  9. It should not replace or disrupt anything good that already exists, and this includes family and community relationships. 

Generative AI fails, by my reckoning, most of these tests. It does, arguably, do better work than the traditional search engine (test 3). Otherwise, it fails tests 4 and 9 badly. I suppose in time AI server farms might be covered with solar panels (test 5), but Japan's decision to restart its nuclear power plants to power AI, as well as Microsoft's recommissioning of Three Mile Island for AI power, argues otherwise.

Berry is no Luddite. Note the solar panels in the Wikipedia image I chose for this post. I'm reminded of how Howard Rheingold called the Amish "adaptive techno-selectives" in his insightful 1999 feature piece about mobile phones, "Look Who's Talking."

We will know, in time, if Berry proves correct, as he says in a recent interview, that "you could just ask your computer and it'll tell you. But this doesn't contribute to the formation of a mind." What is learning, after all? I've long distinguished information from knowledge. Having more information, under the scrutiny of my admittedly imperfect powers of reasoning and critical thinking, builds a store of useful knowledge. As a farmer as well as an academic, I know things Berry does, too. I can judge when a field is ready for a cover crop by hard-earned experience, but I also go online for weather forecasts, advice about soil conditions in Central Virginia, and organic methods for controlling pests.

My concern, however, is that we may offload reasoning to large language models, whose propensity to hallucinate without really good prompt-engineering has been well documented in journalistic and scholarly work. A feedback loop results: can we detect these errors if doing so requires the very reasoning powers we are using less frequently?

I don't know, but I do know that naysayers such as my colleague, Wendell Berry, and others who thoughtfully resist marketing hyperbole need a seat at the table.

Thursday, October 17, 2024

No More Dangerous Visions: AI Guardrails and Humanities Work

 

Female undergraduate working in library with AI Copilot

After eight years, this blog creaks, groans, stretches, and returns to life, not about virtual worlds but about the newest technology on the block: Generative AI.

Today my students put essays to two different Large Language Models (LLMs) hosted on our campus. All of them first used Claude 3.5 Sonnet, and then the groups divided, with half putting the essays to ChatGPT 4 and the other half to Google Gemini Pro. By the time you read this, the versions and even the names may have changed.

In any case, I asked the students to provide the assignment prompt for a partner's paper, as well as the text of the paper itself. Then the students were to compare the LLMs' advice to that of a human partner as well as to Grammarly's.
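For readers who would rather script this side-by-side exercise than paste text into two chat windows, here is a minimal sketch, assuming the official Anthropic and OpenAI Python SDKs, API keys set as environment variables, a hypothetical local file holding the partner's draft, and placeholder model names that will likely have changed by the time you read this:

```python
# Minimal sketch: send the same assignment prompt and draft essay to two LLMs
# and print their feedback side by side. Assumes the `anthropic` and `openai`
# packages are installed and that ANTHROPIC_API_KEY and OPENAI_API_KEY are set
# in the environment. Model names are examples and may be out of date.
import anthropic
from openai import OpenAI

assignment_prompt = "Analyze the role of memory in Toni Morrison's Beloved."
draft_text = open("partner_draft.txt").read()  # hypothetical file with the partner's essay

request = (
    "Here is the assignment prompt:\n" + assignment_prompt +
    "\n\nHere is my draft:\n" + draft_text +
    "\n\nPlease give feedback on argument, organization, and style."
)

# Claude's feedback
claude = anthropic.Anthropic()
claude_reply = claude.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=1024,
    messages=[{"role": "user", "content": request}],
)
print("--- Claude ---")
print(claude_reply.content[0].text)

# ChatGPT's feedback
openai_client = OpenAI()
gpt_reply = openai_client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": request}],
)
print("--- ChatGPT ---")
print(gpt_reply.choices[0].message.content)
```

In class we simply used the hosted chat interfaces, but a script like this keeps the prompt identical across models, which makes the comparison of their advice a bit fairer.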

For three students with Humanities topics, Anthropic's Claude refused, like Bartleby the Scrivener in Melville's story, to do any work. It locked the students out of the software and logged them off.

Why? Subject matter and explicit language that are quite appropriate when quoting from some literary works or discussing topics such as human trafficking or sexual violence. So we quizzed Claude. I asked it a general question about which words or phrases it flags as "unsafe" and got this reply:

As an AI language model, I don't actually have a system for identifying "unsafe content" or refusing to answer based on specific words or phrases. I'm designed to engage in a wide range of conversations while maintaining ethical boundaries. However, I do have guidelines that help me respond appropriately to different types of queries: 

  1. I don't engage in or promote illegal activities. 
  2. I avoid producing explicit sexual content or pornography. 
  3. I don't assist in creating harmful content like hate speech or extreme political propaganda. 
  4. I don't share personal information about real individuals. 
  5. I avoid giving medical, legal, or financial advice that should come from professionals. 

To avoid any issues, it's generally best to: 

  1. Keep queries legal and ethical. 
  2. Avoid requests for explicit sexual content. 
  3. Don't ask for help with harmful or dangerous activities.
  4. Stick to general knowledge rather than specific personal information. 
  5. For professional advice, consult experts in the relevant field. 

Remember, I'm here to help with a wide range of topics, and I'll do my best to provide useful information while staying within appropriate boundaries. If a query is problematic, I'll typically explain why I can't assist and suggest alternative approaches if possible.

One of my students, whose partner's project had been flagged as "unsafe," got the AI to provide further details for that one case:

I understand your frustration with the content being flagged. While I don't have access to the specific algorithms used to flag content, there are certain words and phrases commonly associated with explicit sexual content that might trigger such flags. These often include: 

  • Anatomical terms for genitalia 
  • Explicit verbs describing sexual acts 
  • Slang terms for sexual activities 
  • Graphic descriptions of bodily fluids or functions 
  • Certain words associated with pornography or sex work 

To work around this, you could try: 

  • Using more clinical or academic terminology 
  • Focusing on the themes and concepts rather than specific acts 
  • Summarizing the content instead of quoting directly 
  • Using euphemisms or less explicit language 
If you'd like, you can try describing the general topic and main points of the paragraph without using potentially flagged words. I can then suggest how to phrase it in a way that's likely to pass content filters while still conveying the essential information.

Thus an AI would not be able to work with some of Allen Ginsberg's or Walt Whitman's poetry, Toni Morrison's Beloved, and many other literary works, let alone news accounts of many topics that contain graphic language or imagery. Such prudence appears well intentioned but proves bothersome to this writer, since our purpose was not to titillate a reader but to inform and educate with the help of the cybernetic co-pilot.

We very much need Humanists, not just Business and STEM folks, using AI in canny ways. Some of our work will involve dangerous topics. We Humanists ask some very hard questions not germane to other fields of study.

Good. AI and its training data need cross-examination. I'm heartened that issues of sustainability and social justice have begun to be considered. All technologies carry unforeseen social and economic consequences.

Before we part, back to why I revived "In a Strange Land." I feel the same Uncanny Valley moment I did in 2007, when I first logged into a virtual world. As virtual worlds did then, AI reveals new and sometimes unsettling terrain.

Incidentally, I won't rule out future posts about virtual worlds, though Iggy, my Second Life avatar, gave up his virtual land and wanders penniless around the once-hyped simulation. I also dip occasionally into a roleplaying game built inside SL, playing a virtual pirate who conducts naval battles with others interested in the Age of Sail.

Dread Pirate Rumbottom O'Malley and Gray Nick the Parrot


Avast! The Royal Navy will never get me!

Images: prompt to DALL-E, "Female undergraduate working in library with AI Copilot," and, from Second Life, the Dread Pirate Rumbottom O'Malley and Gray Nick the parrot.