Is the future of research voice-controlled? It might be, because when I originally had the idea for this post my first instinct was to grab my phone and dictate my half-formed ideas into a note, rather than typing them out. Writing things down often makes them seem wrong and not at all what we are trying to say in our heads. (Maybe it’s not so new, since as you may remember Socrates had a similar instinct.) The idea came out of a few different talks at the national Code4Lib conference held in Los Angeles in March of 2017 and a talk given by Chris Bourg. Across these presentations the themes of machine learning, artificial intelligence, natural language processing, voice search, and virtual assistants intersect to give us a vision of what is coming. The future might look like a system that can parse imprecise human language and turn it into an appropriately structured search query in a database or variety of databases, bearing in mind other variables, and return the correct results. Pieces of this exist already, of course, but I suspect over the next few years we will be building or adapting tools to perform these functions. As we do this, we should think about how we can incorporate our values and skills as librarians into these tools along the way.
Natural Language Processing
I will not attempt to summarize natural language processing (NLP) here, except to say that speaking to a computer requires that the computer be able to understand what we are saying. Human (or natural) language is messy, full of nuance and context that takes people years to master, and even then it often leads to misunderstandings that can range from funny to deadly. Using a machine to understand and parse natural language requires complex techniques, but luckily there are a lot of tools that can make the job easier. For more details, you should review the NLP talks by Corey Harper and Nathan Lomeli at Code4Lib. Both of these talks showed that there is a great deal of complexity involved in NLP, and that its usefulness is still relatively confined. Nathan Lomeli puts it like this: NLP can “cut strings, count beans, classify things, and correlate everything”. Given a corpus, you can use NLP tools to figure out what certain words might be, how many of those words there are, and how they might connect to each other.
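To make the “cut strings, count beans, correlate everything” idea concrete, here is a minimal sketch in Python using only the standard library. The three-sentence corpus and the function names are invented for illustration, and this is not the approach from either talk. It leaves out classification and skips steps a real project would need, such as removing common words like “the”.

```python
import re
from collections import Counter
from itertools import combinations

# Toy corpus invented for illustration; a real project would load documents from files.
corpus = [
    "Libraries protect patron privacy.",
    "Voice search raises privacy questions for libraries.",
    "Patron privacy matters in voice search.",
]

def tokenize(text):
    """Cut strings: lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def count_words(docs):
    """Count beans: tally how often each token appears across all documents."""
    counts = Counter()
    for doc in docs:
        counts.update(tokenize(doc))
    return counts

def cooccurrences(docs):
    """Correlate: count unordered pairs of distinct words that share a document."""
    pairs = Counter()
    for doc in docs:
        words = sorted(set(tokenize(doc)))
        pairs.update(combinations(words, 2))
    return pairs

word_counts = count_words(corpus)
pair_counts = cooccurrences(corpus)

# Show the most frequent words and the word pairs that most often appear together.
print(word_counts.most_common(3))
print(pair_counts.most_common(3))
```

Even on a corpus this small, the pair counts start to hint at which concepts travel together, which is the kind of signal a term-corpus project relies on before human review.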
Processing language to understand a textual corpus has a long history but is now relatively easy for anyone to do with the tools out there. The easiest is Voyant Tools, a project by Stéfan Sinclair and Geoffrey Rockwell. It is a portal to a variety of tools for NLP. You can feed it a corpus and get back all kinds of counts and correlations. For example, Franny Gaede and I used Voyant Tools to analyze social justice research websites to develop a social justice term corpus for a research project. While a certain level of human review is required for any such project, it’s possible to see that this technology can replace a lot of human-created language. This is already happening, in fact. A tool called Wordsmith can create convincing articles about finance, sports, and technology, or really any field with a standard set of inputs and outputs in writing. If computers are writing stories, they can also find stories.
Talking to the Voice in the Machine
Finding those stories, and in turn finding the data with which to tell more stories, is where machine learning and artificial intelligence enter. In libraries we have a lot of words, and while we have various projects that are parsing those words and doing things with them, we have only begun to see where this can go. There are two sides to this. Chris Bourg’s talk at Harvard Library Leadership in a Digital Age asks the question “What happens to libraries and librarians when machines can read all the books?” One suggestion she makes is that:
we would be wise to start thinking now about machines and algorithms as a new kind of patron – a patron that doesn’t replace human patrons, but has some different needs and might require a different set of skills and a different way of thinking about how our resources could be used.
One way in which we can start to address the needs of machines as patrons is by creating searches that work with them. For now this ultimately serves the needs of humans, but in the future it could serve the machines’ own artificial intelligence purposes. Most people are familiar with the virtual assistants that have popped up on all platforms over the past few years. As an iOS and a Windows user, I am now constantly invited to speak to Siri or Cortana to search for answers to my questions or fix something in my schedule. While I’m perfectly happy to ask Siri to remind me to bring my laptop to work at 7:45 AM or to wake me up in 20 minutes, I get mixed results when I try to ask a more complex question. Sometimes when I ask the temperature on the surface of Jupiter I get the answer; other times I get today’s weather in a town called Jupiter. This is not too surprising, as asking “What is the temperature of Jupiter?” could mean a number of things. It’s on the human to specify to the computer which domain of knowledge is meant, which requires knowing exactly how to ask the question. Computers cannot yet do a reference interview, since they cannot pick up on the subtle hidden meanings or help with the struggle for the right words as librarians do so well. But they can help with certain types of research tasks quite well, if you know how to ask the question. Eric Frierson (PPT) gave a demonstration of his project working on voice-powered search in EBSCO using Alexa. In the presentation he demonstrates the Alexa “skills” he set up for people to ask Alexa for help. They are “do you have”, “the book”, “information about”, “an overview of”, “what I should read after”, or “books like”. There is a demonstration of what this looks like on YouTube. The results are useful when you say the correct thing in the correct order, and for an active user it would be fairly quick to learn what to say, just as we learn how best to type in a search query in various services.
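To illustrate why saying “the correct thing in the correct order” matters, here is a rough sketch in Python of how spoken lead-in phrases might be turned into a structured search request. It is not Eric Frierson’s implementation and does not use the Alexa Skills Kit or any EBSCO API. The phrase list comes from the talk as described above, while the search type labels, the dictionary shape, and the `parse_request` function are assumptions made up for this example.

```python
import re

# Lead-in phrases listed in the talk, each paired with a hypothetical
# search type label (the labels are assumptions, not part of any real API).
PHRASES = [
    ("do you have", "availability"),
    ("the book", "title"),
    ("information about", "subject"),
    ("an overview of", "overview"),
    ("what i should read after", "read_next"),
    ("books like", "similar"),
]

def parse_request(utterance):
    """Turn a transcribed utterance into a structured request, or None."""
    # Normalize: lowercase, drop punctuation, and pad with spaces so that
    # phrases only match on whole-word boundaries.
    words = re.findall(r"[a-z0-9']+", utterance.lower())
    padded = " " + " ".join(words) + " "

    # Pick the lead-in phrase that appears earliest in the utterance.
    best = None
    for phrase, search_type in PHRASES:
        needle = " " + phrase + " "
        idx = padded.find(needle)
        if idx != -1 and (best is None or idx < best[0]):
            best = (idx, needle, search_type)
    if best is None:
        return None  # no recognized phrase: the assistant cannot help

    idx, needle, search_type = best
    term = padded[idx + len(needle):].strip()
    if not term:
        return None  # a lead-in phrase with nothing after it
    return {"type": search_type, "term": term}

print(parse_request("Do you have Dune?"))
print(parse_request("Alexa, ask the library for books like The Book Thief"))
print(parse_request("What is the temperature of Jupiter?"))
```

The last request returns nothing because it contains none of the expected phrases, which is the same wall a patron hits when they phrase a question in a way the system was not built to hear. A librarian in a reference interview would ask a follow-up question instead.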
Why ask a question of a computer rather than type in a question to a computer? For the reason I started this piece with, certainly: voice is there, and it’s often easier to say what you mean than to write it. This can be taken pragmatically as well. If you find typing difficult, being able to speak makes life easier. When I was home with a newborn baby I really appreciated being able to dictate and ask Siri about the weather forecast and what time the doctor’s appointment was. Herein lies one of the many potential pitfalls of voice: who is listening to what you are saying? One recent news story puts this in perspective, as Amazon agreed to turn over data from Alexa to police in a murder investigation after the suspect gave the OK. The company refused to do so at first, and the legal nature of a conversation with a virtual assistant remains an open question. Nor is it entirely clear, when you speak to a device, where the data is being processed. So before we all rush out and write voice search tools for all our systems, it is useful to think about where that data lives and what its purpose is.
If we would protect a user’s search query by ensuring that our catalogs are encrypted (and let’s be honest, we aren’t there yet), how do we do the same for virtual search assistants in the library catalog? For Alexa, that’s built into creating an Alexa skill, since a basic requirement for the web service used is that it meet Amazon’s security requirements. But if this data is subject to subpoena, we would have to think about it in the same way we would any other search data on a third-party system. And we also have to recognize that these tools are created by these companies for commercial purposes, and part of that is to gather data about people and sell things to them based on that data. Machine learning could eventually build on that to learn a lot more about people than they think, which the Amazon Echo Look recently brought up as a subject of debate. There are likely to be other services popping up in addition to those offered by Amazon, Google, Apple, and Microsoft. Before long, we might expect our vendors to be offering voice search in their interfaces, and we need to be aware of the transmission of that data and where it is being processed. A recently formed group called the Voice Privacy Alliance is developing some standards for this.
The invisibility of the result processing has another dark side. The biases inherent in the algorithms become even more hidden, as the first result becomes the “right” one. If Siri tells me the weather in Jupiter, that’s a minor inconvenience, but if Siri tells me that “Black girls” are something hypersexualized, as Safiya Noble has found that Google does, do I (or let’s say, a kid) necessarily know something has gone wrong? Without human intervention and understanding, machines can perpetuate the worst side of humanity.
This comes back to Chris Bourg’s question. What happens to librarians when machines can read all the books, and have a conversation with patrons about those books? Luckily for us, it is unlikely that artificial intelligence will ever be truly self-aware with desires, metacognition, love, and need for growth and adventure. Those qualities will continue to make librarians useful in creating vibrant and unique collections and communities. But we will need to fit that into a world where we are having conversations with our computers about those collections and communities.