Decoding AI and Libraries

AI

“Decoding AI and Libraries.” KB College: AI en de Bibliotheek – de computer leest alles. The Hague, Netherlands. (via video conference)

Speech Text: Read Speaker Script
Abstract: How can we think about AI and the role of libraries in AI development?

[This is the script I used for my talk. I’ve also taken the opportunity to add some footnotes and links.]

In the 90s I was working on a project called AskERIC – a service that would answer questions of educators and policy makers online. It was the early days of the web, well before Google, Facebook, or Amazon. Yet even then we would regularly get questions about artificial intelligence, i.e., “Can’t machines answer these questions?” My boss’ answer was great – “We’ll use natural intelligence until artificial intelligence catches up.”

A quarter century later, artificial intelligence has done some significant catching up. From search engines to conversational digital assistants to machine learning embedded in photo apps to identify faces and places, the progress of A.I. is breathtaking.

The last 10 years of progress is particularly impressive when you realize that A.I. has been a quest of computer scientists since before there was such a thing as computer science.

Today the larger conversations of A.I. tend to be either utopian

A.I. will improve medicine, reduce accidents, and decrease global energy use

or dystopian

It will destroy jobs, privacy, and freedom.

A.I. has also become a bit of a marketing term – soon, I fear we’ll be eating our cereals fortified with AI.

The hype and real progress have merged into a bit of a jumbled mess that overall can lead to a sort of awe and inaction.

Awe in that many of us in the library community, and particularly the public library community, may feel the details are over our heads. A.I. is a game for Google. Inaction because the topic seems too big – what role is there for a library when these tools are being created by trillion-dollar industries?

True story: the same day my dean asked me about the possibility of creating a degree in data science and A.I., MIT announced a billion-dollar plan to create an A.I. college[1]. I don’t think he appreciated me asking him if I would have a billion to work with as well.

I’ve found these reactions – awe and inaction – are often a result of muddled vocabulary. So, for my contribution to today’s agenda, I’d like to briefly break the conversation down into more precise and actionable concepts. My focus here will be on the contribution of public libraries, but I believe that the concepts are not only relevant to other public sector organizations but can only be truly implemented with partners of all types.

So rather than just think of A.I. as a big amorphous capability, I ask you to think about three interlocking layers: data, algorithms, and machine learning.

Ready access to masses of data has led to high-impact algorithms and increasingly to machine learning and black box “deep learning” systems. If we librarians do not seek to have positive impacts at each of these levels – well, to be blunt – I would argue we are not doing our job and are putting our communities in danger.

So, let’s begin with data.

The first thing that gets thrown into the A.I. bucket is the idea of data or big data. From data science to analytics there is a global uptick in generating and collecting data. With the advent of always connected digital network devices – read smart phones – in the pockets of global citizens, data has become a new type of raw resource.

And when I say global, I mean it. In 2010 the United Nations reported that there are far more people in the world that have access to a cell phone than to a toilet.

With this connectivity, most in society have simply accepted that one of the costs to being connected is sharing data. Sharing it with a carrier and sharing it with the company who wrote the phone’s software. Apple or Google probably know right now where you are, who you’re with, and if you use Siri or Google Assistant, they are primed to be listening to what you might be saying right now. No, I mean literally, right now.

The phone thing probably doesn’t surprise you. But what about the road you used to get here today? When governments build or repave a road, there is a good likelihood they are embedding sensors into it. Why? Well one reason is to save the environment and money in northern climates. How? In the winter rather than just lay down salt or chemicals on every mile of road, smart sensors can pinpoint where ice melt is needed, and reduce the application of costly chemicals.

Sensors are also used to determine the amount of traffic on the road, when to change signal lights, collect tolls, and check for wear and tear.

Add to this the data generated by cars on the road – digital radio, GPS, increasingly autonomous driving – and the data begins to add up.

In fact, by one estimate in a few years each mile of highway in the U.S. will generate a gigabyte of data an hour. As there are 3.5 million miles of highways in the U.S., that would be 3.3 petabytes of data per hour, or 28 exabytes per year.

Just in case you are wondering, five exabytes is enough to hold all words ever spoken by humans from pre-history to about the year 1995. Now imagine over 5 times that a year, just on asphalt.
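For anyone who wants to check that arithmetic, here is a minimal Python sketch. The mileage and the gigabyte-per-mile-per-hour rate come from the estimate quoted above; the binary unit conversions (1,024 gigabytes to a terabyte, and so on) and the 365-day year are my assumptions, chosen because they reproduce the figures in the talk.

```python
# Figures quoted in the talk.
MILES_OF_HIGHWAY = 3_500_000     # miles of highway in the U.S.
GB_PER_MILE_PER_HOUR = 1         # estimated data generated per mile, per hour
ALL_WORDS_EVER_SPOKEN_EB = 5     # exabytes, the comparison used in the talk

# Assumptions (not stated in the talk): binary prefixes and a 365-day year.
GB_PER_PB = 1024 ** 2
GB_PER_EB = 1024 ** 3
HOURS_PER_YEAR = 24 * 365

gb_per_hour = MILES_OF_HIGHWAY * GB_PER_MILE_PER_HOUR
pb_per_hour = gb_per_hour / GB_PER_PB
eb_per_year = gb_per_hour * HOURS_PER_YEAR / GB_PER_EB
times_all_words = eb_per_year / ALL_WORDS_EVER_SPOKEN_EB

print(f"{pb_per_hour:.1f} petabytes per hour")
print(f"{eb_per_year:.1f} exabytes per year")
print(f"{times_all_words:.1f} times all words ever spoken, every year")
```

With decimal units (1,000 gigabytes to a terabyte) the totals come out slightly higher, at 3.5 petabytes per hour and a little over 30 exabytes per year, so the point stands either way.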

Now that may seem overwhelming, but at the data layer there is a lot of need, and space for libraries to participate. The questions to ask and develop answers to are familiar. Who has access to that data? How is that data stored and how do you find anything in that exabyte haystack? How do we make people aware of the data they may be sharing? How do we advocate for effective regulation to protect citizens?

I argue that public libraries should be stewards of public data. Libraries have a VERY long history of data stewardship that includes respect for privacy and seeking equitable access to information. If we are going to allow our governments and our businesses to harvest data then we need to ensure our communities have a strong say in how that happens and trust in those that make the decisions. Right now, libraries have a stronger level of trust than Apple, Google, Facebook, and most elected governments[2].

The accumulation of data in and of itself is not particularly alarming. As libraries have shown over and over again, having a bunch of stuff means nothing if you don’t have systems to find it and use it. This takes us to our second layer of concern in A.I.: algorithms.

Companies and governments alike are using massive computing power to sort through data, much of it identifiable to a single individual, and then these folks make some pretty astounding decisions. Decisions like which ad to show you, what credit limit to set on your credit card, what news you see, and even what health care you receive. In our most liberal democracies software is used to influence elections and to decide who gets interviewed for jobs.

Charles Duhigg, author of “The Power of Habit[3],” tells the story of an angry father who storms into a department store to confront the store manager. It seems that the store had been sending his 16-year-old daughter a huge number of coupons for pregnancy related items: diapers, baby lotion and such. The father asks the manager if the store is trying to encourage the girl to get pregnant? The manager apologizes to the man and assures him the store will stop immediately. A few days later the manager calls the father, only to find that the daughter was indeed pregnant, and the store knew it before she told her father.

What’s remarkable is that the store knew about the pregnancy without the girl ever telling a soul. The store had determined her condition from looking at what products she was buying, activity on a store credit card, and in crunching through huge amounts of data. If we updated this story from a few years ago we could add her search history and online shopping habits, even her shopping at other physical stores. It is now common practice to use online tracking, Wi-Fi connection history, and unique data identifiers to merge data across a person’s entire life and feed them into software algorithms that dictate the information and opportunities they are presented with.

In her book “Weapons of Math Destruction[4],” Cathy O’Neil documents story after story of data mining and algorithms that have massive effects on people’s lives, even when they show clear biases and faults. She describes investment algorithms that not only missed the coming financial crisis of 2008, but actually contributed to it. Models that increased student debt and prison time for minorities, and blackballed those with mental health challenges from jobs.

The recurring theme in her work is that these systems are normally put in place with the best of intentions.

And here we see the key issue in the use of software to crunch massive data to make decisions on commerce, health care, credit, even jail sentences. That issue lies in the assumptions that those who use the software make. Often very dubious, and downright dangerous assumptions. Assumptions such as that algorithms are objective, and that data collection is somehow a neutral act. Or even that everything can be represented in a quantitative way – including, by the way, culture[5] and the benefit a person brings to society[6].

What role is there for librarians, curators, and academics here? The answer on the surface is about the same as in our discussion of data. Education, awareness, a voice in regulation. However, we must be very aware of the nature of our voice. 

For too long librarians saw ourselves as neutral actors. We collected, described, and provided materials believing that these acts were either without bias, or that those biases were controlled.

In collecting we took it all…except for works that were self-published, or from sources we deemed of low quality. In cataloging we relied on literary warrant and the language of the community – often ignoring that we only saw the dominant voices of that community. Our services were for all – during our open hours for those who could travel to our buildings.

We as a profession are now waking up to the fact that we are a product of our cultures – good and bad. We understand that the choices we make in everything from collections to programs are just that – choices. Our choices may be guided by best practice, or even enforced by law, but ultimately, they are human choices in a material world where resource decisions must be made.

So as a library we are not asking to be neutral arbiters of data collection and uses. We are seeking to improve society through data and algorithms – that means we have a point of view. We have a definition of what improve means.

However, the bias we bring, or more precisely the principle we bring, to the Googles and Facebooks of the world is that a strong voice advocating for transparency, privacy, the common good, and a durable memory is important.

We recognize that bias exists even if we can’t always identify it, and so we require diversity and inclusive voices in our work. In this act we are not simply advocates, we are activists. A missionary corps of professionals equipping our communities to fight for their interests.

And this brings us to our last layer. The layer that most purists would say is true artificial intelligence development. The use of software techniques to enable machine learning, and especially the more specific deep learning.

That is, software that allows the creation of algorithms and procedures without human intervention. With techniques like neural nets, Bayesian predictors, Markov models, and deep adversarial networks, software sorts through piles and piles of data seeking patterns and predictive power.

An example of a machine learning system in action would be feeding a system a number of prepared examples, say hundreds of MRI scans that are coded for signs of breast cancer. The software builds models over and over again until it can reproduce the results without the prepared examples. The trained system is then set upon vast piles of data using its new internally developed models.

With the wide availability of massive data, newer deep learning techniques do away with the coding, and go straight to iterative learning. Where machine learning used hundreds of coded examples, deep learning sets software free on millions and millions of examples with no coded examples – potentially improving the results and eliminating the labor-intensive teaching phase.

When this works well, it can be more accurate than humans doing the same tasks. Billions of operations per second finding pixel by pixel details humans could never see. And they can do it millions and billions of times never tiring, never getting distracted.

In these A.I. systems there are two issues that librarians need to respond to. The first is that these machine-generated algorithms are only as good as the data they are fed. MRIs are one thing, credit risks are quite another. Just as with our human generated algorithms, these systems are very sensitive to the data they work with.

For example, a maker of bathroom fixtures sold an AI-enhanced soap dispenser. The new dispenser reduced waste because it was extremely accurate at knowing if human hands were put under the dispenser or, say, a suitcase at an airport. Extremely accurate, so long as the hands belonged to a white person. The system could not recognize darker skin tones[7]. Why? Was the machine racist? Well, not on its own. It turns out it had been trained on only images of Caucasian hands.

We see example after example of machine learning systems that exhibit the worst of our unconscious biases. Chat bots that can be hijacked by racists through Twitter, job screening software that kicks out non-western names. Image classifiers labeling images of black people as gorillas[8].

However, bad data ruining a system is nothing new. If you’ve had about 10 seconds of work migrating integrated library systems, you know that all too well.

The real issue here is that the models developed through deep learning are impenetrable. That MRI example looking for breast cancer? The programmers can tell you if the system detected cancer, even the confidence the software has in its prediction. The programmers can’t tell you how it arrived at that decision. That’s a problem. All of those weapons of math destruction Cathy O’Neil described can be audited. We can pick apart the results and look for biases and error. In deep learning, everything works until, well, an airplane crashes to the ground or an autonomous car goes off the road.

And so what are we to do? This is tricky. There can be no doubt that data analytics, algorithms taking advantage of massive data, and A.I. have provided librarians and society great advantages. Look no further than how Google has become one of a librarian’s greatest tools because it provides not only the ability to search through trillions of web pages in milliseconds, but often serves as a digital document delivery service undreamed of 25 years ago when I was working on AskERIC.

And yet, we still need that natural intelligence my boss, Mike Eisenberg, talked about.

Our communities, and our society, needs a voice to ensure the data being used is representative of all of a community, not just the dominant voice, or the most monetizable. Our communities need support, understanding, and organizing to ensure that the true societal costs of A.I. are evaluated, not simply the benefits.

That may sound like our job is to be the critic or even the luddite, holding back progress. But that’s not what we need. Librarians need to become well versed in these technologies, and participate in their development, not simply dismiss them or hamper them. We must not only demonstrate flaws where they exist but be ready to offer up solutions. Solutions grounded in our values and in the communities we serve.

We need to know the difference between facial identification systems, and facial identification systems that are used to track refugees. We need to know the difference between systems that filter through terabytes of data, and systems that create filter bubbles that reinforce prejudice and extremism.

And today is a great first step to honoring that responsibility.

Thank you, and I look forward to the conversations to come.


[1]https://www.technologyreview.com/f/612293/mit-has-just-announced-a-1-billion-plan-to-create-a-new-college-for-ai/

[2]https://www.youtube.com/watch?v=Tvt-lHZBUwU

[3]http://www.worldcat.org/oclc/881631924

[4]http://www.worldcat.org/oclc/1039545320

[5]This article certainly doesn’t claim that all of cultural heritage can be represented quantitatively. Rather I include the citation because it is a good introduction to the use of quantitative analysis of some cultural material and because it includes the very cool term Culturomics, “Culturomics is the application of high-throughput data collection and analysis to the study of human culture.” https://www.librarian.net/wp-content/uploads/science-googlelabs.pdf

[6]https://www.businessinsider.com/china-social-credit-system-punishments-and-rewards-explained-2018-4

[7]https://www.iflscience.com/technology/this-racist-soap-dispenser-reveals-why-diversity-in-tech-is-muchneeded/

[8]https://www.theguardian.com/technology/2015/jul/01/google-sorry-racist-auto-tag-photo-app

Bibliopocalypse

Dürer, Albrecht. Die Offenbarung des Johannes: 4. Die vier apokalyptischen Reiter, 1497-1498.

David and Daniel Gonçalves of the Bibliotecas são Comunidades blog asked me to write a post for their blog:

The theme of the text is to imagine a world without libraries. The aim is to demonstrate the importance of libraries and how much we depend on them.

As you can see from the text I wrote, this was fun and exciting for me. It also has me thinking about imaginative advocacy. The use of stories, drawings, and other creative works to advocate for libraries, but also more generally.

I know this is a pretty rich and well developed area. I just started thinking about how I could be a part.

In any case, let me know what you think (the text is in English): Bibliopocalypse

The Library as a Movement

A conversation between Marie Østergaard, Library Director of Aarhus Public Libraries in Denmark, and R. David Lankes, Director of the University of South Carolina’s School of Library and Information Science, on the idea that the library is a movement of community members, librarians, politicians, partners and more.

If you would rather just listen, here’s an MP3 version.

Audio only of the conversation

Reception at ALA

Davis College

If you are attending this year’s ALA Annual Conference in D.C. or are in the area please join the University of South Carolina School of Library and Information Science for a reception celebrating great librarianship. Great librarianship exemplified by our alumni, faculty, students, and staff.

This year, we will also be celebrating the life of the great librarian and 2019 Margaret E. Monroe Library Adult Services Award winner Nicolette Sosulski, who passed away this year.

So, if you are an alum, South Carolina librarian, friend of Nicolette, or just want to share some great company, please join us:

Friday June 21st. 6-8pm

The Loft at 600F 4th Floor (Retreat Room)

600 F St NW

Washington, DC  20004

Thank You for Making a Difference

This past week I spent some time in social media and here asking folks to give blood. This was the 6th year in a row that we’ve been part of an Annual Lankes Family Blood Drive in Central New York. We began (though my amazing wife Anna Maria deserves all the credit) the event after my first stem cell/bone marrow transplant. We wanted to use our experience for something positive. People giving blood saved my life and we wanted to give back.

Even though we’ve moved to South Carolina, amazing volunteers like Michele McIntyre and Blythe Bennet and the congregation of the Holy Cross Church have kept it going.

We received the following email from Katie Stepanian, our amazing contact with the American Red Cross, and it shows just how many people you can help with your donation.

Thank you for all of your time and efforts with the blood drive this year at Holy Cross Church. Without your personal asks, your volunteer time before and after the drive,  and the support from Holy Cross Church, we would not have been able to meet our goal for this drive and the demand this month. 

From where I sit the event went perfectly, though I would love any feedback you have for me. Below are the stats:

  • Goal- 53 units
  • 43 scheduled
  • 56 registered
  • 8 deferrals
  • 2 turnaways (could have been self deferrals or walkins who could not wait)
  • 43 whole blood donations
  • 5 power red donations
  • 53 units collected (100% to goal!)

There were 7 first time donors!  I will put on my calendar to pull the list of presenting donors for you once it comes through which will be next week. 

I can’t thank you each enough for continuing to help the Red Cross connect with donors in the community. David’s drive is one of the highest producing drives in my territory.  Year after year it yields a significant contribution, and with the addition of these 53 units now totals 374 donations! This brings your total in potential lives saved to 1,122!  WOOO HOOO!!!  Most importantly, you are a reminder to the community about the importance of taking time to donate blood and the why behind our mission.

1,122 people benefited from your generosity. Thank you thank you thank you.

Please give blood if you can: American Red Cross.

How you know you need a blood transfusion in the days and weeks after a bone marrow transplant

Here is how you know you need a blood transfusion in the days and weeks after a bone marrow transplant.

It will start at 3am when the cocktail of fluids and supplements from the previous day’s treatment will wake you up with a strong need to pee. However, no matter the urgency, you can’t just get up. If you don’t slowly sit up, wait, and then flex the muscles in your arms and legs first, you will stand up with too little blood pressure to push oxygen to your brain and you will fall back into bed (if you are lucky – if you are not lucky it will be the floor). Your heart will not provide enough force to push blood up against gravity and so it will pool in your veins, waiting for your major muscle groups to provide pumping action.

Once you get to the bathroom, be sure to either sit down to pee or place a steadying hand on the wall so you don’t sway or make a mess. There is a good chance the effort of the 10 foot walk will also wind you.

Assuming all that goes well, you will need to wake at 7am for the daily drive into the clinic. This will mean you need to fight to wake up. I don’t mean dealing with being groggy, or wanting to go back to sleep. I mean feeling like you are at the bottom of a 30 foot well, and you have to climb up and will your eyes to open at the top.

Once you’re up, remember your nighttime routine to sit up slowly. Next, as you dress don’t forget the compression socks. It turns out that the cells and proteins in your blood determine the amount of fluid that stays in your blood vessels, and how much is pushed out to soft tissues. Your proteins are out of whack, so your body pushes fluid into your ankles, swelling them, and painfully engorging your muscles. The socks provide a temporary reprieve squeezing the fluid around.

Now that you’ve made it to the clinic you have a very important decision to make. You enter the building on the ground floor. There are two ways to get to the clinic one floor above. You can take the elevator (the blessed blessed elevator), or you can take the staircase in the atrium. The physical therapist who deserves sainthood for the evil glares she endures with a smile from you, has made it very clear that if you don’t do some exercise, you will have real trouble in walking. You will hear her voice in your head sweetly telling you that stairs are the best exercise you can get right now. You will also see in the face of your beloved caretaker, that she is hearing that voice too.

The stairs are split in two by a merciful landing. The first half is easy…you will only need to rest here for a minute or two. However, and this is very important, at the top of the second half, to your right is a bench. Do not stop until you are sitting on that bench. It is very important, because at the top of the stairs you will be suffocating, and don’t want to fall down the stairs. Suffocating is not an exaggeration. The deep gasps for air, the empty feeling in your lungs, and the panic you are feeling is real. Without enough red blood cells to take oxygen molecules to your brain and body, you can fill your lungs as many times as you like, but it will not make a difference. You might as well be drowning.

But, it will pass. You will stand, you will walk, and you will make it the 100 feet or so to the treatment area. The nurses (the blessed blessed nurses) will draw vials of blood from an external central line that leads from outside of your body, through your chest, up to your neck, and then down to a point just outside the heart. Then you start to hope for less than 8. It is not always a hard rule, but in your mind you are hoping the hemoglobin count is 7.9 or lower. 8 is the threshold for a transfusion of red blood cells. These are the cells you need to breathe. These are the cells you need to keep from fainting. These are the cells that will keep you awake for more than three hours at a time.

Without transfusions of red cells and platelets in the days after a bone marrow transplant, you die. Without these transfusions in the weeks after the transplant, when your new marrow is growing, you may not die, but you won’t enjoy living.

Today, from 12-6 at the Holy Cross Church at 4112 East Genesee Street in Syracuse there is a blood drive. The blood you give today may help a bone marrow patient like me. It may help several infants in hospitals around the area. It may save the life of a patient in surgery. No matter who it helps, it will mean that a father or mother or child or grandparent, a cancer patient, a hemophiliac, a friend or a lover will live. Please consider giving a gift you hopefully never have to think about to a person who can think of nothing else.

Thank you.

Live in Central New York? I Need Your Blood

The 6th Annual Lankes Family Blood Drive is on June 4 from 12-6 at Holy Cross Church in Syracuse, NY. Ready access to blood has saved my life through my treatments for cancer. One pint of blood can save the lives of several people. Summer is also a high need time for blood donations.

There are spots still available for donors. The Syracuse community has been so supportive of me and my family. I am asking you to extend that love to cancer patients across the area.

Please call 1-800-RED-CROSS or visit RedCrossBlood.org and enter: LANKES to schedule an appointment. Also, there is a good chance you can get a $5 Amazon gift card!

Librarianship in an Era of Big Data: The Vital Human Touch

“Librarianship in an Era of Big Data: The Vital Human Touch.” Conference of European National Librarians. Mo i Rana, Norway. (via video conference)

Speech Text: Read Speaker Script
Abstract: In a world of AI and Big Data the values, skills, and mission of librarians are increasingly vital. How do we prepare our professionals to guide and support communities across the EU? How do we ensure the smart citizen is at the center, and in control of the smart city? It is crucial that librarians advocate for issues of privacy and the common good in the midst of a growing marketplace that transforms users into products. How can librarians increase their value, work with technology giants to shape services, and in the end help our communities make smarter decisions and community members find meaning in their lives?

[This is the script I used for my talk. I’ve also taken the opportunity to add some footnotes and links.]

First, I must begin with an apology to Cecile and the organizers. For weeks they have been asking for slides. I have struggled to put into a concrete form what I want to say to such an important and, frankly, intimidating crowd. I have also spent a year wrestling with cancer and a bone marrow transplant. I don’t say this to get sympathy (maybe a little), but I have found that the experience has given me a new perspective on things that I am still trying to integrate. At once, I am grateful for just about everything, but tend to be less patient with things I feel need changing. And to be clear, as I look out at my home country and across the Atlantic, I see some need for change.

This crystallized for me when, preparing for the talk, I kept coming back to the phrase “memory organizations[1].” If indeed Santayana was right that “Those who cannot remember the past are condemned to repeat it[2],” then what obligations does that place upon institutions that are charged as memory keepers of nations? I would argue that as stewards of cultural heritage, we are also stewards of society. I would also argue that we must stop serving communities, and start building them.

Our communities, our societies, our cultures are too important to sit on the sidelines and simply observe or collect their output. We must recognize, without the haze of nostalgia, that we are actors in this world and accept a responsibility to work directly with communities of all types to shape a better tomorrow. 

Now this could, and should, take many forms. Confronting growing economic disparities, alerting the world to the dangers of xenophobia mixed with nationalism, or confronting the realities of climate crisis. I could talk about the continuous marginalization of whole segments of the population because of race, or class, or sexual preference. Marginalization by society, and indeed, all too often by librarians and the libraries they manage.

Today, however, I would like to use just one societal issue to support a call for national libraries to directly build communities. This incredibly consequential issue often goes unnoticed. It is the dangerous aspects of an increased societal reliance on data-driven algorithms generated by artificial intelligence and machine learning methods. 

Before I begin, let me throw out a few caveats and clarifications. The use of data, when appropriately gathered and analyzed, is incredibly powerful. Indeed, it underlies most of science. The rise of artificial intelligence, machine learning, and indeed Big Data has unquestionably brought massive benefits to many disciplines. The ability to search through trillions of pages in milliseconds, search across massive numbers of images, and the ability to automate complex processes have directly benefited librarians. The issue, as I see it, is when we believe that data gathering, analysis, and encoding into algorithms are somehow neutral acts without social costs[3].

In terms of clarifications; I will use the term community a lot. That is often seen as synonymous with towns, or citizens of a city. I use the term more broadly. When I say communities, I mean a group of people joined by some known variable and that share a means of allocating limited resources. A town is indeed a community sharing a common location, and a system of governance that allocates land, taxes, and other resources. A university is a community of scholars, staff, students, and administrators. A community can be a law firm, or a hospital, or a national library.

The other caveat is on the topic of national libraries. I have done some work with national libraries, but much of my relevant experience is working with state libraries here in the US. There is an old joke that once you know one state library, you know…one state library. I don’t pretend to be an expert on European National Libraries. However, from what I’ve seen, this is true of your institutions as well.

Some of you are active in networks with public and academic libraries. In some countries, there are multiple national library agencies. Some actively seek to support business communities[4], others focus on scholarly research. Bottom line – no one set of issues or models will reflect your great variety.

That said, I am reminded of the saying, “every idea is a good idea in libraries, just not in my library.” It is very easy to focus on what differentiates us now and believe that prevents collective action in the future. I hope I can successfully persuade you otherwise.

So, with the caveats and clarifications out of the way, my purpose today is to recruit all of you to build a network of proactive librarians around Europe. I am calling on you to directly support, train, and empower librarians from those working in the most rural public library to those in the most prestigious university. What’s more, I am asking you to engage in community building to shape a better future. Why? Well, let’s start with a story.

Charles Duhigg, author of “The Power of Habit[5],” tells the story of an angry father who storms into a department store to confront the store manager. It seems that the store had been sending his 16-year-old daughter a huge number of coupons for pregnancy related items: diapers, baby lotion and such. The father asks the manager if the store is trying to encourage the girl to get pregnant? The manager apologizes to the man and assures him the store will stop immediately. A few days later the manager calls the father, only to find that the daughter was indeed pregnant, and the store knew it before she told her father.

What’s remarkable is that the store knew about the pregnancy without the girl ever telling a soul. The store had determined her condition by looking at what products she was buying, tracking activity on a store credit card, and crunching through huge amounts of data. If we updated this story from a few years ago we could add her search history and online shopping, even her shopping at other physical stores. It is now common practice to use online tracking, wifi connection history, and unique data identifiers to merge data across a person’s entire life.

I am hardly telling you anything new here. Facebook is only the latest business to dominate the headlines with privacy breaches, and hidden data gathering. Most citizens of the EU and the US now live two lives: their own, and one created, often without their knowledge, from the digital debris created through our devices. Add to this increased requirements by governments and businesses alike to be online – to apply for a job, to vote, to receive health care, to listen to music – and we see a world that is moving faster than regulation, and faster than realization by those we seek to serve.

In Toronto, Sidewalk Labs, a subsidiary of Alphabet, Google’s parent company, is working with town planners to redevelop Toronto’s Eastern Waterfront. The story of transforming old industrial areas into gentrified multiuse spaces is nothing new. However, a large part of the controversy in this case comes in the plan to make the new neighborhood a data generator. The plan, according to The Intercept, “includes a centralized identity management system, through which ‘each resident accesses public services’ such as library cards and health care[6].” There has been a large debate over who owns and controls the data generated by that system, and who can profit from it.

Many librarians might look at these examples and claim a sort of ethical high ground. After all, as a profession, we explicitly value privacy. In the US we count it as a core value of the field, and yet we often undermine it. We tell our online patrons that we don’t track their work. And yet their internet provider can indeed track every click they make. Therefore, we are often misleading that patron and giving them a false sense of security. How many libraries set up TOR servers[7] or anonymizing VPN services[8] for our service populations? How often in our licensing of databases or other software do we explicitly forbid the aggregation of user data or the selling of that data? How often do we check on those terms?

Then there is the question of how all of this data is used.

In her book “Weapons of Math Destruction[9],” Cathy O’Neil documents story after story of data mining and algorithms that have massive effects on people’s lives, even when they show clear biases and faults. For example, an algorithm that led to outstanding teachers being fired. How? O’Neil writes about an outstanding teacher who had a proven positive effect on under-performing students – raising their performance and grades significantly. As a reward, the teacher is given a year with honors classes filled with the brightest students in the school. However, the impact a teacher can have on honors students is not nearly as evident as the impact on students needing a lot of help. After all, top students receiving top marks can’t get better than, well, top marks. So the algorithm saw a teacher that was no longer effective in a classroom, and recommended the teacher be fired. Recommended by a piece of software using criteria that were hidden from teachers, and were assumed to be objective.

Algorithms are now used here in the States to determine health care cost and availability; access to credit for home ownership; suitability of a candidate for a job; and even how long a person should be in jail.

Yuval Harari refers to this reliance on collectable data and algorithms as Dataism[10]. It is the result of computing power combined with machine learning and the wide availability of constantly connected devices like our phones. It is the belief that if you gather enough data on a person or situation, you can accurately represent that person or situation and predict an outcome.

It often also comes with some very dubious, and downright dangerous assumptions. Assumptions such as algorithms are objective, and that data collection is somehow a neutral act. Or even, that everything can be represented in a quantitative way – including, by the way, culture[11]. And before I make you wonder what any of this has to do with the work of libraries, or think I’m letting our profession off the hook, I have to say that librarians have suffered from some of the same dubious assumptions.

For too long librarians and library science educators saw ourselves as neutral actors. We collected, described, and provided materials believing that these acts were either without bias, or that those biases were controlled. In collecting we took it all…except for works that were self-published, or from sources we deemed predatory or of low quality. In cataloging we relied on literary warrant and the language of the community – often ignoring that we only saw the dominant narrative and voices. Our services were for all – all who could travel, from 9 to 5, with a researcher’s card.

We as a profession are now waking up to the fact that we are a product of our cultures – good and bad. We understand that the choices we make in everything from classification to exhibits are just that – choices. They may be guided by best practice, or enforced by law, but ultimately, they are human choices in a material world where resource decisions must be made. We can speed up digitization with newer machines, but we still have to pick a starting point. We can expand those we serve on the web, but still must acknowledge that there are people with no broadband or connectivity.

Now it would seem like this may turn to a call to redouble our efforts in neutrality. A call to wipe away the biases in ourselves so that we can confront the cost to society of skewed machine learning efforts. But it is not. In fact, we must embrace that libraries, and the librarians that build and manage them, are biased[12]. What’s more, it is only by seeing libraries as biased that we prove our value in the world of massive scale data.

First, we must realize that it is impossible to be neutral. Putting a book on a shelf or in a vault is a choice. Every day in archives and special collections we make professional determinations of how accessible an item is versus how protected it is. We can seek out many voices and yes, gather data, to make those decisions, but in the end, they are decisions with consequences. Pretending we are neutral doesn’t change the consequences, it only allows us to pretend they are not the result of our action.

Now, I keep calling them biases, but a better word would be principles. Principles are an explicit statement of belief. They should be transparent and, most importantly, able to be assessed. Are we following our principles? 

And make no mistake, principles are not neutral. Seeking to serve all equitably takes effort and resources. Choosing to provide images for fee or free is a choice. Fighting censorship is a decision. If you don’t think so, try balancing it against issues of hate speech and threats to marginalized communities.

It is in our decisions and our transparency in making those decisions that we build trust with our communities. Our scholars, and entrepreneurs, and citizens don’t trust librarians because we are neutral, but because they agree with our principles and see them consistently applied. The days when libraries had the monopoly on access to large collections are well over. Yet libraries in most places in the world are not only in use, but in growing use – public, academic, school, and national libraries alike.

Where library use (not necessarily support, but use) is growing it is because we are seen as accessible, equitable, and trusted. Yes, the collections we hold are valuable. The fact that we hold unique resources that either haven’t been, or can’t be digitized is important. But it is only important if those who seek out these resources trust us to be honest stewards of these resources.

It is our embrace of our humanity – our human touch in an increasingly automated system – that underlies our value. This is not a Luddite’s call against technology, AI, or machine learning. Rather it is a belief that human connection – community – is more important than ever when the face of government and business alike becomes web pages and bots under the banner of austerity or efficiency.

The future of libraries is ultimately not set by which technologies are developed or deployed. It is not in a value that was defined a century ago. It is in our very human ability to build trust with our communities. It is upon that trust that we build support. It is upon that trust that we build use. It is upon that trust that we find and confirm our necessity.

It is with that trust that we must reach out to the computer science community, the online industry, and the governments collecting data and deploying algorithms. We must advocate for a seat at the table and represent the voices of those without a seat. We must use the hard lessons we learned and are still learning in issues of diversity, equity, and inclusion to help guide these technologies. We must be trusted by our communities to speak truth to power and to give those communities power to speak for themselves. We must follow our principles in actively shaping, with our communities, the policies, regulations, and laws that talk about data. 

National libraries must play a large role in civic data stewardship. National libraries must safeguard not just the heritage of cultures, but the privacy and intellectual safety of citizens. You need to be a memory organization in the realization that effective memory is about both remembering and forgetting.

How do I resolve the paradox that I just advocated for a common role in institutions that I also acknowledge are so diverse? If indeed, you each represent unique institutions, does this preclude collective action? Of course not. Because in effect you are what all libraries must become. Every library – public, academic, school – should be shaped to the communities they serve. Then, as librarians, we become the connective tissue that seeks the best of all libraries and shapes those innovations to local needs. Gone are the days when every library looked alike or supported some canon of common services. Gone are the days when best practices extend to all libraries of a given size or type. Throw away the toolkits and instead build a toolbox[13].

We must prepare our librarians, regardless of title, or training, or location to be a missionary force proactively engaging in the well-being of our cultures and communities. We must build national peer networks that rapidly and effectively spread ideas and help librarians effectively shape them when they meet local needs. These networks discard best practice and industrial standardization for conversation, learning, and adaptation. We must connect the best thinkers together regardless of status or institutional boundaries.

How do important national institutions do this?

We must create platforms for continuous engagement of librarians where they can share, learn, teach, mentor and support each other. This may be built upon and with national and regional associations, but the focus is on individuals, not institutions.

We must create a system to formally recognize participants within and beyond this platform. Work with library science programs where they exist, but also extend the recognition beyond formal degrees to continuous learning.

We must recognize Lighthouse Libraries[14] that embody innovation and serve as inspirations, not blueprints, for other libraries.

We must proactively engage this network of change agents to transform libraries, associations, institutions, and ultimately communities globally. Members of our peer networks, our communities of practice, must encounter daily new ideas from across the globe.

Think of a library as a movement, not a place or an institution[15]. It is a movement of people committed to improving society. Librarians, certainly. But also, scholars, politicians, entrepreneurs, programmers, and authors. Discard terms like users that reinforce the idea that our communities are consumers, and our only value is in the utility we provide to a demand. We have members and citizens; neighbors and scholars that all own and shape the library.

Most importantly, this will not happen in one hour of a conference. It will take more engagement, more experimentation, and more investment. That is why I support the PL2030[16] project. Building off of the Public Libraries 2020 project, it is a group of librarians from across the continent seeking to transform public libraries across Europe one librarian at a time. It advocates for libraries and builds connections between elected representatives and library innovators. But it needs help.

PL2030 and the work of its members represent the need for a new vital link between the cultural heritage mission of National Libraries and public libraries. Libraries are transforming from access points, collections, and information providers into community hubs across Europe. From Manchester to Cologne to the amazing Dokk1 in Aarhus to Delft and Tilburg in the Netherlands and Pistoia and Perugia in Italy, public libraries are the places communities come to learn, create and dream together. Here, by the work of innovative librarians, libraries have gone from quiet places of retreat to loud places of engagement. The true collection of a great public library is now the community itself. Blacksmiths and bakers host conversations. Librarians lend out books and musical instruments and recording studios. Rather than bringing the world to the community, these libraries have become loudspeakers broadcasting the community to the world. These public libraries have become the cradle of cultural creation.

As institutions charged in part with preserving and supporting the cultural heritage of a people, you need to preserve and support the work of these institutions. Not simply as a backup or for posterity, but as part of the living and breathing centers of community conversation. In a connected world – connected through technology certainly, but also in trade, in governance, and in preserving the earth itself – there is no more front-line service and library of last resort. We librarians are obligated to serve all, and in your nations now is a network of libraries eager for your partnership.

I thank you for your time, and I look forward to the conversation to come.


[1]Or “Memory Institutions” like the CENL Strategic plan https://www.cenl.org/wp-content/uploads/CENL-Strategy-2018-2022_final-1.pdf

[2]https://en.wikiquote.org/wiki/George_Santayana

[3]I love Chris Bourg’s take on the use of “societal cost” in discussing AI versus ethics.

[4]I’m a big fan of the British Library’s https://www.bl.uk/business-and-ip-centre

[5]http://www.worldcat.org/oclc/881631924

[6]https://theintercept.com/2018/11/13/google-quayside-toronto-smart-city/

[7]https://www.torproject.org/

[8]There are plenty of good articles explaining VPN. Here’s one that actually compares VPNs vs Tor: https://www.cloudwards.net/vpn-vs-proxy-vs-tor/

[9]http://www.worldcat.org/oclc/1039545320

[10]http://www.worldcat.org/oclc/1060991037

[11]This article certainly doesn’t claim that all of cultural heritage can be represented quantitatively. Rather I include the citation because it is a good introduction to the use of quantitative analysis of some cultural material and because it includes the very cool term Culturomics, “Culturomics is the application of high-throughput data collection and analysis to the study of human culture.” https://www.librarian.net/wp-content/uploads/science-googlelabs.pdf

[12]Here’s a good place to start on the discussion of libraries, librarians and neutrality: https://americanlibrariesmagazine.org/2018/06/01/are-libraries-neutral/ In particular check out Emily Drabinski’s take.

[13]Pithy phrase, but in case it is not clear I mean stop sending out assembled, ready-to-implement toolkits and focus on librarians gaining the tools to develop their own programs and/or create local applications of programs customized to their communities.

[14]https://publiclibraries2030.eu/projects/lighthouse-libraries/

[15]Stole this idea from the amazing Marie Østergaard: https://podcasts.apple.com/us/podcast/princh-library-lounge-ep-3-building-global-networks/id1451326347?i=1000437039135

[16]https://publiclibraries2030.eu/

On Twitter, Movers & Shakers, and Library Heroes

[I’ve edited this post to remove a pointer to the tweet that set me off. It was pointed out correctly that the way my thread was a response to a person instead of making the point generally was unfair to the original person who tweeted].

So today I kind of went Twitter crazy and I thought it would be useful to share my thoughts for folks who are not on Twitter. Particularly for those who have left Twitter because it seems to be a place where negativity rules in library land.

It started with a tweet about people who were recognized as Movers & Shakers by Library Journal. There has always been a love-hate relationship with this list. The primary complaint is that there are plenty of folks who innovate and do fantastic work in librarianship who are not recognized on that list.

Here’s my reply to one of those posts (with a little editing for typos):

Know several who are still doing good work. I’ve never understood the need to tear down folks who get recognized in the profession. Just because they get called out doesn’t depreciate the great work everyone else does.

You don’t like the project, fine. The person makes bad professional choices, call them on it (as many have done with me with very legit issues). But we are a profession that is supposed to be about service and community support.

Some of these folks have been my students. Some my colleagues. Some friends, and many I don’t know. Still they are people trying to do good work…like we all are.

I love the question, have they burned out. Not because I hope they have, but because it begs the question how can we as a profession ensure folks don’t.

If we make recognition, no matter how dubious we may feel the source, a factor in burning out, what does that say about us as librarians?

On that list are there folks that seek out recognition? Sure. Are there people who flaunted the award? Sure. There is also a cancer survivor who risked her job to fight for labor rights. A school librarian who every day supports the most vulnerable students.

A professor who lost her home in a hurricane and still everyday advocated for broadband in too often forgotten rural populations.

A librarian in the deep south bringing a community together to have a serious conversation on race.

We talk about libraries transforming and becoming community hubs. As a profession we too are a community. That means we have issues that need to be addressed frankly like race and inclusion.

That only happens if we feel safe in talking. Confront me when I screw up. Many folks have. It hurts, but when I deserve it, it should hurt. But don’t confront someone because someone else pays them respect.

I’ve talked to some of those Movers and Shakers after the award and many feel it’s like a scarlet letter. That makes me sad. We as a profession that proclaims improving society should feel joy for the recognition of others.

The elevation of one should be the elevation of us all. I’m tired of flinching every time I open Twitter. I want it to be a place that I learn from. Even if that is learning from my mistakes. It shouldn’t be a place we run from fearing our backs.

To be clear, Twitter can be a powerful force in librarianship. It has called out the profession on powerful issues like race, hate speech, neutrality, fair labor practices, and vocational awe. It has been a place where serious topics like the philosophy underlying the field have been discussed. It has and can act as a sort of shared LibGuide on important topics. It is also a place where ad hominem attacks happen.

If we seek to facilitate knowledge in our communities – giving voice to a community with equity, we need to practice this among ourselves. This isn’t a call for civility in the sense of suppressing negative views or serious disagreement. It is a call for using platforms to invite fierce and powerful conversation that doesn’t punish people for sharing their ideas.

Then I decided to share some of the librarians and supporters of librarians that have been so vital in my career using the hashtag #MyLibraryHeros (yes, I now realize I misspelled heroes). As I said in a tweet “This list is only a start, and if I’ve misspelled a name or not listed you (yet), know that you matter.”

#MyLibraryHeros Just a few of the folks who I admire and learn from in library land. Most librarians, several “librarians by spirit.” 

Nicolette Sosulski, Todd Marshall, Angela Usha Ramnarine-Rieks, Heather Margaret Highfield, Jessica R. O’Toole, Xiaoou Cheng

#MyLibraryHeros Jocelyn Clark, Amy Edick, Elizabeth Gall, Nancy Lara-Grimaldi, Michael Luther, Kelly Menzel, Andrea Phelps, Jennifer Recht, Sarah Schmidt, and William Zayac, Scott Nicholson, Justin Henke, Meg Backus, Sari Feldman, Gina Millsap

#MyLibraryHeros Joanne Silverstein, Blane Dessy, Keith Stubbs, Linda Johnson, Sandra Horrocks, Jeff Penka, Susan McGlamery, Paula Rumbaugh, Gabrielle Gosselin, Agnes Imecs, Melanie Huggins

#MyLibraryHeros Kathryn Deis, Mary Ellen Davis, Jenny Levine, Anne Craig, Gwen Harris, Mike Eisenberg, Chuck McClure, Ray vonDran, Joe Janes, Eli Neiburger, Jill Hurst-Wahl, Mary Ghikas, George Needham, Joe Ryan, Megan Oakleaf, Blythe Bennet, Buffy Hamilton.

#MyLibraryHeros Marie Radford, Joann Wasik, Pauline Shostack, Holly Sammons, Rivkah Sass, Stephen Bell, Stephen Francoeur, Donna Dinberg, Franceen Gaudet, Karen Schneider, Joan Stahl, John Collins, Linda Arrett, Nancy Morgan, Melanie Gardner, Buff Hirko

#MyLibraryHeros Caleb Tucker-Raymond, Nancy Huling, Jane Janis, Joyce Ray, Bob Martin, Tasha Cooper, Mary Chute, Kathleen Kerns, Mary Fran Floreck, Kate McCaffrey, Lorri Mon, Marie Østergaard, Neil MacInnes, Erna Winters, Erik Boekesteijn, Ton van Vlimmeren, Ilona Kish

#MyLibraryHeros Chris Bourg, Emily Drabinski, Nicole Cooke, Emily Knox, Kathleen de la Peña McCook, Kelvin Watson, Lynn Connaway, Sally Pewhairangi, Jereon de Boer, Jorge do Prado, Anna Maria Tammaro, Sue Kowalski, Jan Holmquist, Andromedia Yelton, Liz McGettigan, Jessamyn West

#MyLibraryHeros Mia Breitkopf, Chelsea Neary, Loranne Nasir, Emma Montgomery, Lauren Britton, Sue Considine, Beck Tench, Cheryl Gould, Kimberly Silk, Wendy Newman, Lane Wilkinson, the entire South Carolina SLIS faculty and staff, Tamara King, Patrice Green, Jack Bryan

#MyLibraryHeros Ron Stafford, Robert Wedgeworth, Stuart Sutton, Courtney Young, Barbara Stripling, Carla Hayden, Peter Bromberg, Scott Walter, Sara Kelly Johns, Dave Tyckoson, Branwen Rhiannon, Cecilia Preston, Cliff Lynch, Dan Barron

#MyLibraryHeros Jim DelRosso, Mary Chute, Brian Dawson, Char Booth, John Chrastka, Patrick Sweeney, Heather Braum, Dustin Fife

#MyLibraryHeros Sarah Houghton, Ned Potter, Megan Oakleaf, Roberto Delgaillo, Rolf Hapel, Sarah Inoue, Bonny Ryan, Vickery Bowles, Miguel Figueroa, James Neal, Sandra Toro, Sandy Hirsch, Micheal Stephens, Andrew Bullen, Linda Smith, Liz Liddy, Kate Marek, Karen Snow, Hassan Zamir

#MyLibraryHeros OK, a partial list at best, but I need to take a shower and get on a phone call. Who are your library heroes?

Thank you all for shaping my view of the world. This is only a partial list. Please take it as an opportunity to share your list, however partial it might be.

Farewell to My Dear Friend Nicolette

Nicolette Sosulski

Today the world lost a great librarian and I lost a great friend. Nicolette Sosulski passed away after a long and brave journey with cancer.

She was an advocate for the profession and a great teacher. She cared deeply about serving those in need. She was also a tireless advocate for librarians with a belief that we could all get better. 

I have no higher praise to give than to say she was my librarian. If I had a hard question or wanted to check a crazy idea she was there. Beyond professional, she was dogged, passionate, and a genius navigating sources and working with people. When I talk about librarians advocating for our communities she is the face I see. 

We began this latest road with cancer together. She was always there checking in and sharing. Even as her options dimmed, she was there cheering me on. She was clear eyed about what was coming and had a strength I cannot fathom. 

Today I miss my friend.

UPDATE: her family has posted this obituary:

On May 15, 2019, Nicolette Marie Warisse Sosulski, loving daughter, sister and mother of two children, Peter and Nicholas, passed away from a year-long battle with lung cancer at the age of 56.

Nicolette was born on March 20, 1963 in Louisville, KY to Nicholas and Doris Warisse. She grew up in St. Martha’s parish and graduated from Sacred Heart Academy. She earned her Bachelor of Arts in English from Georgetown University, Washington, DC and her Master of Library and Information Science from the University of Washington, Seattle, WA. Nicolette found her passion in librarianship. The American Library Association, in their memorial resolution to her said, “She was a motivated information professional: relentless about investigating a research question, had a gift for connecting people to information, had a real knack for identifying areas of need and developing plans, Nicolette was all that a 21st century librarian should be.” Nicolette loved all things Southern which included cooking, literature, and big floppy hats. She had a passion for animals, rescuing over a dozen dogs and cats, either finding them homes elsewhere or taking them in and loving them herself. Nicolette was an award-winning author, a witty and phenomenal online communicator – she took online messaging and managing connections to a whole new level. Nicolette was enthusiastic, mindful, continuously curious, loved and WAY smarter than anyone else in her family.

Nicolette is survived by her father, Nicholas and her mother Doris, her two children Nicholas and Peter, her sisters Jeanine and Michelle, and her cousins, nieces and nephews.

In lieu of flowers, memorial donations can be made to the Portage District Library, 300 Library Lane, Portage, MI 49002.  

A mass celebrating her life will be held at St. Joseph’s Catholic Church, 936 Lake Street, Kalamazoo, MI, 49001 at 1:00 in the afternoon on Thursday, May 16, 2019 with a reception immediately following. The family is also planning a celebration in Louisville, Kentucky that will take place over Fourth of July weekend.