Bias in Algorithms with Swati Gupta

Employers today are eager to harness artificial intelligence (AI) and big data captured by algorithms to speed up the hiring process. But depending on the data used, automated hiring decisions can be very biased.

Our guest is Swati Gupta, a professor and researcher in the School of Industrial and Systems Engineering at Georgia Tech. She’s an expert in all things AI.

https://api.spreaker.com/v2/episodes/23255170/download.mp3

machine learning

artificial intelligence

algorithms

bias

Host

Steve W. McLaughlin

Georgia Institute of Technology

Provost, Georgia Institute of Technology

Professor, Electrical and Computer Engineering

Guest

Swati Gupta

The H. Milton Stewart School of Industrial and Systems Engineering

Assistant Professor

Transcript

[electronic music]

>> We're going to go on an excursion into artificial intelligence.

Steve McLaughlin: Employers today are eager to harness artificial intelligence and big data captured by algorithms for speeding up the hiring process. But depending on the data used, automated hiring decisions can be very biased.

[steam whistle]

[applause]

[marching band music]

I'm Steve McLaughlin, dean of the Georgia Tech College of Engineering, and this is The Uncommon Engineer.

[music]

Man: [archival recording] We’re just absolutely pleased as punch to have you with us. Please say a few words.

[applause]

Steve McLaughlin: Welcome to part two of our AI series on bias and fairness in algorithms. Today, our guest is Professor Swati Gupta, a professor and researcher in the School of Industrial and Systems Engineering here at Georgia Tech. She's an expert on all things AI. Welcome to the program, Swati.

Swati Gupta: Thank you, Steve. I'm happy to be here.

Steve: So, you know, today you can't watch TV or listen to the radio for more than 10 seconds without someone talking about AI and machine learning and all the things that are coming and self-driving cars and all that. So I can't wait to get into the discussion we're going to have about where those algorithms can go wrong. But before we do that, could you say a little bit about your definition of how you see artificial intelligence, and what would you say to someone who just hears the word but really doesn't know what it is?

Swati: So artificial intelligence to me is just a method that finds a pattern in the data or finds a pattern in what we do and then emulates that pattern and makes decisions based on that.

Steve: So one of the things that you mentioned was data. And so data has to be central to AI and making sure that it works. Can you say a little bit more about that?

Swati: So I said data, but I think we can broadly also think about algorithms as a part of AI. And to me, algorithms are an optimization method. It's all in that same bucket. You know, suppose I wanted to read a text and I want to summarize it, then I could have an algorithm that picks out sentences that look very different from each other and create a summary by picking out like 10 sentences. And then there are methods to make sure that the sentences are very different from each other and they accurately capture what's being said. So that process to me is artificial intelligence because it's giving you a summary of an article. So it's an automated method that takes in some data and gives us something from that.
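
As a rough illustration of the summarization idea described above, the Python sketch below greedily picks sentences that look different from the ones already chosen. It is only a sketch under stated assumptions: the function names, the naive sentence splitter, and the use of word-overlap (Jaccard) similarity are choices made here for illustration and are not from the episode. It also covers only the "very different from each other" part, not whether the summary accurately captures what is being said.

```python
import re


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def words(sentence):
    # Lowercased set of words, used as a crude fingerprint of a sentence.
    return set(re.findall(r"[a-z']+", sentence.lower()))


def jaccard(a, b):
    # Word-overlap similarity between 0 (nothing shared) and 1 (identical sets).
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def diverse_summary(text, k=10):
    # Greedy selection: start from the first sentence, then repeatedly add the
    # sentence whose closest already-chosen sentence is least similar to it.
    sentences = split_sentences(text)
    if not sentences:
        return []
    bags = [words(s) for s in sentences]
    chosen = [0]
    while len(chosen) < min(k, len(sentences)):
        best, best_score = None, None
        for i in range(len(sentences)):
            if i in chosen:
                continue
            score = max(jaccard(bags[i], bags[j]) for j in chosen)
            if best_score is None or score < best_score:
                best, best_score = i, score
        chosen.append(best)
    # Return the picked sentences in their original order.
    return [sentences[i] for i in sorted(chosen)]


print(diverse_summary("Cats purr. Cats purr loudly. Dogs bark.", k=2))
# prints ['Cats purr.', 'Dogs bark.']
```

With k=10 this would return the ten-sentence summary mentioned in the conversation; a real system would also need a way to check coverage of the article's content.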

Steve: And, you know, we hear— the things that we hear about artificial intelligence are generally pretty positive, you know, artificial intelligence to help with Netflix and Amazon and all of those choices that we're presented with all the time. Well, but there's no one behind those algorithms, you know, they're all running automatically. And I know I sometimes worry that some of these algorithms can go wrong. Certainly they're giving me wrong product recommendations and movies sometimes. I know that your research is in this general area. Can you say a little bit about your research and the kinds of things that you work on?

Swati: So we've been thinking about— when you say algorithms can go wrong, maybe the definition of “wrong” could have various meanings. It could go wrong because it's giving an errored prediction. It should not have predicted something on a particular example, but it's saying yes when it should have said no. And that's kind of what we expect, and we are trying to minimize those errors.

But sometimes algorithms can go wrong in unintended ways or unexpected ways. And that's been one of the focuses of my research since the last one and a half, two years. And we call these “socially conscious algorithms.” So when they go wrong in the sense of making biased decisions against [indistinct] a group of people, we are investigating that deeper and trying to maybe correct them or understand why did that even happen in the first place.

Steve: Can you— do you have a definition for “bias”? Can you— how do you— what do you see as bias, and then how would you incorporate it in your work?

Swati: I don't have a very precise definition for bias, unfortunately, but there are different aspects that we often think about. And there's a lot of research going around into defining what bias is. What I've realized is that, you know, it means different things in different applications.

But typically we look for something that is historically discriminatory, something that is systemic, and something that seems unfair. Now, if I say that I'm going to charge people who are challenged more for accommodation because that’s just a higher cost for me, it seems unfair morally. If I say that women have breast cancer more frequently than men and my algorithm is sort of predicting that, that's fair because that's just a trend and it's a justifiable trend. So whether a correlation is justified or not justified and whether it should be there or not, like do we want a society progressing in a way that we need that— we want that— correlation or we don't? I think that that might be a sort of a how maybe we define bias.

Steve: And I know that one of the areas that you have worked a lot in is in the area of bias as it relates to hiring practices. Tell us more.

Swati: So I really believe that humans and algorithms are both biased when it comes to hiring, but with algorithms we have an opportunity to fix them. And when a human or a recruiter sees 10,000 applications and they have to take decisions in a day, there is no way they can go through 10,000 applications. And so what they might say is that, you know, “I’m going to have certain thresholds or I'm going to have certain rules in which I filter out these résumés before digging into them deeper.” And that's exactly what an algorithm or an AI system might do as well. It might look at prior data of who was hired and who was not, who performed well in the job and who did not, and create a rule or a mechanism to filter out résumés and maybe select the top 50 or the top 10 résumés to give to a human hiring committee, and that's what a recruiting manager might also do.

So what we're looking into is that if your data was already biased to start with; your company has already made hiring decisions in the past 100 years in a particular way, and you sort of want to break out of that and you want to understand that deeper and you want to move towards maybe having a more diversified hiring portfolio, how should your algorithms change and adapt to that?

I think what we actually want our algorithms to do, or even recruiters to do, is to hire people who have the right skill for the job, right? And how do you define that— “right skills”? So there may not be any right way of measuring that. If you ask somebody, you know, “Do you want somebody with a lot of work experience or with a lot of initiative?” What's the formula? Is it 0.6 times the work experience plus 0.4 times initiative? I don't think any employer can give me that formula, right?

And so what a machine learning algorithm might do is say something like, let's say, “Hire this person,” see how that person did, and then take that into account for hiring the next person. If that person did well, they will hire more people like that first person. If they did not do well, they would hire less people like that first person. So an algorithm can be made to be learning online. It learns from the decisions in the past.
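
The "formula" and the learn-as-you-hire loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the method used in Swati's research: the feature names, the 0-to-1 scoring scale, the step size, and the simple additive update rule are all assumptions made here. The starting weights 0.6 and 0.4 are the ones mentioned in the conversation.

```python
def score(candidate, weights):
    # Weighted sum, e.g. 0.6 * work_experience + 0.4 * initiative.
    return sum(weights[f] * candidate[f] for f in weights)


def update(weights, candidate, performed_well, step=0.1):
    # Online update (assumed rule): after observing how a hire performed,
    # move the weights toward that candidate's features if they did well
    # and away from them if they did not.
    direction = 1.0 if performed_well else -1.0
    return {f: weights[f] + step * direction * candidate[f] for f in weights}


# Hypothetical numbers, with features scored between 0 and 1.
weights = {"work_experience": 0.6, "initiative": 0.4}
first_hire = {"work_experience": 0.5, "initiative": 1.0}

before = score(first_hire, weights)
weights = update(weights, first_hire, performed_well=True)
after = score(first_hire, weights)

# Candidates who resemble a successful hire now score higher than before.
print(round(before, 3), round(after, 3))
```

The same loop shows how a skew can feed on itself: whoever was hired first shapes the weights used to judge everyone who applies later.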

Steve: And so then I think it's pretty clear then how the bias starts to creep in because, like you said, you might just want someone who has skills and is very productive. And it may turn out that the data that you're also collecting has to do with their gender or other ethnicities or other things that may or may not be so relevant at all.

Swati: So there are studies which show that, you know, if you scrape out the name of the person, then even then there's a bias and it can very well figure out which gender group they belong to. There might be a bias in the way the résumé is written. Maybe there's a bias in proposals. In NSF proposals, apparently men use different words that seem more attractive and they're funding the proposals. There are some companies that have started actually going beyond these numerical scores and proxies, and they have a test. So they have people code if that's the thing they're looking for. And whoever can pass that coding exercise is hired.

And so I think that's an interesting way to get to that core quality that you're looking for in the person. So let me give you another example, actually. There were these violin auditions and they used to have lesser number of females and more males who were hired. And then they started having a curtain, and the violinists would perform behind that curtain, and that balanced the gender gap in music schools. So I think it's very interesting.

So I was doing this experiment in my class where I asked my students to hire candidates. And they would see a name and they would see a work experience and initiative and just a score for that, and they would have to hire people. And I was talking to them about bias in the data and how maybe they should think about different group memberships and hire across. And some students said, “OK, but what about the impact on the majority group? What about the impact on the group that have actually excellent résumés? And now you're telling us to look at somebody who does not have such a great résumé and make some leeway for the bias?”

It's an interesting balance that we don't want to discriminate against one group. But in order to fix that, we don't want to create biases for the other group as well. So we have this hiring algorithm that I spoke to you about. And now we are developing a test in which, for the first half, people hire based on the résumés or the data they see, then we give them bias training and then see how the results change. So really bring the mathematics back into human behavior and study if we can effect any changes, so not just stick to the algorithms, but learn both for humans and iterate back and forth.

Steve: So far, we've been talking about not gathering enough data that would maybe cause bias in these algorithms. Is there some kind of analogous description about the algorithms themselves? It's not the data that produces bias, but is it things about the algorithms that can produce bias? Or is it more of a data problem?

Swati: So I've been thinking about what is in the algorithms that can produce bias. But as far as I believe that people don't write malicious algorithms, I really think that it's because of what the algorithm is basing the decisions on. What we are trying to do as engineers or machine learning researchers or AI researchers is to create a view of the world through all of these data and all of these variables, and the algorithm only gets a slice of the variable. So if you don't give the algorithm that, oh, you know, this student went to this school before they applied to college and this student went to this school before they applied for college and they had these resources, but what is the impact of having that? If we cannot translate that to an algorithm, then the algorithm cannot do anything about it. And again, it could be the same problem of learning a trend in the data.

I was at the Simons Institute and there was a workshop on fairness. And one of the sentences I remember, one of the remarks I remember from one of the speakers was that flipping race as a variable does not give you a counterfactual. You have to dig deeper into that to understand what would have happened if that variable was flipped. So you cannot flip gender. So we have this entire theory of causality, and we have this entire theory of causation of variables, but I think we just need to think deeper into what is the process.

Steve: So in the example you just gave—

Swati: I don’t know if that was clear or not.

Steve: No, it was. You talked about flipping gender or flipping race. And I think, again, a lot of people would say— and maybe this is the example he started with— “Hey, let's take race out of this,” or the law might require you to take race out or the law might require you to take gender out. And those things won't eliminate bias. In fact, they might make it worse. Can you give an example around that? Do you have an example?

Swati: So even this hiring thing, right, suppose I took gender, I took race, and I blinded that data out. The first thing is that other variables might already incorporate some of that information. Now if I look at a résumé that says women's soccer team, I probably have a good idea of what the gender was, right? So some other variables might already have that information. And suppose I remove all of that as well. Then, now I'm just seeing numbers for these people. I see test scores, but I know that the test scores can’t be taken at face value. So but then the algorithm would say “I didn't do any discrimination. I took the test scores at face value and I thresholded at 80 percent, and whoever was more than that got in.” But then maybe people who are from a particular community never got points beyond 80 percent. And so it's like even though the algorithm did not intend to create disparate impact, it did create disparate impact because it treated everybody the same, but that's out of the question, right? Like if people have had different experiences, can we treat all of them the same? And that's where I think there's this tension with the law because discrimination law says you should have— you cannot have disparate impact or disparate treatment. So the law says you cannot treat people differently. But if you don't treat them differently, how do you remove that impact?
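
The 80 percent threshold example above can be made concrete with a small Python sketch. The applicant data, the group labels "A" and "B", and the function names are hypothetical, invented here for illustration. The rule itself never looks at group membership, yet the selection rates it produces can differ sharply between groups. Comparing the lowest rate to the highest is one common way to quantify that kind of disparate impact.

```python
def selection_rates(applicants, threshold=80):
    # applicants: list of (group, test_score) pairs. The group label is used
    # only to audit the outcome, never to make the decision.
    totals, selected = {}, {}
    for group, test_score in applicants:
        totals[group] = totals.get(group, 0) + 1
        if test_score > threshold:  # "whoever was more than that got in"
            selected[group] = selected.get(group, 0) + 1
    return {g: selected.get(g, 0) / totals[g] for g in totals}


def impact_ratio(rates):
    # Lowest selection rate divided by the highest; 1.0 means equal rates.
    highest = max(rates.values())
    return min(rates.values()) / highest if highest > 0 else 1.0


# Hypothetical scores: group B's scores cluster just below the cutoff.
applicants = [
    ("A", 92), ("A", 85), ("A", 78), ("A", 88),
    ("B", 79), ("B", 74), ("B", 81), ("B", 70),
]

rates = selection_rates(applicants)
print(rates)  # prints {'A': 0.75, 'B': 0.25}
print(round(impact_ratio(rates), 2))  # prints 0.33
```

Everyone is treated identically by the threshold, but three quarters of group A get through compared with one quarter of group B.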

Steve: Well, I know you have a sizable research group doing work in this area. Can you talk a little bit both about some of the projects that are going on and then a little bit about the students and what it would take for a student to come into your group?

Swati: As I said, my research is very mathematical. So we've been trying to create a mathematical theory around bias and fairness and formalize and analyze a lot of these issues mathematically. So some of the things that my students have been working on, if you have a city and that city has different workloads throughout, then how do you partition that into different zones so that every zone can have a sub-office that has almost equally-balanced load? So that makes a very interesting mathematical problem with multiple objectives that have to be optimized at the same time.

So this is sort of useful for, let's say, the Atlanta Police Department that has different workloads across different areas. And how do you balance the workloads for the police officers without over-policing a certain area?

Then we have another project where we're thinking about where to place an emergency room in a city. And this was motivated by a room that was closing in the Alameda County. And a lot of the population around that [indistinct] would get affected. And so we found in the literature that there were like 25 different metrics of fairness that people have considered. They've already tried to define what is a fair placement. And we have some analysis and we're trying to help the city of Berkeley in identifying which hospital and emergency room can be opened in.

In fact, this project is— it's so simple to explain and comprehend that I spoke about this to fifth graders in Forsyth County and these fifth graders had all the answers. So if you're interested in this area, you don't need to be very experienced or trained in mathematics. I think having the passion for this area is enough.

Steve: So you're saying that you’ve mathematically worked on this problem about where to place an emergency room, and you had fifth graders also kind of work on that problem.

Swati: So all the solutions that we could think about in one year, these fifth graders thought about the solutions in half an hour and were like, “Oh, but why don't you have a stay there for somebody who's traveling from farther off,” or “Yes, definitely. This fairness metric is not good because then this particular population will not get access to the hospital,” or “Of course, like something which equally has equal distance from everybody is better than minimizing the total distance of everybody from the populations.” So things that we think about mathematical definitions of fairness, I thought it was very clear to these fifth graders what is it that we call fair. And they had very emphatic opinions about— “No, no, of course you should do this.”
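
The contrast the fifth graders drew, between minimizing everybody's total distance and keeping the distance more equal for everybody, can be seen in a toy Python example. Everything here is assumed for illustration: residents and candidate sites are points along a single road, and "more equal" is approximated by minimizing the farthest resident's distance, which is only one of the many fairness metrics mentioned in the conversation.

```python
def total_distance(site, homes):
    # Sum of every resident's distance to the emergency room.
    return sum(abs(site - h) for h in homes)


def max_distance(site, homes):
    # Distance travelled by the resident who lives farthest away.
    return max(abs(site - h) for h in homes)


def best_site(candidates, homes, objective):
    # Pick the candidate location that minimizes the chosen objective.
    return min(candidates, key=lambda s: objective(s, homes))


# Hypothetical positions (in miles along one road).
homes = [0, 1, 2, 3, 20]   # four residents close together, one far away
candidates = [2, 10]       # two possible emergency room locations

print(best_site(candidates, homes, total_distance))  # prints 2
print(best_site(candidates, homes, max_distance))    # prints 10
```

Minimizing the total favors the cluster and leaves one resident 18 miles away, while minimizing the farthest trip moves the site toward the middle at a higher total cost. Which of the two counts as fair is exactly the kind of question the project is asking.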

Steve: I think there's a TV show, I'm Not Smarter Than a 5th Grader or something like that. But I mean, I think what's really reassuring about that is that the mathematics and the algorithms that you would develop do mirror the kinds of things that people think about and how they act because as I do mathematics in a completely different way. And I'm really hopeful that it is possible to take the more social aspects and incorporate them into the models. And so do I have that right?

Swati: So that's exactly what we're trying to get at. So we have this hiring algorithm that I spoke to you about. And now we are developing a test in which, for the first half, people hire based on the résumés or the data they see, then we give them bias training and then see how their results change— so really bring the mathematics back into human behavior and study if we can effect any changes, so not just stick to the algorithms, but learn both for humans and iterate back and forth.

Steve: What would it take to be a student to come into your lab? What kind of background should they have? What kind of advice would you give a student if they wanted to study this kind of a thing?

Swati: The first and foremost is to have passion for this area. The second advice would be to have some skills to work with data sets and to be able to play with the data sets to look for things that have gone unnoticed or that even an algorithm has imbibed, but we don't know of. And the third skill to have would be to think about mathematical proofs because we want to develop a rigorous theory that can look irrespective of the situation thrown at the algorithm.

Steve: You know, different people find their way to engineering in different ways. Some people like to tinker. Some people had a parent. Did you find your way first to mathematics and then engineering or how did you—

Swati: So I actually hated mathematics when I was five, and then when I was in class for physics, teacher had competitions in the class where everybody had to solve 20 sums, and the fastest person would be the winner. And that's what got me interested in mathematics because I would then practice them to beat the annoying boys in my class. And so then I started getting excited about math, and I went to IIT, which is one of the top institutes in India for engineering. And while I was studying computer science, I actually I'm an artist myself, so I paint. And at some point I was like, you know, I'm going to paint. But then we were taught lambda calculus, and I was just astounded at the simplicity of that, so I was like art can be a hobby. At some point, I did art experiments on the street markets in India and asked people to paint unconventional roles that they could see women playing and created a conversation around bias. And so now I'm really excited that somehow like that art experiment that I did and the mathematics is all coming together and I'm able to do research on socially conscious algorithms.

Steve: So Swati, one of the things that we always ask all of our guests on The Uncommon Engineer— what makes you an uncommon engineer?

Swati: I've been trained all my life believing data and looking at numbers and taking them at face value, and now I've stopped doing that. So I think that's what makes me an uncommon engineer.

Steve: Well, this has been an absolutely fascinating conversation. I think there's so much coming in our future around machine learning and artificial intelligence, and I'm really grateful that we have someone here— hopefully more than just one— studying the role or the potential for bias because these algorithms are going to become increasingly powerful and important. And so we're really fortunate to have you here at Georgia Tech. Thank you so much for coming on The Uncommon Engineer, and thank you for all the great work that you're doing for so many people.

Swati: Thank you, Steve. And thanks to all of my collaborators and students and actually this entire field that's coming up and exciting with a lot of different areas and fields having that conversation.

Steve: Thank you very much. Tune in next month for part three of our AI series on putting AI to work in the business world. That's all for now for The Uncommon Engineer. I'm Steve McLaughlin and thanks for listening.