Fastai Course Chapter 3 Q&A on WSL2 | by David Littlefield



An answer key for the questionnaire at the end of the chapter

Image by Joel Filipe

The 3rd chapter of the textbook provides an overview of ethical issues that exist in the field of artificial intelligence. It provides cautionary tales, unintended consequences, and ethical considerations. It also covers biases that cause ethical issues and some tools that can help address them.

We’ve spent many weeks writing the questionnaires. And the reason for that, is because we tried to think about what we wanted you to take away from each chapter. So if you read the questionnaire first, you can find out which things we think you should know before you move on, so please make sure to do the questionnaire before you move onto the next chapter.

— Jeremy Howard,

1. Does ethics provide a list of “right answers”?

Ethics doesn’t provide a list of “right answers” for solving moral problems. It does provide a set of principles that can help eliminate confusion, clarify an issue, and identify some clear choices. It can also help identify several of the “right answers” but each individual must still come to their own conclusion.

2. How can working with people of different backgrounds help when considering ethical questions?

Research suggests that a diverse group can outperform a group composed of the most talented individuals at problem solving. People with different experiences and identities add different perspectives, and may have privileged access to insights and understandings that are relevant to the ethical issue. Diversity can also help in forming policies and developing research and innovations that better cater to people’s needs.

3. What was the role of IBM in Nazi Germany? Why did the company participate as it did? Why did the workers participate?

IBM supplied the Nazis with data tabulation products that were used to track the extermination of Jews and other groups on a massive scale. It created a punch card system that categorized the way each person was killed, which group they were assigned to, and the logistical information used to track them through the vast Holocaust system. It also provided regular training and maintenance onsite at concentration camps such as printing off punch cards, configuring machines, and repairing machines.

The company’s President Thomas Watson has been accused of cooperating with the Nazis for the sake of profit. He received the special “Service to the Reich” medal in 1937. He also personally approved the release of the IBM alphabetizing machines to help organize the deportation of Polish Jews.

The workers were working-class men who were trying to live ordinary lives, care for their families, and do well at their jobs. It has been speculated that these project managers, engineers, technicians, and marketers were simply following orders. It has also been speculated that there was a mixture of motives such as the group dynamics of conformity, deference to authority, role adaptation, and the altering of moral norms to justify their actions.

4. What was the role of the first person jailed in the Volkswagen diesel scandal?

James Liang was an engineer at Volkswagen who was sentenced to 40 months in prison and ordered to pay a $200,000 fine for his role in the Volkswagen diesel scandal. He knowingly designed the software that detected when the vehicles were being tested and temporarily changed the engine behavior to lower emissions during the test. In normal driving, the vehicles emitted up to 40 times the permitted level of nitrogen oxides.

5. What was the problem with a database of suspected gang members maintained by California law enforcement officials?

In 2016, a state audit revealed that the CalGang database contained many errors that diminished its crime-fighting value. It found flaws in the system such as little oversight, lack of transparency, policy violations, and trouble justifying why some people were added to the system. It also had no process in place to correct mistakes or remove people after they had been added.

6. Why did YouTube’s recommendation algorithm recommend videos of partially clothed children to pedophiles, even though no employee at Google had programmed this feature?

YouTube’s recommendation system is designed to increase the amount of time people spend on YouTube. It creates feedback loops that curate video recommendations for people based on their watch history and what similar people have watched to keep them watching. It also keeps optimizing that one metric indiscriminately, so it produced very popular playlists of these videos even though nobody programmed it to.

7. What are the problems with the centrality of metrics?

The textbook explored some of the unexpected consequences of YouTube’s decision to optimize their recommendation system to maximize watch time. It incentivized content creators to produce longer and more frequent videos which focused on entertainment rather than quality or diversity of content. It also led to all kinds of extreme situations where people would search for, find, and exploit these situations and feedback loops for their advantage.

8. Why did Meetup not include gender in its recommendation system for tech meetups?

Meetup didn’t include gender in its recommendation system because it felt it was better to recommend meetups to its users regardless of gender. The company noticed that men expressed more interest than women in technology meetups. It concluded that including gender would create a feedback loop that would cause even fewer women to find out about and attend technology meetups.
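
The feedback loop described above can be sketched with a toy simulation. This is only an illustration under stated assumptions, not Meetup’s actual system: the function name, the interest rates, and the rule that recommendations are split in proportion to the previous round’s sign-ups are all hypothetical.

```python
def simulate_feedback_loop(share_women, interest_women, interest_men, rounds):
    """Track the share of tech-meetup recommendations shown to women.

    Hypothetical rule: each round, sign-ups equal recommendations times a
    group's interest rate, and the next round's recommendations are split
    in proportion to those sign-ups.
    """
    history = [share_women]
    for _ in range(rounds):
        signups_women = share_women * interest_women
        signups_men = (1 - share_women) * interest_men
        share_women = signups_women / (signups_women + signups_men)
        history.append(share_women)
    return history


# Made-up numbers: equal exposure at the start, a modest gap in interest.
history = simulate_feedback_loop(
    share_women=0.5, interest_women=0.4, interest_men=0.6, rounds=5
)
for round_number, share in enumerate(history):
    print(f"round {round_number}: {share:.1%} of recommendations go to women")
```

In this sketch the gap in interest stays fixed, yet the share of recommendations shown to women shrinks every round because each round’s output becomes the next round’s input.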

9. What are the six types of bias in machine learning, according to Suresh and Guttag?

Historical Bias is a bias that occurs in machine learning when the data that’s used to train the model faithfully reflects a world that is itself biased. It occurs even when the measurement, sampling, and feature selection are done perfectly because people, processes, and society are already biased.

Measurement Bias is a bias that occurs in machine learning when the wrong features and labels are measured and used. It occurs when the model makes errors because the wrong thing is measured, the right thing is measured the wrong way, or the measurement is incorporated into the model incorrectly.

Aggregation Bias is a bias that occurs in machine learning when the model can’t distinguish between the groups in a heterogeneous population. It occurs because the model assumes the mapping from inputs to labels is consistent across groups, when that mapping usually differs from group to group.

Representation Bias is a bias that occurs in machine learning when the model fails to generalize well because part of the population is under-represented. It occurs because the model noticed a clear underlying relationship and assumed that the relationship would hold all the time.

Evaluation Bias is a bias that occurs in machine learning when the benchmark data that’s used to measure the quality of the model doesn’t represent the population. It occurs because the benchmark data isn’t representative of the general population or appropriate for the way the model will be used.

Deployment Bias is a bias that occurs in machine learning when the model is used or interpreted in inappropriate ways. It occurs because the problem that the model is intended to solve is different from the way it’s being used. It also occurs when a system is built and evaluated as autonomous but isn’t.
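
Aggregation bias, from the list above, can be illustrated with a small sketch. The data is made up, and the helper names `fit_line` and `mean_abs_error` are hypothetical. Two groups have opposite relationships between a feature and a label, so one line fitted to the pooled data describes neither group.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


def mean_abs_error(model, xs, ys):
    """Average absolute gap between the line's predictions and the labels."""
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in zip(xs, ys)) / len(xs)


# Made-up data: the label rises with the feature in group A, falls in group B.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
group_a_ys = [2.0 * x for x in xs]
group_b_ys = [12.0 - 2.0 * x for x in xs]

# One model for everyone versus one model per group.
pooled = fit_line(xs + xs, group_a_ys + group_b_ys)
per_group = {"A": fit_line(xs, group_a_ys), "B": fit_line(xs, group_b_ys)}

print("pooled slope:", pooled[0])
print("pooled error on group A:", mean_abs_error(pooled, xs, group_a_ys))
print("per-group error on group A:", mean_abs_error(per_group["A"], xs, group_a_ys))
```

The pooled model sees the two trends cancel out and predicts roughly the same value for everyone, while the per-group models fit their own group closely.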

10. Give two examples of historical race bias in the US.

In 2016, an independent investigation revealed that the COMPAS algorithm contained clear racial biases in practice. It found that black Americans were twice as likely as white Americans to be labeled as high risk but not actually re-offend. It also found that white Americans were much more likely than black Americans to be labeled as low risk but actually commit other crimes.
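
The two findings above correspond to a false positive rate (labeled high risk but did not re-offend) and a false negative rate (labeled low risk but did re-offend). Here is a minimal sketch of how those rates are computed per group. The counts are invented for illustration and are not the investigation’s data, and the function name `error_rates` is hypothetical.

```python
def error_rates(high_risk_reoffended, high_risk_did_not,
                low_risk_reoffended, low_risk_did_not):
    """Return (false positive rate, false negative rate) for one group."""
    # Of the people who did not re-offend, how many were labeled high risk?
    false_positive_rate = high_risk_did_not / (high_risk_did_not + low_risk_did_not)
    # Of the people who did re-offend, how many were labeled low risk?
    false_negative_rate = low_risk_reoffended / (low_risk_reoffended + high_risk_reoffended)
    return false_positive_rate, false_negative_rate


# Invented counts for two hypothetical groups (NOT real COMPAS figures).
group_1 = error_rates(high_risk_reoffended=300, high_risk_did_not=200,
                      low_risk_reoffended=100, low_risk_did_not=300)
group_2 = error_rates(high_risk_reoffended=150, high_risk_did_not=100,
                      low_risk_reoffended=150, low_risk_did_not=400)

for name, (fpr, fnr) in {"group 1": group_1, "group 2": group_2}.items():
    print(f"{name}: false positive rate {fpr:.0%}, false negative rate {fnr:.0%}")
```

Comparing these rates across groups, rather than looking only at overall accuracy, is what exposes a disparity like the one described above.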

In 2012, a university study revealed that juries drawn from all-white jury pools were 16 percentage points more likely to convict a black defendant than a white one. It found that when there were no black jurors in the jury pool, black defendants were convicted 81% of the time, compared to 66% for white defendants. It also found that when there were one or more black jurors in the jury pool, black defendants were convicted 71% of the time, compared to 73% for white defendants.

11. Where are most images in ImageNet from?

The ImageNet dataset contains over 14 million images that were scraped from Flickr and image search engines. It includes mostly images from the U.S. and other Western countries because these images dominated the internet when the dataset was compiled. Models trained on it also do worse on scenes from other countries because those countries have less representation in the dataset.

12. In the paper “Does Machine Learning Automate Moral Hazard and Error?” why is sinusitis found to be predictive of a stroke?

Sinusitis was found to be a predictor of stroke because the model didn’t actually predict stroke. It didn’t measure the biological signature of a stroke, which is restricted blood flow to brain cells. Instead, it used medical records, which combine behavioral and biological data: who had stroke-like symptoms, decided to seek medical care, and was tested and diagnosed by a doctor. The model therefore partly predicts who uses healthcare heavily, and those people are also more likely to have a sinusitis diagnosis on record.

13. What is representation bias?

Representation Bias is a bias that occurs in machine learning when the model fails to generalize well because part of the population is under-represented. It occurs because the model noticed a clear underlying relationship and assumed that the relationship would hold all the time.
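
One simple way to look for under-representation is to compare each group’s share of the dataset with its share of the population the model will serve. The sketch below assumes made-up region names and shares, and `representation_gaps` is a hypothetical helper rather than part of any library.

```python
from collections import Counter


def representation_gaps(sample_groups, population_shares):
    """Dataset share minus population share for each group.

    A negative value means the group is under-represented in the dataset.
    """
    counts = Counter(sample_groups)
    total = len(sample_groups)
    return {
        group: counts[group] / total - population_share
        for group, population_share in population_shares.items()
    }


# Made-up example: one group label per training example.
sample_groups = ["region_a"] * 80 + ["region_b"] * 15 + ["region_c"] * 5
population_shares = {"region_a": 0.4, "region_b": 0.3, "region_c": 0.3}

gaps = representation_gaps(sample_groups, population_shares)
for group, gap in gaps.items():
    status = "under-represented" if gap < 0 else "over-represented"
    print(f"{group}: {gap:+.2f} ({status})")
```

A check like this only flags the imbalance. Whether the model actually generalizes worse for the smaller groups still has to be measured on data from those groups.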

14. How are machines and people different, in terms of their use for making decisions?

Machines are used very differently from people when it comes to making decisions. People often assume that algorithms are objective and error-free. Algorithms are also more likely to be implemented without an appeals process in place, at scale, and at a much lower cost than human decision-making.

15. Is disinformation the same as “fake news”?

Disinformation is false or misleading information that’s been presented for the purpose of manipulation. It usually has the intention to cause economic damage, manipulate public opinion, or generate monetary profits. It also often contains exaggerations or seeds of truth that are taken out of context.

Fake News is false or misleading information that’s presented as legitimate news. It usually has the intention to damage the reputation of a person or entity or make money through online advertising revenue. It also contains purposefully crafted, sensational, emotionally charged, misleading, or totally fabricated information that mimics the form of mainstream news.

16. Why is disinformation through autogenerated text a particularly significant issue?

The main negative societal implications of text generation models are fake news and the spread of disinformation. These models could be used to produce compelling content on a massive scale with far greater efficiency and lower barriers to entry. They could also be used to carry out socially harmful activities that rely on text such as spam, phishing, abuse of legal and government processes, fraudulent academic essay writing, and social engineering pretexting.

17. What are the five ethical lenses described by the Markkula Center?

Ethical Lenses are a conceptual framework that’s meant to help technology companies embed ethical considerations into their workflow and promote the development of ethical products and services. It includes theories that are widely used by both academic and professional ethicists. It also includes theories that are largely in the context of Western philosophical thought.

The Rights Approach:
Which option respects the rights of all who have a stake?

The Justice Approach:
Which option treats people equally or proportionately?

The Utilitarian Approach:
Which option will produce the most good and do the least harm?

The Common Good Approach:
Which option serves the community as a whole, not just some members?

The Virtue Approach:
Which option leads me to act as the sort of person I want to be?

18. Where is policy an appropriate tool for addressing data ethics issues?

Policies that address data ethics issues become a priority for companies when regulations and laws impose heavy financial and legal consequences. Coordinated regulatory action can become necessary to protect the public when data ethics issues violate human rights and are impossible to solve through individual purchase decisions.

