In this article, I will discuss Important Questions in Artificial Intelligence Class 10. The CBSE board question paper of AI Class 10 consists of 1-mark, 2-mark, and 4-mark questions.
This article – Important Questions Artificial Intelligence Class 10 will provide similar kinds of questions. So here we go!
Unit 1 Introduction to AI Class 10
Unit 1 Introduction to AI consists of the following topics and sub-topics. I have prepared questions for you as per the topic. Follow the links to access them.
|Sub Unit No||Sub Unit||Questions Link|
|1||Foundational Concepts of AI||25+ Questions Foundational Concepts of AI|
|2||Introduction to AI Domains (Data, CV & NLP)||20+ Questions AI Domains|
|3||AI ethics||40+ Questions AI Ethics|
Now in the next section of this article, you will get important questions for the unit 2 AI Project cycle.
Unit 2 AI Project Cycle Class 10
Unit 2 AI Project Cycle Class 10 has 5 subunits. Let's explore the questions from Unit 2 AI Project Cycle Class 10.
|Sub Unit No||Sub Unit||Questions Link|
|1||Problem Scoping||15+ Questions Problem Scoping|
|2||Data Acquisition||30+ Questions Data Acquisition|
|3||Data Exploration||15+ Questions Data Exploration|
|4||Data Modelling||15+ Questions Data Modelling|
|5||Neural Network||15+ Questions Neural Network|
Follow this link for MCQ Questions based on Introduction to Artificial Intelligence and AI Project Cycle.
In Important Questions Artificial Intelligence Class 10, the 1-mark questions may need a one-word answer, an MCQ option, a fill in the blank, or True/False as the answer. So let's start!
Unit 6 Natural Language Processing AI Class 10
Let us start with 1-mark questions, which include MCQs, fill in the blanks and other objective-type questions. Here we go!
Watch the video for more understanding:
Natural Language Processing AI Class 10 – 1 Mark Questions
Here we go with Natural Language Processing AI Class 10 – 1 Mark Questions.
 What are the domains of Artificial Intelligence?
1. Data Science
2. Computer Vision
3. Natural Language Processing
 Identify: I work around numbers and tabular data
Ans. Data Science
 I am all about visual data like images and video. – Who am I?
Ans. Computer Vision
 What do you mean by natural language processing?
Ans. Natural Language Processing (NLP) is the domain of AI that takes in data from the natural languages humans use in their daily lives and operates on it.
 Name the AI game you played which uses NLP.
Ans. Mystery Animal
Follow this link to know more about the game: Mystery Animal
 Sagar is collecting data from social media platforms. He has collected a large amount of data, but he wants specific information from it. Which NLP application would help him?
Ans.: Automatic Summarization
 What are the features of automatic summarization?
1. Summarizing the meaning of documents and information
2. Understanding the emotional meanings within the information
3. Providing an overview of a news item or blog post
4. Avoiding redundancy from multiple sources
5. Maximizing the diversity of content
 I am used to identify opinions online to help companies understand what customers think about their products and services. Who am I?
Ans. Sentiment Analysis
 Which application of NLP assigns predefined categories to a document and organizes it to help customers find the information they want?
Ans. Text Classification
 What is an example of text classification?
Ans. Spam filtering in email, auto-tagging of customer queries, understanding audience response from social media, categorization of news articles on specific topics.
 What is an example of automatic summarization?
Ans. media monitoring, newsletters, video scripting, automated content creation
 I help humans keep notes of their important tasks, make calls for them, send messages and much more. Who am I?
Ans. Virtual Assistant or Chatbot
 Name popular virtual assistants.
1. Apple Siri
2. Google Assistant
3. Amazon Alexa
4. Microsoft Cortana
 Give the full form of CBT.
Ans. CBT – Cognitive Behavioural Therapy
 How does CBT help human beings?
Ans. CBT involves understanding the behaviour and mindset of an individual in their normal life. With its help, therapists help people overcome their stress and live a happy life.
 What do you mean by chatbots?
Ans. A chatbot is a computer program that’s designed to simulate human conversation through voice commands or text chats or both. Eg: Mitsuku Bot, Jabberwacky etc.
A chatbot is a computer program that can learn over time how to best interact with humans. It can answer questions and troubleshoot customer problems, evaluate and qualify prospects, generate sales leads and increase sales on an ecommerce site.
A chatbot is a computer program designed to simulate conversation with human users. A chatbot is also known as an artificial conversational entity (ACE), chat robot, talk bot, chatterbot or chatterbox.
A chatbot is a software application used to conduct an on-line chat conversation via text or text-to-speech, in lieu of providing direct contact with a live human agent.
Note: Write any definition.
 What are the types of chatbots?
Ans. There are two types of chatbots: script bots and smart bots.
 The customer care section of various companies includes which type of chatbot?
Ans.: The customer care sections of most companies include a script bot.
 The virtual assistants like Siri, Cortana, Google Assistant, etc. can be taken as which type of chatbot?
Ans. The virtual assistants like Siri, Cortana, Google Assistant, etc. can be taken as smart bots.
 Give the full form of NLP.
Ans.: NLP – Natural Language Processing
 What do you mean by syntax?
Ans.: The grammatical structure of a sentence is known as syntax. This structure helps to interpret the message.
 How does the computer interpret the syntax of a language?
Ans. The computer interprets the syntax of a language using part-of-speech tagging, which identifies the different parts of speech in a sentence.
 What do you mean by semantics?
Ans.: The meaning of a word or sentence is known as semantics.
 Write an example of different syntax, same semantics.
Ans. 5 + 3 = 3 + 5
 Write an example of different semantics, same syntax.
Ans.: 2/3 in Python 2.7 is not the same as in Python 3. In Python 2.7, 2/3 returns 0 (integer division), whereas in Python 3 it returns 0.666…
 What do you mean by perfect syntax, no meaning?
Ans. Perfect syntax, no meaning refers to a statement which has correct syntax but does not convey any meaning.
 What do you mean by corpus?
Ans.: In text normalization, the entire text from all the documents taken together is known as the corpus.
 What is sentence segmentation in text normalization?
Ans.: In text normalization, the whole text is divided into individual sentences. This process is known as sentence segmentation. Each sentence is then treated as a separate piece of data.
 Name the term used for any word or number or special character occurring in a sentence.
Ans. Token
 In which process is every word or number or special character considered separately?
Ans. Tokenisation
Watch this video for more understanding:
Natural Language Processing AI Class 10 – Short Answer Questions
Let’s see some short answer questions from Natural Language Processing AI Class 10.
 Mr. Dimpesh Chavda is an English teacher. He suggested removing the frequently used words which do not add any value to the paragraph. If someone wants to do this using NLP, suggest the term used for these types of words.
Ans. Stop words
 Write examples of a few stop words.
Ans. a, an, and, are, as, for, it, is, into, in, if, on, or, such, the, like, there, to, while etc.
 Which step comes after stop words removal?
Ans. Converting the text to common case
 What do you mean by stemming?
Ans. Stemming is a technique used to extract the base form of the words by removing affixes from them. It is just like cutting down the branches of a tree to its stems. For example, the stem of the words eating, eats, eaten is eat.
Stemming is the process in which the affixes of words are removed and the words are converted to their base form.
Stemming algorithms work by cutting off the end or the beginning of the word, taking into account a list of common prefixes and suffixes that can be found in an inflected word.
 The stemmed words are not meaningful quite often. (True/False)
Ans. True
 Write the stemmed words for these: reading, reader, flies, flying
Ans. 1. reading – read (Affix – ing)
2. reader – read (Affix – er)
3. flies – fli (Affix – es)
4. flying – fly (Affix – ing)
 What do you mean by lemmatization?
Ans.: Lemmatization is the grouping together of different forms of the same word. In search queries, lemmatization allows end users to query any version of a base word and get relevant results.
In lemmatization, the word we get after affix removal (also known as lemma) is a meaningful one. Lemmatization makes sure that lemma is a word with meaning and hence it takes a longer time to execute than stemming.
Lemmatization on the other hand, takes into consideration the morphological analysis of the words. To do so, it is necessary to have detailed dictionaries which the algorithm can look through to link the form back to its lemma.
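To see the difference on your own machine, here is a minimal sketch using NLTK's PorterStemmer and WordNetLemmatizer. NLTK is just one possible library choice; the sketch assumes the package is installed and the 'wordnet' data has been downloaded.

```python
# A minimal sketch comparing stemming and lemmatization with NLTK.
# Assumes nltk is installed and the 'wordnet' data has been fetched
# once via: import nltk; nltk.download('wordnet')
from nltk.stem import PorterStemmer, WordNetLemmatizer

stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()

for word in ["studies", "bodies", "flying", "meaningful"]:
    # The stem may not be a real word; the lemma always is.
    print(f"{word:12} stem: {stemmer.stem(word):10} lemma: {lemmatizer.lemmatize(word)}")
```

Notice that the stemmer outputs truncated forms like "bodi", while the lemmatizer returns dictionary words like "body", at the cost of a slower lookup.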
 Nainesh wants to convert the tokens into numbers. Suggest an algorithm to do the same.
Ans.: Bag of Words
 What will be the output produced by a bag of words algorithm?
Ans.: The bag of words algorithm returns:
1. A vocabulary of words for the corpus
2. The frequency of these words
 Write step by step approach to implement the bag of words algorithm.
Ans. The steps are as follows:
1. Text Normalization
2. Create Dictionary
3. Create document vector
4. Create document vectors for all documents
 What do you mean by dictionary?
Ans. In NLP, all the unique words of the corpus written together are known as the dictionary. Repeated words are written just once in the dictionary.
 How to create a document vector?
Ans.: In this step, the vocabulary is written in the top row. Now, for each word in the document, if it matches the vocabulary, put a 1 under it. If the same word appears again, increment the previous value by 1. If the word does not occur in that document, put a 0.
 What is a document vector table?
Ans.: The table created after writing these 0s, 1s and higher counts for every document is known as the document vector table. It represents the frequency of each vocabulary word in each document.
 What is the full form of TFIDF?
Ans.: TFIDF stands for Term Frequency & Inverse Document Frequency.
 What is TFIDF?
Ans.: TFIDF is a statistical measure used to evaluate the importance of a word to a document in a collection or corpus.
 Differentiate between stop words and frequent words.
Stop words are words which are repeated very often in the corpus but do not provide any information about it. They are filtered out before or after the processing of text.
After filtering out the stop words, the occurrence level drops drastically, and the words which have adequate occurrence in the corpus are said to carry some amount of value; these are termed frequent words.
 What are rare words?
Ans.: Frequent words mostly talk about the document's subject and have adequate occurrence in the corpus. As the occurrence of words drops further, their value rises; such words are termed rare words. Rare words occur the least but add the most value to the corpus.
 What is term frequency?
Ans.: From the document vector table, the frequency of each word in a document is found. This frequency is known as term frequency.
 Rajvi is learning NLP. She wants to know what is document frequency, Suggest your answer.
Ans.: Document frequency is the number of documents in which the word occurs irrespective of how many times it has occurred in those documents.
 Radhika is learning TFIDF. She is not getting the term inverse document frequency. Explain to her what is inverse document frequency.
Ans.: For the inverse document frequency, the total number of documents is the numerator and the document frequency is the denominator.
Example: If there are 7 documents, out of which 4 contain the word “THERE”, the document frequency is 4. Hence the inverse document frequency is 7/4.
 Write the formula of TFIDF.
Ans.: The formula of TFIDF for any word W is:
TFIDF(W)=TF(W) * log(IDF(W))
 What are the applications of TFIDF?
Ans.: The applications of TFIDF are as follows:
1. Document Classification
2. Topic Modelling
3. Information Retrieval System
4. Stop word filtering
 Does the vocabulary of a corpus remain the same before and after text normalization?
Ans.: No, the vocabulary of a corpus does not remain the same; text normalization reduces it.
 Name the package used for Natural Language Processing in Python.
Ans.: Natural Language Toolkit
 Arjun wanted to learn the leading platforms for building python programs that can work with human language data. Suggest the package name.
Ans.: Natural Language Toolkit
 Do the segmentation of the following sentence.
This is a chatbot. A chatbot is an application of NLP.
Ans.:
This is a chatbot.
A chatbot is an application of NLP.
 Tokenize the above sentence.
Ans.: [This], [is], [a], [chatbot], [.], [A], [chatbot], [is], [an], [application], [of], [NLP], [.]
 Gagan is working on an NLP model. Observe the following words and write the step which is being followed for them:
Natural, NaTuRaL, NATural, NaturaL, NAturAL
Ans.: These words should first be converted into a common case.
 Shiv is learning NLP. Write one example of common syntax but no meaning for him.
Ans.: The players are playing themselves better.
 Which one do you prefer for better functions – a smart bot or a script bot?
Ans.: A smart bot, as it works on big databases and other resources directly and has wider functionality.
 “Lemmatization is more complex than stemming.” Justify this statement.
Ans. Lemmatization takes longer to execute than stemming because it makes sure that the lemma is a meaningful word. Hence it is more complex.
 “Chickens feed extravagantly while the moon drinks tea.” This is an example of
Ans.: Perfect Syntax, No Meaning
 Observe the following table and write which algorithm will return the same output?
|Class 10 Artificial Intelligence syllabus – Artificial intelligence is a new subject in skill courses. In Class 10 you will learn about NLP.||10: 2|
Ans. Bag of Words
 Rearrange the steps of the Bag of words algorithm in proper order:
Create Dictionary -> Create document vector for all documents -> Create document vector -> Text normalization
Ans.: Text Normalization -> Create Dictionary -> Create document vector -> Create document vector for all documents
 What will be the output of “bodies” in stemming and lemmatization?
a) The output of the word bodies after stemming will be – bodi
b) The output of the word bodies after Lemmatization will be – body
 How many tokens are there in the following sentence?
Artificial intelligence (AI) – is the ability of a computer or a robot controlled by a computer to do tasks that are usually done by humans because they require human intelligence and discernment.
Ans.: 35 tokens are there in the above sentence.
 Identify any 2 stop words, frequent words and rare/valuable words from the following paragraph:
Data preprocessing involves preparing and “cleaning” text data for machines to be able to analyze it. Preprocessing puts data in a workable form and highlights features in the text that an algorithm can work with.
Stop words: and, for, to, be, it, in, the, that, an, can, with
Frequent words: data, involves, preprocessing, text, machines, able, puts, workable, form, work
Rare words: preparing, cleaning, analyze, highlights, features, algorithm
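One simple way to check such frequencies yourself is to count the words with Python's collections.Counter; a small sketch follows (the bucketing into stop, frequent and rare words is still a manual judgement made from the counts):

```python
# A small sketch that counts word frequencies in the paragraph above
# using collections.Counter; deciding which words are stop, frequent
# or rare is then a judgement made from these counts.
from collections import Counter

text = ("Data preprocessing involves preparing and cleaning text data "
        "for machines to be able to analyze it. Preprocessing puts data "
        "in a workable form and highlights features in the text that an "
        "algorithm can work with.")

words = text.lower().replace(".", "").split()
for word, count in Counter(words).most_common():
    print(word, count)
```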
 What are the subfields of AI?
Ans. The subfields of AI are linguistics, computer science, information engineering, etc.
Unit 6 Natural Language Processing – 2 Marks Questions
In this section of Natural Language Processing AI Class 10, we are going to discuss the questions of 2 marks. Here we go!
 What do you mean by Natural Language Processing? What is the main aim of NLP?
Natural Language Processing or NLP is one of the sub-fields of AI that is focused on enabling computers to understand and process human languages.
The main aim of NLP is for computers to achieve human-like comprehension of text input.
 Pratik is working on a huge amount of data that is full of information. Now he wants to access a specific, important piece of information from it. Explain which NLP application helps him.
Automatic Summarization will help Pratik complete his task. It is useful not only for summarizing the meaning of documents and information, but also for understanding the emotional meanings within the information, for example when collecting data from social media.
Automatic Summarization is especially relevant when used to provide an overview of a news item or blog post, while avoiding redundancy from multiple sources and maximizing the diversity of content obtained.
 How does sentiment analysis help companies to understand what customer thinks about their product?
The goal of sentiment analysis is to identify the response from several posts, or even from the same post, where emotion is not always explicitly expressed.
It helps companies understand their customers' emotions and opinions online about their products and services. For example, “I love the new model, but sometimes it doesn't respond well and gets stuck in between”.
From this, decision makers can understand their product's reputation, find the reasons behind it, and work to improve the model.
 Seema is a maths teacher. She has conducted a class test and pre-defined three categories for students based on marks scored by students. Now she wants to use the NLP application to organize students’ data. Explain the NLP application in detail to help her.
Seema can use text classification to organize the students' data. It makes it possible to assign pre-defined categories to a document and organize it so that she can find the information she needs or simplify some activities.
Seema can assign some tags or categories to text according to marks scored by students.
 Which NLP application accesses our data and helps us in keeping notes of our tasks, making calls for us, sending messages, and a lot more? Explain this NLP application in detail.
Virtual Assistants access our data and help us in keeping notes of our tasks, making calls for us, sending messages and a lot more.
They not only allow users to talk with them but also have the ability to make their lives easier. With the help of speech recognition, these assistants can not only detect our speech but can also make sense out of it.
 List out any four chatbots along with their developer.
Ans. A few chatbots are as follows:
1. Mitsuku, created using Pandorabots AIML technology.
2. Cleverbot, created by British AI scientist Rollo Carpenter.
3. Jabberwacky, also created by Rollo Carpenter.
4. Haptik bot, created by Aakrit Vaish and Swapan Rajdev, both University of Illinois engineering alumni.
 What do you mean by syntax? Explain in detail.
Ans.: The grammatical structure of a sentence is known as syntax. The grammatical structure contains nouns, verbs, adverbs, adjectives, and some rules to arrange them into a structure. Another part of the grammatical structure is the part of speech.
 Explain with an example:
- Perfect Syntax, no meaning
- Multiple meanings of a word
1. Perfect Syntax, no meaning – A sentence which is grammatically correct but does not make any sense. In human language, a perfect balance of syntax and semantics is important for better understanding, and human communication is complex. For example, Chickens feed extravagantly while the moon drinks tea.
2. Multiple meanings of a word – In natural language, it is important to understand that a word can have multiple meanings, and the meaning that fits the sentence depends on its context. For example:
1. His face turned red after he found out that he took the wrong bag.
2. The red car zoomed past his nose.
3. His face turns red after consuming the medicine
 Rekha is learning NLP. She wants to know about the concept that is used to convert natural language to machine language. Explain the concept in detail.
Ans.: When a user works with computers, it is necessary to convert human natural language to machine language, because the computer understands only machine language, which uses numbers. To do this, a few steps need to happen in the computer. The first step is text normalization.
 Explain different syntax, same semantics with example.
Ans.: There are some statements which are written differently but whose meanings are the same. These are known as different syntax, same semantics.
For example, 3 + 4 = 4 + 3.
In this example, the left side and the right side are written differently, but the semantics are the same: they return the same output.
 Manoj is working in Python 2.7 whereas Anuj is working in Python 3. They have written the same statement, 2/3, but both get different outputs. Help them to understand the concept.
Ans.: Statements that have the same syntax but different meanings are known as different semantics, same syntax.
For the above statement, Manoj will get 0 (integer division) whereas Anuj will get 0.666…, as per their Python versions. This is known as different semantics, same syntax.
 Saurabh is working on an NLP-based project. He wants to know about the concept where the entire corpus is divided into sentences. Help him and explain the concept.
Ans.: Sentence segmentation divides the entire corpus into sentences. Each sentence is taken as a separate piece of data, and the whole corpus gets reduced to sentences.
1. Natural Language Processing can be stated in layman terms as the automatic processing of the natural human language by a machine.
2. It is a specialized branch of Artificial Intelligence that primarily focuses on the interpretation as well as the generation of human language.
 “Under tokenisation, every word, number and special character is considered separately and each of them is now a separate token.” Is this statement correct? Explain.
Ans.: A token refers to any word, number or special character occurring in a sentence. Therefore, every word, number and special character is considered separately under tokenisation, so the statement is correct.
Tokens are the building blocks of Natural Language. Tokenization is a way of separating a piece of text into smaller units called tokens. Here, tokens can be either words, characters, or subwords.
 In a few cases, special characters are not considered as stop words and they are not removed from the corpus. Explain such a case in short.
Ans.: Frequently repeated words in the corpus that do not add any value to it are known as stop words. Special characters are also usually considered stop words, as they do not add any value to the sentence. But in a few cases, such as an e-mail ID or a website URL, the @ and . play an important role; in such cases these symbols are considered as either frequent words or rare words instead.
 Raman has a single word in different cases. Somewhere it’s written in capital, somewhere it is written in small letters. Suggest the step which is being followed after stop word removal in NLP data processing and explain it.
Ans.: After removing stop words, the corpus still contains the same words in different cases: some words are present in small letters and some in capital letters. To identify them all, the user needs to convert them into a common case.
Converting to a common case means converting all the text into the same case, preferably lower case. This ensures that the case-sensitivity of the machine does not treat the same words as different just because of their cases.
For example, NATURAL, Natural, nAturaL, NaTuRaL, NATural, natuRAL
are all treated as the same word by the machine.
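A minimal Python sketch of these steps, assuming a toy stop-word list (any real project would use a fuller list, for example the one shipped with NLTK):

```python
# A minimal sketch of these two normalization steps: convert the text to
# a common (lower) case, then remove stop words. The stop-word set here
# is a tiny illustrative subset, not any standard list.
STOP_WORDS = {"a", "an", "and", "the", "is", "are", "to", "of", "in"}

def normalize(sentence):
    tokens = sentence.lower().split()                   # common case + simple tokenization
    return [t for t in tokens if t not in STOP_WORDS]   # stop word removal

print(normalize("NATURAL Language Processing is a domain of AI"))
# -> ['natural', 'language', 'processing', 'domain', 'ai']
```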
 What is stemming? What is the purpose of stemming?
Ans.: Stemming refers to the process of converting the words of a corpus into their base form by removing the affixes. The stemmed words may not be meaningful: stemming just removes the affixes and does not check whether the resulting word is meaningful or not. That is why it is the faster process.
 Write affixes and stemmed words for the following words:
controlled, controlling, controller, buddies, hobbies, thoughtful, meaningful, powerful
controlled : Affix – ed, Stemmed word – controll
controlling: Affix – ing, Stemmed word – controll
controller: Affix – er, stemmed word – controll
buddies: Affix – es, stemmed word – buddi
hobbies: Affix – es, stemmed word – hobbi
thoughtful : Affix – ful, stemmed word – thought
meaningful : Affix – ful, stemmed word – meaning
powerful: Affix – ful, stemmed word – power
 What do you mean by lemmatization?
Ans. In lemmatization, the user gets a meaningful word after the removal of affixes. The word extracted from the original word is known as the lemma. This process is a little slower than stemming, as it makes sure the resulting word has a proper meaning.
For example, hobbies: affix – es, lemma – hobby
 Explain the bag of words algorithm implementation as step by step process.
Ans.: The bag of words algorithm is implemented through these steps:
1. Text Normalization: In this step, all the data is collected, normalized, and separated into words (tokenization).
2. Create Dictionary: In this step, all the words occurring in all the documents are written down; repeated words are written just once.
3. Create Document Vectors: Put all the dictionary words in the top row as the header of a table and write the frequency of each word for the document.
4. Create Document Vectors for all documents: Repeat the same for every document.
 Create document vector for given corpus.
Document 1: We are going to play cricket.
Document 2: We love cricket.
Document 3: We are playing cricket of 20 overs for each inning.
Document 4: We are playing cricket every Sunday.
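Ans.: One way to derive the document vectors is sketched below in Python, following the Bag of Words steps discussed above (text normalization, dictionary creation, then one vector per document); the printed rows form the document vector table:

```python
# A sketch of the Bag of Words steps for the corpus above: normalize the
# text, build the dictionary of unique words, then create one document
# vector per document (each entry counts a word's occurrences).
docs = [
    "We are going to play cricket.",
    "We love cricket.",
    "We are playing cricket of 20 overs for each inning.",
    "We are playing cricket every Sunday.",
]

tokenized = [d.lower().rstrip(".").split() for d in docs]  # text normalization

dictionary = []                                            # create dictionary
for tokens in tokenized:
    for t in tokens:
        if t not in dictionary:
            dictionary.append(t)
print(dictionary)

for tokens in tokenized:                                   # document vectors
    print([tokens.count(word) for word in dictionary])
```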
 Do sentence segmentation and tokenization for the following:
‘Tokenization does not use a mathematical process to transform the sensitive information into the token. There is no key or algorithm that can be used to derive the original data for a token. ‘
Tokenization does not use a mathematical process to transform the sensitive information into the token.
There is no key or algorithm that can be used to derive the original data for a token.
[Tokenization],[does],[not],[use],[a],[mathematical], [process], [to], [transform], [the], [sensitive], [information], [into], [token], [there], [is], [no], [key], [or], [algorithm], [that], [can], [be], [used], [derive], [original], [data], [for]
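The same result can also be produced programmatically; here is a minimal sketch using NLTK's tokenizers (NLTK is one possible toolkit for this, as named in the 1-mark questions above):

```python
# Sentence segmentation and tokenization done with NLTK. Assumes nltk is
# installed and the 'punkt' tokenizer data has been fetched once via:
# import nltk; nltk.download('punkt')
from nltk.tokenize import sent_tokenize, word_tokenize

text = ("Tokenization does not use a mathematical process to transform "
        "the sensitive information into the token. There is no key or "
        "algorithm that can be used to derive the original data for a token.")

for sentence in sent_tokenize(text):   # sentence segmentation
    print(word_tokenize(sentence))     # tokenization of each sentence
```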
 Differentiate between scriptbot and smartbot.
|Script bot||Smart bot|
|1. They are easy to make||1. They are difficult to make|
|2. They work through a program or script||2. They work on big databases and other resources directly|
|3. They have limited functionality||3. They have wide functionality|
|4. They are less powerful||4. They are more powerful|
Important Questions Artificial Intelligence Class 10 – Natural Language Processing 4 Marks Questions
In this section, I will discuss some competency-based questions for AI class 10.
 Calculate TFIDF for the given corpus and mention the word(s) having the highest value.
Document 1: Radha is an intelligent girl.
Document 2: She is studying in class X.
Document 3: She has opted AI in Class X.
Document 4: She is enjoying AI.
Ans.: Term Frequency refers to the frequency of a word in one document. As a step-by-step process, first we calculate the Term Frequency and then the Inverse Document Frequency. So let's prepare the document vector table for the given corpus.
Now calculate the document frequency for the vocabulary. Document frequency refers to the number of documents in which a word occurs, irrespective of how many times it has occurred in those documents.
Next, we put the document frequency in the denominator while the total number of documents is the numerator. Here the total number of documents is 4, which gives the inverse document frequency for each word.
Finally, the TFIDF values are derived by multiplying the TF values with the log of the IDF values. The formula of TFIDF for any word W is:
TFIDF(W) = TF(W) * log(IDF(W))
Tabulating the approximate TFIDF values for each word then shows which word(s) have the highest value for the corpus.
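Since the intermediate tables are easiest to verify by computing them, here is a small Python sketch that reproduces this calculation for the given corpus (log base 10 is an assumption; the formula above does not fix the base):

```python
# A sketch that reproduces the TFIDF calculation for this corpus using
# the formula above, TFIDF(W) = TF(W) * log(N / DF(W)).
import math

docs = [
    "radha is an intelligent girl",
    "she is studying in class x",
    "she has opted ai in class x",
    "she is enjoying ai",
]
tokenized = [d.split() for d in docs]
vocabulary = sorted({t for tokens in tokenized for t in tokens})
n_docs = len(docs)

for word in vocabulary:
    df = sum(1 for tokens in tokenized if word in tokens)     # document frequency
    for i, tokens in enumerate(tokenized, start=1):
        tf = tokens.count(word)                               # term frequency
        if tf:
            print(f"doc {i}: {word!r} -> {tf * math.log10(n_docs / df):.3f}")
```

Words that occur in every document (here "she" occurs in three, "is" in three) get low values, while words unique to one document get the highest values, which matches the summary in the next answer.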
 Summarize the concept of TFIDF.
Ans.: Summarising the concept of TFIDF, we can say that:
1. Words that occur in all the documents with high term frequencies have the least values and are considered to be the stopwords.
2. For a word to have high TFIDF value, the word needs to have a high term frequency but less document frequency which shows that the word is important for one document but is not a common word for all documents.
3. These values help the computer understand which words are to be considered while processing the natural language.
4. The higher the value, the more important the word is for a given corpus.
 The world is competitive nowadays. People face competition in even the tiniest tasks and are expected to give their best at every point in time. When people are unable to meet these expectations, they get stressed and could even go into depression. We get to hear a lot of cases where people are depressed due to reasons like peer pressure, studies, family issues, relationships, etc. and they eventually get into something that is bad for them as well as for others. So, to overcome this, Cognitive Behavioural Therapy (CBT) is considered to be one of the best methods to address stress as it is easy to implement on people and also gives good results. This therapy includes understanding the behaviour and mindset of a person in their normal life. With the help of CBT, therapists help people overcome their stress and live a happy life.
For the situation given above,
- Write the problem statement template
- List any two sources from which data can be collected.
- How do we explore the data?
1. The problem statement for the above-given scenario would be:
|Our||people facing a stressful situation||Who?|
|have a problem that||they are not able to share their feelings||What?|
|while||they need help to vent out their emotions||Where?|
|An ideal solution would be||to provide a platform to share their thoughts anonymously and suggest help whenever required||Why?|
2. Data can be collected from various sources, a few of them are as follows:
- Interviews of therapists
- Online Databases
- Observations of different people and therapists
3. Once the textual data has been collected, it needs to be processed and cleaned so that an easier version can be sent to the machine. Thus, the text is normalised through various steps and is lowered to minimum vocabulary since the machine does not require grammatically correct statements but the essence of it.
 Information overload is a real problem when we need to access a specific, important piece of information from a huge knowledge base. Automatic summarization is relevant not only for summarizing the meaning of documents and information, but also to understand the emotional meanings within the information, such as in collecting data from social media. Automatic summarization is especially relevant when used to provide an overview of a news item or blog post, while avoiding redundancy from multiple sources and maximizing the diversity of content obtained.
For the situation given above,
- What do you understand by automatic summarization?
- In which condition automatic summarization is relevant?
- Prepare a problem statement for the given situation.
- Automatic Summarization is relevant not only for summarizing the meaning of documents and information but also for understanding the emotional meanings within the information such as collecting data from social media.
- Automatic summarization is especially relevant when used to provide an overview of a news item or blog post, while avoiding redundancy from multiple sources and maximizing the diversity of content obtained.
- The problem statement for the given situation is as follows:
|Our||Internet users and Social Media Users||Who?|
|have a problem that||Overloaded information and accessing specific pieces of information from a knowledge base||What?|
|while||a news item or blog post or multiple sources of information||Where?|
|An ideal solution would be||to provide a platform to summarize the information needed||Why?|
 There are three domains of AI: Data Science works around numbers and tabular data while Computer Vision is all about visual data like images and videos. The third domain, Natural Language Processing (commonly called NLP) takes in the data of Natural Languages which humans use in their daily lives and operates on this.
Natural Language Processing, or NLP, is the sub-field of AI that is focused on enabling computers to understand and process human languages. NLP is a subfield of Linguistics, Computer Science, Information Engineering, and Artificial Intelligence concerned with the interactions between computers and human (natural) languages, in particular how to program computers to process and analyse large amounts of natural language data.
Answer these questions based on the above text:
- What are the three domains of AI?
- State various domains of AI for the following:
- Student result data
- Google Map
- Google Assistant
- Spam Filter
- CSV downloaded from Kaggle.com
- Define NLP.
- There are three domains of AI.
- Data Science
- Computer Vision
- Natural Language Processing
- Various Domains of AI
- Student result data: Data Science
- Google Map: Computer Vision
- Google Assistant: Natural Language Processing
- Spam Filter: Natural Language Processing
- CSV downloaded from Kaggle.com: Data Science
- NLP stands for Natural Language Processing. Natural Language Processing, or NLP, is the sub-field of AI that is focused on enabling computers to understand and process human languages.
 Nowadays Google Assistant, Cortana, Siri, Alexa, etc have become an integral part of our lives. Not only can we talk to them but they also have the ability to make our lives easier. By accessing our data, they can help us in keeping notes of our tasks, making calls for us, sending messages and a lot more. With the help of speech recognition, these assistants can not only detect our speech but can also make sense of it. According to recent research, a lot more advancements are expected in this field in the near future.
Answer the following questions:
- Name a few virtual assistants.
- What is the role of virtual assistants? How do they impact our life?
- How do virtual assistants work?
- Google Assistant, Cortana, Siri, Alexa
- Virtual Assistants can assist us in various tasks. They can talk to us, keep notes of our tasks, make calls for us, and send messages. They make our lives easier by accessing our data and detecting our speech.
- Virtual Assistants use speech recognition and try to understand the command given through speech. They detect our speech and make sense of it.
 Observe the graphs carefully and classify them according to how well the model’s output matches the data samples:
Figure 1: The model's performance matches well with the true function, which states that the model has optimum accuracy; such a model is called a perfect fit.
Figure 2: The model's output does not match the true function at all. Hence the model is said to be underfitting and its accuracy is lower.
Figure 3: The model is trying to cover all the data samples, even those that are out of alignment with the true function. This model is said to be overfitting, and it too has lower accuracy.
 Suhas is working in a company named Krishna Enterprise as HR Head. He is facing problems with the remote workers of his company. Remote workers are not able to communicate with one another seamlessly and easily. Sometimes network issues occur, hence a communication gap remains, and sometimes it creates misunderstandings.
Answer the following questions based on the above situation:
- Prepare a 4Ws canvas for the problem.
- Write a few methods to visualize the data.
- Why is human communication complex for machines?
1. The 4Ws canvas for the problem:
|Our||remote workers and company professionals||Who?|
|have a problem that||Effective Communication||What?|
|while||remote workers work in the mines or remote area||Where?|
|An ideal solution would be||to provide a platform to communicate easily||Why?|
2. There are various methods to visualize data; Suhas can use any of these (see the Matplotlib sketch after this answer):
- MS Excel
- Python Matplotlib
- MS Power BI
3. Human language contains various rules based on grammatical structure. Machines need to be prepared in such a manner that they can identify these rules and apply them to the model. Machines have their own rules for understanding language, as the computer understands only 0s and 1s.
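For illustration, here is a minimal Matplotlib sketch of the kind of chart Suhas could draw; the channel names and counts below are made up purely for this example:

```python
# A minimal Matplotlib sketch of one way to visualize the data;
# the channels and counts are hypothetical, for illustration only.
import matplotlib.pyplot as plt

channels = ["Chat", "Email", "Calls", "Video"]
failed_messages = [12, 7, 18, 4]   # hypothetical weekly failure counts

plt.bar(channels, failed_messages)
plt.title("Communication failures by channel")
plt.ylabel("Failed messages per week")
plt.show()
```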
In the next section of Important Questions Artificial Intelligence Class 10, we are going to discuss questions from Unit 7 Evaluation.
Unit 7 Evaluation Important Questions Artificial Intelligence Class 10
Let's see 1-mark questions first. As you know, 1-mark questions include short definitions, one-word answers, fill in the blanks, etc. Here we go! If you are looking for notes for the same unit, follow this link:
Unit 7 Evaluation AI Class 10 – 1 mark Questions
 Define: Evaluation
Ans.: Evaluation is the process of understanding the reliability of any AI model, based on outputs by feeding the test dataset into the model and comparing it with actual answers.
 Name two parameters considered for the evaluation of a model.
Ans.: The two parameters considered for evaluation of a model are:
1. Prediction
2. Reality
 What is not recommended to evaluate the model?
Ans.: It is not recommended to use the data that was used to build the model for evaluating the model.
 Define overfitting.
Ans.: The model simply remembers the whole training dataset and therefore always predicts the correct label for any point in the training set. This is known as overfitting.
 Enlist the data sets used in AI modeling.
Ans.: There are two types of datasets used in AI.
1. Training Data Set
2. Testing Data Set
 What do you mean by prediction?
Ans.: Prediction refers to the output produced by the AI model.
 What is reality?
Ans.: Reality refers to the real scenario about which the prediction has been made by the model.
 What are the cases considered for evaluation?
1. True Positive
2. True Negative
3. False Positive
4. False Negative
 Write the term used for the following cases for heavy rain prediction:
1. True Positive
2. True Negative
3. False Positive
4. False Negative
 What do you mean by True Positive?
Ans.: True Positive refers to the condition that occurs when both the prediction made by the AI model and the reality are True or Yes.
 What is True Negative?
Ans.: When both the prediction and the reality are False or No, the condition is called True Negative.
 What is a False Positive?
Ans.: When the prediction is positive but the reality is negative, the condition is known as False Positive.
 What is False Negative?
Ans.: When the prediction made by the AI model is negative but the reality is positive, the condition is known as False Negative.
 Ritika is learning evaluation. She wants to recognize the concept of evaluation from the below-given facts:
- A comparison between prediction and reality
- Helps users to understand the prediction result
- It is not an evaluation metric
- A record that helps in the evaluation
Help Ritika by giving the name to recognize the concept of evaluation.
Ans.: The concept is Confusion Matrix
 What is the need for a confusion matrix?
Ans.: The confusion matrix allows us to understand the prediction results of an AI model.
 Devendra is confused about the condition when a prediction is said to be correct. Support your answer to help him clear his confusion.
Ans.: If the prediction made by the AI model or machine matches the reality, the prediction is said to be correct.
 Mentions two conditions when prediction matches reality.
Ans.: The two conditions when prediction matches reality are:
1. True Positive
2. True Negative
 Rozin is a student of class 10 AI. She wants to know the methods of evaluation. Support her with your answer.
Ans.: The evaluation methods are:
1) Accuracy
2) Precision
3) Recall
4) F1 Score
 Mihir is trying to learn the formula of accuracy. What is the formula?
Ans.: Accuracy = (Correct Predictions / Total Cases) x 100% = ((TP + TN) / (TP + TN + FP + FN)) x 100%
 If a model always predicts there is no fire, where in reality there is a 3% chance of a forest fire breaking out, what is the accuracy?
Ans.: The elements of the formula are as follows:
1. True Positive: 0
2. True Negative: 97
3. False Negative: 3
4. Total Cases: 100
So the accuracy = (0 + 97) / 100 = 97%
 What do you mean by precision?
Ans.: The percentage of true positive cases versus all the cases where the prediction is true is known as precision.
 Which cases are taken into account by precision?
Ans. True Positive and False Positive cases are taken into account by precision.
 Which cases are taken into account by the recall method?
Ans.: True Positive and False Negative cases are taken into account by the recall method.
 Which measures are used to know the performance of the model?
Ans.: There are two measures used to know the performance of the model:
1. Precision
2. Recall
 Rohit is working on an AI model. He wants to know the balance between precision and recall. What is it?
Ans.: The balance between precision and recall is known as the F1 score.
 The task is to correctly identify mobile phones, where photos of Oppo and Vivo phones are taken into consideration. Oppo phones are the positive cases and Vivo phones are the negative cases. The model is given 10 images of Oppo phones and 15 images of Vivo phones. It correctly identifies 8 Oppo phones and 12 Vivo phones. Create a confusion matrix for this case.
Ans.: The confusion matrix is as follows:
|Reality||Negative||True Negative: 12||False Positive: 3|
|Reality||Positive||False Negative: 2||True Positive: 8|
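Such a matrix can also be verified programmatically; here is a small sketch using scikit-learn's confusion_matrix (the library choice is just one option, and assumes scikit-learn is installed):

```python
# A sketch that rebuilds this confusion matrix with scikit-learn.
# 1 = Oppo (positive class), 0 = Vivo (negative class).
from sklearn.metrics import confusion_matrix

reality    = [1] * 10 + [0] * 15                      # 10 Oppo and 15 Vivo images
prediction = [1] * 8 + [0] * 2 + [0] * 12 + [1] * 3   # the model's outputs, in the same order

tn, fp, fn, tp = confusion_matrix(reality, prediction).ravel()
print(tn, fp, fn, tp)   # -> 12 3 2 8
```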
 There are some images of boys and girls. The girls are positive cases and boys are negative cases. The model is given 20 images of girls and 30 images of boys. The machine correctly identifies 12 girls and 23 boys. Create a confusion matrix for the particular cases.
Ans.: The confusion matrix is as follows:
|Reality||Negative||True Negative: 23||False Positive: 7|
|Reality||Positive||False Negative: 8||True Positive: 12|
 There is data given for Facebook and Instagram users. The model is given data for 200 Facebook users and 250 Instagram users. The machine identified 120 Facebook users correctly and 245 users of Instagram correctly. Create a confusion matrix for the same.
Ans.: The confusion matrix is as follows:
|Reality||Negative||True Negative: 120||False Positive: 80|
|Reality||Positive||False Negative: 5||True Positive: 245|
 Consider that there are 10 images. Out of these 7 are apples and 3 are bananas. Kirti has run the model on the images and it catches 5 apples correctly and 2 bananas correctly. What is the accuracy of the model?
Ans.: Total correct predictions: 5 + 2 = 7
Total cases: 10
So the accuracy is: 7/10 = 70%.
The model classified 7 of the 10 images correctly; hence the accuracy is 70%.
 There are 16 images, 9 are cat images and 7 are dog images. The cat images are positive cases and dog images are negative cases. The model identifies 5 cat images correctly and 3 cat images as dog images. Similarly, it identifies 4 of them correctly as dog images. Find the accuracy of the model.
Ans.: Total predictions made: 5 + 3 + 4 = 12
Total correct predictions made: 5 + 4 = 9
So the accuracy is: 9/12 = 0.75
 There are 20 images of aeroplanes and helicopters. The machine identifies 12 images correctly and 3 images incorrectly for aeroplanes. Similarly 2 images correctly as helicopters. Find the accuracy of the model.
Ans.: No. of predictions made: 12 + 3 + 2 = 17
Total correct predictions made: 12 + 2 = 14
So the accuracy is: 14/17 = 0.82 (approx.)
 The precision of a model is 1/4 and the recall of the same is 2/4. What is the F1 score of the model?
Ans. F1 score = 2 x (precision x recall) / (precision + recall)
= 2 x ((1/4) x (2/4)) / ((1/4) + (2/4)) = (1/4) / (3/4) = 1/3 ≈ 0.33
 Out of 300 images of Lions and Tigers, the model identified 267 images correctly. What is the accuracy of the model?
Ans.: Accuracy = Total Correct Predictions / The no. of predictions = 267/300 = 0.89
 There are 400 images of fruits, and the accuracy of the AI model is exactly 0.5. How many correct predictions did the machine make?
Ans.: Accuracy = Total Correct Predictions / The no. of predictions
0.5 = Total Correct Predictions / 400
Total Correct Prediction = 400 x 0.5 = 200
So the total correct predictions made by machine are 200.
 The recall comes 0.65 and the precision 0.70 for an AI model. What is the F1 score based on these metrics?
Ans.: F1 score = 2 x (Precision x Recall) / (Precision + Recall) = 2 x(0.65 x 0.70) / (0.65 + 0.70)=0.67
 The recall comes 0.80 and the precision 0.40 for an AI model. What is the F1 score based on these metrics?
Ans.: F1 Score= 2x(Precision x Recall) / (Precision + Recall) = 2 x( 0.80 x 0.40) / (0.80 + 0.40) = 0.53
Watch this video for more understanding:
I have used this F1 score calculator to compute the F1 score. Follow the below-given link:
Unit 7 Evaluation Class 10 Artificial Intelligence – 2 Marks Questions
 Explain the precision formula.
Ans.: The formula for precision is:
Precision = (True Positive / All Predicted Positives) x 100% = (TP / (TP + FP)) x 100%
This formula takes the percentage of true positive cases versus all the cases where the prediction is true. It takes into account the True Positives and the False Positives.
 Explain the recall formula.
Ans.: The formula for recall is:
Recall = (True Positive / (True Positive + False Negative) ) x 100% = (TP / (TP + FN)) x 100%
In the recall formula, the fraction of positive cases which are identified correctly is taken into consideration. All True Positive and False Negative cases are considered here.
 What is the importance of evaluation?
Ans.: Evaluation is required to examine a model critically. It makes judgements about a model to improve its effectiveness and/or to inform programming decisions. It ensures that the model is working properly and optimally. It helps to determine what works well and what could be better in a program, and to appreciate how well the model attains its goals.
 Which evaluation metric is more important in most cases? Explain.
Ans.: The F1 evaluation metric is more significant in most cases. The F1 score maintains a balance between precision and recall: it is high only when both precision and recall are high. The F1 score is a number between 0 and 1 and is the harmonic mean of precision and recall. The formula to determine the F1 score is:
F1 Score = 2 x ( Precision x Recall) / (Precision + Recall)
 Which value for the F1 score is the perfect F1 score? Explain with context.
Ans.: When both precision and recall have the value 1 or 100%, the F1 score is also 1 or 100%. This is known as the perfect F1 score. Since the values of precision and recall fall between 0 and 1, the F1 score also falls between 0 and 1.
 Explain the evaluation metrics for mail spamming.
Ans.: In mail spamming, if the machine predicts that an email is spam, the person will ignore that email and may sometimes miss out on vital information. The false positive condition has a high cost in this context: the email is predicted as spam while it is actually not spam.
 How would evaluation metrics be crucial for gold mining?
Ans.: A model predicts that there exists gem at a point and the individual keep on excavating there but it turns out that it is a false apprehension. False positive condition is very costly as predicting there is a treasure but in reality there is no treasure.
 How can false negative conditions be hazardous in evaluation? Explain with an example.
Ans.: False negative conditions can be hazardous because the model may fail to notice a condition which is very dangerous.
For example, suppose a deadly virus has started spreading and the model which is supposed to forecast a viral outbreak does not notice it. The virus might spread extensively and infect a lot of people.
 State and explain some possible reasons why an AI model is not efficient.
Ans.: The possible reasons for an AI model not being efficient are:
1. Lack of training data: If the data is not sufficient as per the requirement, or is not used properly because something is missing from it, the model won't be efficient.
2. Unauthenticated/Wrong data: For good results and predictions, the data must be authenticated and correct. Unauthenticated data may not help in getting good results.
3. Inefficient coding/Wrong algorithm: For preparing a good model, the coding or algorithm should be written appropriately. If the coding or algorithm is not accurate or appropriate, the model will not generate the desired output.
4. Less accuracy
 High accuracy is not usable. Justify this with an example.
Ans.: High accuracy here refers to an accuracy like 99.9%. In some settings, even the small number of mistakes this allows can lead to high damage, which makes accuracy a very sensitive parameter for an AI model.
SCENARIO: An expensive robotic chicken crosses a very busy road a thousand times per day. An ML model evaluates traffic patterns and predicts when this chicken can safely cross the street with an accuracy of 99.99%.
Explanation: A 99.99% accuracy value on a very busy road strongly suggests that the ML model is far better than chance. In some settings, however, the cost of making even a small number of mistakes is still too high. 99.99% accuracy means that the expensive chicken will need to be replaced, on average, every 10 days. (The chicken might also cause extensive damage to cars that it hits.)
Watch this video for more understanding:
 High precision is not usable. Justify this with an example.
Example: Predicting whether an email is spam or not.
In this case two conditions, false positive and false negative, may arise. A false positive means the mail is predicted as “spam” but it is not spam. A false negative means the mail is predicted as “not spam” but it is spam. Too many false negative conditions will make the spam filter ineffective, while false positives may cause important mails to be missed.
 Suppose an AI model has to bifurcate volleyballs and footballs. Volleyballs are the positive cases and footballs are the negative cases. There are 30 images of volleyballs and 26 images of footballs. The model predicts 28 out of 30 volleyballs correctly and 2 volleyballs as footballs. It predicts 24 footballs correctly and 2 footballs as volleyballs. Compute both accuracy and precision.
Accuracy= correct predictions/total predictions made = (28 + 24) / (30 + 26 ) = 52/56 = 0.92
Precision = True Positives / (True Positives + False Positives) = 28 / (28 + 2) = 28 / 30 = 0.93
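These metrics (plus recall and the F1 score) can be computed with one small helper function; here is a sketch using the counts from this question (TP = 28, TN = 24, FP = 2, FN = 2):

```python
# A small helper that computes all four evaluation metrics from the raw
# confusion-matrix counts, shown with this question's volleyball/football numbers.
def evaluate(tp, tn, fp, fn):
    accuracy  = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall    = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

for name, value in zip(("accuracy", "precision", "recall", "F1 score"),
                       evaluate(tp=28, tn=24, fp=2, fn=2)):
    print(f"{name}: {value:.4f}")
# -> accuracy: 0.9286, precision: 0.9333, recall: 0.9333, F1 score: 0.9333
```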
 There are 14 images of cows and buffalos. There are 8 images of cows and 6 images of buffalos. The model has predicted 5 cows and 4 buffalos. It identifies 1 cow as buffalo and 2 buffalos as cows. Compute the accuracy, precision, recall and F1 score.
Accuracy = CP / TP = (5 + 4) / (8 + 6) = 9/14= 0.64
Precision = TP / (TP + FP) = 5 / (5 + 2) = 5/7 = 0.71
Recall = TP / (TP + FN) = 5 / (5+1) = 5/6 = 0.83
F1 Score = (2 x precision x recall) / (precision + recall) = (2 x 0.71x 0.83) / (0.71 + 0.83) = 1.1786/1.54 = 0.77
Or F1 Score = (2 x (5/7) x (5/6)) / ((5/7)+(5/6)) = (50/42) / (65/42) = 10/13
 For a model, the F1 score is 0.85 and precision is 0.82. Compute the recall for the same case.
Ans.: In this question the F1 score is given and we have to find the recall value, so rearrange the formula as follows:
1/F1 Score = (1/2) x ((1/Precision) + (1/Recall))
So 1/Recall = (2/F1 Score) - (1/Precision), or Recall = (F1 Score x Precision) / (2 x Precision - F1 Score)
Recall = (0.85 x 0.82) / (2 x 0.82 - 0.85) = 0.697/0.79 = 0.88
 Out of 100 pictures of buses and trucks, 80 are actually buses and 20 are trucks. What is the minimum number of buses identified by a model to have a recall of 90% or more?
Ans.: Here the required recall value is 90%, i.e. >= 0.9.
Now, recall is the fraction of actual positive cases identified correctly. There are 80 actual buses, so True Positives + False Negatives = 80. For a recall of 90% or more, the required True Positive cases = 0.9 x 80 = 72. Hence at least 72 buses have to be correctly identified to have a recall value of 90% or more.
 Out of 40 pictures of cats and rabbits, 25 are rabbits. If the model has already identified 15 rabbit images correctly, how many cat images must it identify correctly to achieve 75% accuracy or more?
Ans.: Let there be x cat images correctly identified. Thus,
(15 + x) / 40 >= 0.75
or 15 + x >= 30
or x >= 15
Out of the 40 images, 25 are rabbits, of which 15 are already correctly identified. Hence the model must correctly identify all 15 of the cat images to achieve 75% accuracy.
 Draw a confusion matrix for the following:
|Positive/Negative: White Pages/ Yellow Pages||No. of images: 150|
|Number of actual white pages images: 90||True Positives: 85|
|False Positives: 20||False Negatives: 25|
Ans.: The confusion matrix is as follows:
|Positive (White Pages)||85||20|
|Negative (Yellow Pages)||5||25|
 Find the F1 score from the given confusion matrix:
Ans.: From the given confusion matrix, TP = 44, FP = 6 and FN = 15. So:
Precision: TP / (TP + FP) = 44 / (44 + 6) = 44 / 50 = 0.88
Recall: TP / (TP + FN) = 44 / (44 + 15) = 44 / 59 = 0.746
F1 Score: 2 x precision x recall / (precision + recall) = 2 x 0.88 x 0.746 / (0.88 + 0.746) = 1.31296/1.626=0.81
Unit 7 Evaluation – Long Answer Questions (4 Marks)
 Shweta is learning NLP. She read about the F1 score but did not understand the need for the F1 score formulation. Support her by giving an answer.
Ans.: The F1 score is also known as the F score or F measure of an AI model's test accuracy. It is calculated from the precision and recall of the test. Here:
1) The precision is the number of correctly identified positive results divided by the number of all positive results, including those not identified correctly.
2) The recall is the number of correctly identified positive results divided by the number of all samples that should have been identified as positive.
3) The F1 score is defined as the harmonic mean of the test’s precision and recall. The formula of F1 score is as below:
F1 score = 2 x precision x recall / (precision + recall)
From this formula,
i) A good F1 score means that you have low false positives and low false negatives: the machine is correctly identifying real threats and is not disturbing the users with false alarms.
ii) An F1 score is considered perfect when it is 1, while the model is a total failure when it is 0.
iii) F1 score is a better metric to evaluate the model on real-life classification problems and when imbalanced class distribution exists.
 Calculate accuracy, precision, recall and F1 score for the following Confusion Matrix. Suggest which metric would not be a good evaluation parameter and why?
a) Accuracy: It is defined as the percentage of correct predictions out of all the observations.
Accuracy = (Correct Predictions / Total Cases) X 100% = ((TP + TN) / (TP + TN + FP + FN)) x 100%
= ((50 + 30) / (50 + 30 + 15 + 25)) x 100% = (80 / 120) x 100% = 67% (0.67)
b) Precision: It is defined as the percentage of true positive cases versus all the cases where the prediction is true.
Precision = (True Positive / All predicted positives) X 100% = TP / (TP + FP) x 100%
= 50 / (50 + 30) = 50/80 = 0.625
c) Recall: It is defined as the fraction of positive cases that are correctly identified.
Recall = True Positive / (True Positive + False Negative) = TP / (TP + FN)
=50 / (50 + 15) = 50 / 65 = 0.769
d) F1 Score: It is identified as the measure of balance between precision and recall.
F1 Score = (2 x precision x recall) / (precision + recall) = (2 x 0.625 x 0.769) / (0.625 + 0.769)
= 0.96125 / 1.394 = 0.69 ≈ 0.7
Accuracy = 0.67
Precision = 0.625
Recall = 0.769
F1 Score = 0.7
From these results, recall is not a good evaluation parameter here; it needs to improve further. Two conditions are very risky here:
1) False Positive: If we read the above confusion matrix for heart attack cases, a person is predicted to have a heart attack but does not have one in reality.
2) False Negative: A person is predicted to have no heart attack, but in reality the person is suffering from a heart attack.
A false negative misses actual patients, so recall needs more improvement.
Watch this video for more understanding:
I hope this article will help you to prepare well for the board examinations. If you have any doubts or queries, feel free to share them in the comment section.
Thank you for reading this. Share this article with your friends and classmates to help them.