Comprehensive notes python pandas IP class 12

In this article, you will learn about Python pandas for IP Class 12. As per the revised CBSE Class 12 Informatics Practices curriculum, this part covers DataFrame basics and creating a DataFrame. So let us begin:

Introduction to DataFrame – Python pandas IP class 12

Observe the following picture:

You learned about the pandas Series in the previous post. DataFrame is another important pandas data structure.

Data can be represented in a one-dimensional or a two-dimensional data structure. A pandas Series is a one-dimensional data structure; similarly, a DataFrame is a two-dimensional data structure.

When you think of two dimensions, MS Excel is a good example of two-dimensional data representation. It represents data in tabular form, in rows and columns.

The word DataFrame can be split into two simple words: i) Data and ii) Frame. So we can say that the data is held in a frame of rows and columns. Different columns of the frame can store different types of data. DataFrame is widely used to analyze tabular data.

The above image represents a 2D array whose size is given by m x n, where m = number of rows and n = number of columns. So in the above example, we have a 2D array of 3 x 5 with 15 elements.
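
To make the m x n idea concrete, here is a small sketch that builds a 3 x 5 table and checks its dimensions. The values are placeholders chosen only for illustration; they are not the data from the picture.

```python
import pandas as pd

# Hypothetical values: 3 rows, each with 5 elements
data = [[1, 2, 3, 4, 5],
        [6, 7, 8, 9, 10],
        [11, 12, 13, 14, 15]]
df = pd.DataFrame(data)

rows, cols = df.shape   # shape is a (rows, columns) tuple
print(rows, cols)       # 3 5
print(df.size)          # 15, i.e. rows * cols
```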

In the next section of python pandas IP class 12 we will discuss the characteristics of a DataFrame.

Characteristics of DataFrame

  1. DataFrame has two indexes/axes, i.e. a row index and a column index
  2. In a DataFrame, index labels can be numbers, letters or strings
  3. Different columns of a DataFrame can hold different data types
  4. DataFrame is value mutable, i.e. values can be changed
  5. DataFrame is also size mutable, i.e. rows and columns can be added or deleted at any time
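
The characteristics listed above can be tried out with a short sketch. The table, its row labels ('r1', 'r2') and its values are made up for this illustration.

```python
import pandas as pd

# Hypothetical marks table used only for illustration
df = pd.DataFrame({'Name': ['Ankit', 'Mohit'], 'Maths': [65, 67]},
                  index=['r1', 'r2'])

# Row and column labels can be strings
print(df.index.tolist())    # ['r1', 'r2']
print(df.columns.tolist())  # ['Name', 'Maths']

# Value mutable: change one stored value
df.loc['r1', 'Maths'] = 70

# Size mutable: add a column, then delete it again
df['Physics'] = [78, 65]
df = df.drop(columns='Physics')

print(df)
```

Here the 'Name' column holds text while the 'Maths' column holds numbers, which shows that one DataFrame can mix data types across columns.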

Now that you are familiar with DataFrame, the next section of python pandas IP class 12 shows how to create one:

Creating DataFrame

To create a DataFrame, import the following modules. pandas is mandatory, while numpy is needed only when you work with NumPy arrays.
import pandas as pd
import numpy as np

Syntax:
dfo = pandas.DataFrame(<2D DataStructure>, <columns=column_sequence>,<index=index_sequence>,<dtype=data_type>,<copy=bool>)

Where

dfo is the variable that refers to the DataFrame object being created

pandas is the name of the imported module; generally, pd is used as its alias name in programs

DataFrame() is the function (constructor) that creates a DataFrame

2D DataStructure: This is the first parameter of the DataFrame() function. It holds the data and can be a list, a Series, a dictionary, a NumPy ndarray or any other 2D data structure. If it is omitted, an empty DataFrame is created.

columns: It is an optional parameter of the DataFrame() function that specifies the column labels of the DataFrame. If it is omitted and the data carries no labels, columns are numbered starting from 0.

index: It is also an optional parameter of the DataFrame() function that specifies the row labels of the DataFrame. If it is omitted and the data carries no labels, rows are numbered starting from 0.

dtype: It is an optional parameter that specifies the data type of the DataFrame elements. If dtype is not specified, it defaults to None and pandas infers the type from the data.

copy: It is an optional boolean parameter that controls whether the input data is copied.
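
As a rough illustration of these parameters, the sketch below passes columns, index and dtype together. The names and marks are sample values assumed for this example.

```python
import pandas as pd

data = [[72, 65], [60, 67]]                       # hypothetical marks
df = pd.DataFrame(data,
                  columns=['English', 'Maths'],   # column labels
                  index=['Ankit', 'Mohit'],       # row labels
                  dtype='float64')                # force one data type
print(df)
print(df.dtypes)
```

Because the parameters are passed by keyword, their order after the data does not matter.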

Now, in the next section of python pandas IP class 12, we will see how to create a DataFrame with various options:

Creating empty DataFrame & Display

To create an empty DataFrame, call the DataFrame() function without passing any parameter. To display the DataFrame, use the print() function as follows:

import pandas as pd
df = pd.DataFrame()
print(df)
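
If you want to confirm in code that a DataFrame is empty, one possible check uses the empty and shape attributes:

```python
import pandas as pd

df = pd.DataFrame()
print(df.empty)   # True
print(df.shape)   # (0, 0)
```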

Creating DataFrame from List and Display (Single Column)

A DataFrame can be created from a list for a single column as well as for multiple columns. To create a single-column DataFrame, define a list and then pass that list object to the DataFrame() function as follows:

import pandas as pd
l =[5,10,15,20,25]
df = pd.DataFrame(l)
print(df)

Next, have a look at creating a DataFrame with multiple columns from a list.

Creating DataFrame from List and Display (Multiple Columns)

Let’s have a look at the following code, which creates a multiple-column DataFrame from a list of lists. Each inner list becomes one row:

import pandas as pd
l=[['Ankit',72,65,78],['Mohit',60,67,65],['Shreya',80,86,83]]
df=pd.DataFrame(l)
print(df)

The next section of python pandas IP class 12 focuses on specifying column names.

Specifying column names

To specify column names, use the columns parameter of the DataFrame() function and pass the names of the columns as follows:

import pandas as pd
l=[['Ankit',72,65,78],['Mohit',60,67,65],['Shreya',80,86,83]]
df=pd.DataFrame(l,columns=['Name','English','Maths','Physics'])
print(df)
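
Once the columns have names, a column can be picked out by its name. A short sketch using the same sample data (the variable names rows and maths are chosen for this example):

```python
import pandas as pd

rows = [['Ankit', 72, 65, 78], ['Mohit', 60, 67, 65], ['Shreya', 80, 86, 83]]
df = pd.DataFrame(rows, columns=['Name', 'English', 'Maths', 'Physics'])

maths = df['Maths']            # a single column comes back as a Series
print(maths.tolist())          # [65, 67, 86]
print(df.columns.tolist())     # ['Name', 'English', 'Maths', 'Physics']
```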

Creating DataFrame from series

As you learned about Series in an earlier post, a DataFrame can also be created from Series. In the following example, two Series objects are created to store player statistics, and then both are passed to the DataFrame() function as a dictionary. Have a look:

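import pandas as pd
# Each Series uses the player names as its index
player_matches = pd.Series({'V.Kohli':200,'K.Rahul':74,'R.Sharma':156,'H.Pandya':80})
player_runs = pd.Series({'V.Kohli':95878,'K.Rahul':3612,'R.Sharma':7863,'H.Pandya':2530})
# Dictionary keys become column names; the Series index becomes the row index
df = pd.DataFrame({'Matches':player_matches,'Runs':player_runs})
print(df)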
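
When the Series do not share exactly the same index labels, pandas aligns them by label and fills the gaps with NaN. A minimal sketch, with made-up labels and numbers:

```python
import pandas as pd

# Hypothetical figures; 'C' is missing from the second Series
matches = pd.Series({'A': 10, 'B': 20, 'C': 30})
runs = pd.Series({'A': 250, 'B': 400})

df = pd.DataFrame({'Matches': matches, 'Runs': runs})
print(df)
print(df['Runs'].isna().tolist())   # [False, False, True]
```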

Creating DataFrame from Dictionaries

A dictionary can also describe 2D data and can be passed to the DataFrame() function. You can create a DataFrame from a dictionary of lists, a dictionary of Series or a list of dictionaries. The following example displays a DataFrame created from a dictionary of lists, where each key becomes a column name:

import pandas as pd
player_stats={'Name':['V.Kohli','K.Rahul','R.Sharma','H.Pandya'],'Matches':[200,74,156,80],'Runs':[9587,3612,7863,2530]}
df = pd.DataFrame(player_stats)
print(df)
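
With dictionary data, the columns parameter can also be used to keep only some of the keys, and a column can be turned into the row index. The sketch below uses a shortened copy of the sample data; df2 is a name introduced only for this example.

```python
import pandas as pd

player_stats = {'Name': ['V.Kohli', 'K.Rahul'],
                'Matches': [200, 74],
                'Runs': [9587, 3612]}

# Keep only some keys, in the order given
df = pd.DataFrame(player_stats, columns=['Name', 'Runs'])
print(df.columns.tolist())          # ['Name', 'Runs']

# Use the names as row labels instead of 0, 1, ...
df2 = pd.DataFrame(player_stats).set_index('Name')
print(df2.loc['K.Rahul', 'Runs'])   # 3612
```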

In the next section of python pandas IP class 12 we will discuss creating a DataFrame using a list of dictionaries.

Creating DataFrame using a list of dictionaries

A list of dictionaries is a list containing multiple dictionary objects. Each dictionary becomes one row and the keys become column names. If a key is missing from one of the dictionaries, NaN (Not a Number) is displayed at that position in the output. Let’s take a look at the following example:

import pandas as pd
players=[{'V.Kohli':107,'K.Rahul':120,'R.Sharma':78,'H.Pandya':30},
{'V.Kohli':35,'R.Sharma':175,'H.Pandya':58},
{'V.Kohli':60,'K.Rahul':32,'H.Pandya':30}
]
df = pd.DataFrame(players)
print(df)
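
The missing values can be located and, if needed, replaced. Here is a small sketch with made-up scores; the row labels 'Match1' and 'Match2' are assumptions for this example.

```python
import pandas as pd

# Hypothetical scores; 'B' is missing from the second dictionary
scores = [{'A': 10, 'B': 20},
          {'A': 30}]
df = pd.DataFrame(scores, index=['Match1', 'Match2'])

print(df['B'].isna().tolist())   # [False, True]
print(df.fillna(0))              # show 0 in place of NaN
```

Note that fillna() returns a new DataFrame here; the original df still contains NaN.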

Creating DataFrame from ndArrays

To create a DataFrame from an ndarray, first create the ndarray using the NumPy module and then pass it to the DataFrame() function. Let’s have a look at the following example:

import pandas as pd
import numpy as np
a = np.array([[10,20,30],[77,66,55]], np.int32)
df = pd.DataFrame(a)
print(df)
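
Row and column labels can be supplied for an ndarray in the same way as for a list. A minimal sketch, where the labels 'X', 'Y', 'Z', 'r1' and 'r2' are assumed for illustration:

```python
import numpy as np
import pandas as pd

a = np.array([[10, 20, 30], [77, 66, 55]], dtype=np.int32)
df = pd.DataFrame(a, columns=['X', 'Y', 'Z'], index=['r1', 'r2'])

print(df.shape)     # (2, 3)
print(df.dtypes)    # each column keeps the array's int32 type
```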

So here we covered all the concepts given in your revised syllabus for python pandas IP class 12.
