How To Create Dataframes In Python

A DataFrame is a two-dimensional data structure that stores data in tabular form, as rows and columns. The pandas DataFrame is a powerful component for storing data for data analytics. It provides many features for manipulating and restructuring the data it holds, and for performing mathematical calculations on that data.

Features of DataFrame

DataFrames are mutable: both the data they hold and their size can be modified.

They have labeled axes (rows and columns).

They support arithmetic operations on rows and columns.
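The three features above can be illustrated with a minimal sketch. The student names, subjects, and marks below are made up for illustration and do not come from any real dataset.

```python
import pandas as pd

# Hypothetical student marks, used only for illustration.
students = pd.DataFrame(
    {"Maths": [70, 85, 90], "Science": [65, 80, 95]},
    index=["Alex", "Bob", "Clarke"],  # labeled row axis
)

# Mutable: add a new column and change a single value in place.
students["English"] = [60, 75, 88]
students.loc["Bob", "Maths"] = 87

# Arithmetic across columns (one total per row) and down columns (one mean per column).
students["Total"] = students["Maths"] + students["Science"] + students["English"]
column_means = students[["Maths", "Science", "English"]].mean()

print(students)
print(column_means)
```

Rows and columns are addressed by their labels here (`"Bob"`, `"Maths"`) rather than by position, which is what the labeled axes make possible.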

A pandas DataFrame looks like a table, for example one in which students' data is stored in rows and columns.

[Figure: Pandas Dataframe]


Basic Components Of Creating Pandas DataFrame

A pandas DataFrame can be created using the constructor below:

pandas.DataFrame( data, index, columns, dtype, copy)


Specification Of Components For Pandas Dataframe

Data

The data can take various forms, such as an ndarray, Series, map, list, dict, constant, or another DataFrame.

Index

The row labels to use for the resulting frame. This is optional; if no index is passed, it defaults to np.arange(n).

Columns

The column labels to use for the resulting frame. This is optional; if no column labels are passed, they default to np.arange(n).

Dtype

The data type to force on the DataFrame. This is optional; if it is omitted, the type of each column is inferred from the data.

Copy

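This parameter controls whether the input data is copied into the DataFrame. The default is False in older pandas versions; newer versions default to None, in which case the behavior depends on the type of input.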
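As a minimal sketch of how these parameters fit together, the example below passes all five of them explicitly. The array contents and the labels ("r1", "a", and so on) are arbitrary choices made for illustration.

```python
import numpy as np
import pandas as pd

values = np.array([[1, 2], [3, 4], [5, 6]])

df = pd.DataFrame(
    data=values,                # the values to store
    index=["r1", "r2", "r3"],   # row labels
    columns=["a", "b"],         # column labels
    dtype=float,                # force every column to float
    copy=True,                  # do not share memory with `values`
)
print(df)
print(df.dtypes)

# When index and columns are omitted, default integer labels are used.
default_df = pd.DataFrame(values)
print(default_df)
```

Because a copy was requested, later changes to `df` should not be reflected in the original `values` array.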


Types Of Inputs To Create DataFrame


A pandas DataFrame can be created from various types of input, such as:

1. Lists
2. dict
3. Series
4. NumPy ndarrays
5. Another DataFrame
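Lists and dicts are covered in the sections that follow. For the remaining input types, a brief sketch might look like this; the variable names and values are illustrative only.

```python
import numpy as np
import pandas as pd

# From a Series: the Series name becomes the column label.
ages = pd.Series([28, 34, 29], name="Age")
from_series = pd.DataFrame(ages)

# From a 2-D NumPy ndarray: each inner array becomes one row.
array = np.array([[1, 2], [3, 4]])
from_array = pd.DataFrame(array, columns=["x", "y"])

# From another DataFrame: copy=True requests an independent copy.
from_frame = pd.DataFrame(from_array, copy=True)

print(from_series)
print(from_array)
print(from_frame)
```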


How To Create An Empty Dataframe

To create a DataFrame in Python, you first have to import the pandas library.

import pandas as pd

df = pd.DataFrame()

print(df)

Output:

Empty DataFrame

Columns: []

Index: []
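An empty DataFrame can also be checked programmatically instead of by reading the printed output. The sketch below uses the `empty` and `shape` attributes; the column names in the second frame are only an example.

```python
import pandas as pd

df = pd.DataFrame()
print(df.empty)   # True when the frame holds no data
print(df.shape)   # (number of rows, number of columns)

# An empty frame can still carry column labels, ready to be filled later.
named = pd.DataFrame(columns=["Name", "Age"])
print(named)
```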

How To Create a DataFrame from Lists

The dataframe is created from a list of values.

import pandas as pd
data = [1,2,3,4,5]
df = pd.DataFrame(data)
print(df)

Output:

     0
0    1
1    2
2    3
3    4
4    5

A DataFrame can also be created from a list of lists, where each inner list becomes one row. You can assign the column names at the same time.

import pandas as pd
data = [['Alex',10],['Bob',12],['Clarke',13]]
df = pd.DataFrame(data,columns=['Name','Age'])
print(df)

Output:

      Name      Age
0     Alex      10
1     Bob       12
2     Clarke    13

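You can also control the data type of the DataFrame. The dtype parameter of the constructor applies to the whole frame, so with mixed data such as names and ages, newer pandas releases raise an error instead of converting only the numeric column. Converting the Age column with astype works across versions.

import pandas as pd
data = [['Alex',10],['Bob',12],['Clarke',13]]
df = pd.DataFrame(data,columns=['Name','Age'])
# Convert only the numeric column to float
df = df.astype({'Age': float})
print(df)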

Output:

      Name     Age
0     Alex     10.0
1     Bob      12.0
2     Clarke   13.0
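To confirm which type each column actually ended up with, the `dtypes` attribute can be inspected. This is a small sketch using the same sample data; it converts only the Age column.

```python
import pandas as pd

data = [["Alex", 10], ["Bob", 12], ["Clarke", 13]]
df = pd.DataFrame(data, columns=["Name", "Age"])

# Convert a single column and leave the others untouched.
df["Age"] = df["Age"].astype(float)

# One entry per column, showing its data type.
print(df.dtypes)
```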



How To Create a DataFrame from Dict of ndarrays / Lists

To create a DataFrame from a dict of ndarrays or lists, all of them must be of the same length. If an index is passed, its length must equal the length of the arrays.

If no index is passed, the index defaults to range(n), where n is the array length.

import pandas as pd
data = {'Name':['Tom', 'Jack', 'Steve', 'Ricky'],'Age':[28,34,29,42]}
df = pd.DataFrame(data)
print(df)

Output:

       Name     Age
0       Tom      28
1      Jack      34
2     Steve      29
3     Ricky      42

Note: Observe the values 0, 1, 2, 3. They are the default index labels, assigned to the rows using range(n).
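The two rules about lengths can be sketched as follows. The index labels ("rank1" and so on) are arbitrary and chosen only for illustration.

```python
import pandas as pd

data = {"Name": ["Tom", "Jack", "Steve", "Ricky"], "Age": [28, 34, 29, 42]}

# The index has 4 labels, matching the length of each list.
df = pd.DataFrame(data, index=["rank1", "rank2", "rank3", "rank4"])
print(df)

# Lists of unequal length are rejected by the constructor.
try:
    pd.DataFrame({"Name": ["Tom", "Jack"], "Age": [28]})
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

With a custom index, rows can then be selected by label, for example `df.loc["rank2"]`.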


