Tuesday, June 29, 2021

XII CS: Data File Handling

 

Data File Handling

Introduction: 

Files: A file (or data file) is a stream or sequence of characters/data occupying a named place on the disk, where a sequence of related data is stored.

The programs we have seen so far are transient, i.e., they run for a short time and produce some output, but when they end, their data disappears. If you run the program again, it starts with a clean slate. This happens because the data entered is held in primary memory (RAM), which is volatile (temporary) in nature.

Other programs are persistent: they run for a long time (or all the time); they keep at least some of their data in permanent storage (for example, a hard drive); and if they shut down and restart, they pick up where they left off. Examples of persistent programs are operating systems, which run whenever a computer is switched on, and web servers, which run all the time, waiting for requests to come in on the network.

One of the simplest ways for programs to maintain their data is by reading and writing text files.

Programs that are used in day-to-day business operations rely extensively on files. Payroll programs keep employee data in files; inventory programs keep data about a company’s products in files, accounting systems keep data about a company’s financial operations in files, and so on.

Thus, a file is a document of the data stored on a permanent storage device which can be read, written or rewritten according to requirement. In other words, data is packaged up on the storage device as data structures called files. All files are assigned a name that is used for identification purposes by the operating system and the user.

File I/O (Input-Output) means transfer of data from secondary memory (hard disk) to main memory or vice versa. As shown in the figure, when we are working on a computer system, the file or document stored on the hard disk is brought into RAM (Random Access Memory) for processing, and the results are written back to the disk.


Need of Files:

Data maintained inside files is termed persistent data, i.e., it is permanent in nature. Python allows us to read data from and save data to external files permanently on secondary storage media.

We use files to store data permanently all the time, for example while working in word processing applications, spreadsheets, presentation applications, etc. All of these operations require data to be stored in files so that it may be used later.

Thus, files provide a means of communication between the program and the outside world.

In a nutshell, a file is a stream of bytes, comprising data of interest.

Data file operations

Before we start working with a file, we first need to open it. After performing the desired operations, the file needs to be closed so that the resources tied to it are freed.

Thus, Python file handling takes place in the following order 

  • Opening a file.
  • Performing operations (read, write etc.) or Processing Data
  • Closing the file. 

Using the above-mentioned basic operations, we can process a file in several ways, such as:

  • Creating a file 
  • Traversing a file for displaying data on screen 
  • Appending data in a file 
  • Inserting data in a file 
  • Deleting data from a file 
  • Creating a copy of a file
  • Updating data in a file, etc.

File types

 

Python allows us to create and manage two types of files:

  • Text File
  • Binary File

Text file: a text file consists of a sequence of lines. A line is a sequence of characters (ASCII or Unicode), stored on permanent storage media. In Python 3, strings are Unicode by default, so text files can hold characters from any language (in Python 2 the default was ASCII, and the ‘u’ prefix was needed for Unicode strings). In a text file, each line is terminated by a special character, known as End of Line (EOL). By default, this EOL character is the newline character (‘\n’). At the lowest level, a text file is a collection of bytes. Text files are stored in human-readable form and can be created using any text editor.
We use text files to store character data. For example, test.txt
 

Binary files: binary files are used to store binary data such as images, video files, audio files, etc. A binary file contains arbitrary binary data, usually numbers stored in the file which can be used for numerical operations. So, when we work on a binary file, we have to interpret the raw bit pattern(s) read from the file into the correct type of data in our program.

In a binary file, there is no delimiter for a line. Also, no character translations are carried out in a binary file; as a result, binary files are easier and much faster to read and write than text files.

It is perfectly possible to interpret a stream of bytes originally written as a string as a numeric value, but that will be an incorrect interpretation of the data and we may not get the desired output after the file processing activity. So, in the case of a binary file, it is extremely important that we interpret the data with the correct type while reading the file. Python provides special modules for encoding and decoding data for binary files.

Opening and closing files

To handle data files in Python, we need a file variable, also called a file object or file handle. To work on a file, the first thing we do is open it. This is done by using the built-in function open(). This function creates a file object, which is then used for accessing the various methods available for file manipulation.

open() - opening a file

When we want to read or write a file, we must first open the file. Opening the file communicates with the operating system, which knows where the data for each file is stored. When we open a file for reading, we are asking the operating system to find the file by name and make sure the file exists.

In the example given below, we open the file test.txt, which should be stored in the same folder that we are in when we start python.

The open() function takes the name of the file as the first argument. The second argument indicates the mode of accessing the file. The syntax for open() is:

Syntax:

<file object> = open(file_name, access_mode)

Here, the first argument to open() is the name of the file to be opened and the second argument describes the mode, i.e., how the file will be used throughout the program. The mode is an optional parameter; the default is the read mode (‘r’).

Modes for opening a file:

·        Read(r): to read the file

·        Write (w): to write to the file

·        Append (a): to write at the end of the file

open() returns an object of file type, which we use to manipulate the file in our program. When we work with files, a buffer (an area in memory where data is temporarily stored before being written to the file) is automatically associated with the file when we open it. While writing content to a file, the data first goes to the buffer, and once the buffer is full, the data is written to the file.

Also, when the file is closed, any unsaved data is transferred to the file.

The flush() method is used to force the transfer of data from the buffer to the file.

If the opening is successful, the operating system returns us a file handle. The file handle is not the actual data contained in the file; instead, it is a “handle” that we can use to read the data. We are given a handle if the requested file exists and we have proper permission to read the file.

Example 1:

>>> f = open('test.txt')

>>> print(f)

In the above example, the file is opened in read mode by default, and ‘f’ is the file object (also called the file variable or file handle) returned by open(). Printing it shows a description of the file object, not the contents of the file.

Note: when you open a file in read mode, the given file must exist in the folder; otherwise, Python will raise FileNotFoundError.

In that situation, open() fails with a traceback and we do not get a handle to access the contents of the file.
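
The failure case described above can be handled in a program instead of letting it stop with a traceback. The following is only a minimal sketch: the helper name open_for_reading and the file name used with it are hypothetical, chosen for illustration.

```python
def open_for_reading(filename):
    """Return a file object for filename, or None if the file is missing."""
    try:
        # Mode is omitted, so the file is opened in the default read mode 'r'.
        return open(filename)
    except FileNotFoundError:
        print("Cannot open", filename, "- the file does not exist")
        return None


# 'no_such_file.txt' is a made-up name that is assumed not to exist.
handle = open_for_reading("no_such_file.txt")
if handle is not None:
    print(handle.mode)
    handle.close()
```

Returning None is just one possible design; a program could equally ask the user for another file name.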


File Modes

The second parameter of the open() function is the mode, which is based on read (‘r’), write (‘w’) or append (‘a’). The file mode defines how the file will be accessed. Adding ‘b’ to a mode opens the file as a binary file, and adding ‘+’ allows both reading and writing.


Mode    Description

r       Opens a file for reading only. The file pointer is placed at the beginning of the file. This is the default mode. If the specified file does not exist, FileNotFoundError is raised.

rb      Opens a file for reading only in binary format. The file pointer is placed at the beginning of the file.

r+      Opens a file for both reading and writing. The file pointer will be at the beginning of the file.

rb+     Opens a file for both reading and writing in binary format. The file pointer will be at the beginning of the file.

w       Opens a file for writing only. Overwrites the file if the file exists. If the file does not exist, it creates a new file for writing.

wb      Opens a file for writing only in binary format. Overwrites the file if the file exists. If the file does not exist, it creates a new file for writing.

w+      Opens a file for both writing and reading. Overwrites the existing file if the file exists. If the file does not exist, it creates a new file for reading and writing.

wb+     Opens a file for both writing and reading in binary format. Overwrites the existing file if the file exists. If the file does not exist, it creates a new file for reading and writing.

a       Opens a file for appending. The file pointer is at the end of the file if the file exists. That is, the file is in the append mode. If the file does not exist, it creates a new file for writing.

ab      Opens a file for appending in binary format. The file pointer is at the end of the file if the file exists. That is, the file is in the append mode. If the file does not exist, it creates a new file for writing.

a+      Opens a file for both appending and reading. The file pointer is at the end of the file if the file exists. The file is in the append mode. If the file does not exist, it creates a new file for reading and writing.

ab+     Opens a file for both appending and reading in binary format. The file pointer is at the end of the file if the file exists. The file is in the append mode. If the file does not exist, it creates a new file for reading and writing.

Example 2:

>>> file = open("test.txt", "r+")

This will open the file ‘test.txt’ for reading and writing. Here, the name of the file (by which it exists on secondary storage media) is given as a string constant.

We can also use a variable instead of a constant for the name of the file. If the file already exists, it has to be in the same folder where we are working now; otherwise, we have to specify the complete path.

It is not mandatory to have a file name with an extension. In the above example, the .txt extension is used for our convenience, as it makes it easy to identify the file as a text file. Similarly, for binary files we will use the .dat extension.

In Python 2, the function file() could also be used to create a file object, with the same syntax and usage as open(). It was removed in Python 3, so open() should be used.

close() - closing a file

The close() method of a file object flushes any unwritten information and closes the file object, after which no more reading or writing can be done. Python automatically closes a file when the variable referring to its file object is reassigned to another file. Even so, it is good practice to use the close() method to close a file.

Syntax:

fileobject.close()

The close() method breaks the link between the file object and the file on the disk.

After closing a file (using close()), no tasks can be performed on that file through the file object.

Example 3:

>>> f = open('test.txt')

>>> print("The name of the file to be closed is:", f.name)

The name of the file to be closed is: test.txt

>>> f.close()

In the above example, we have used the ‘name’ property of the file object ‘f’ in the print() statement, which gives the name of the currently used file, ‘test.txt’ in this case. Let us discuss these properties of the file object.

 

Various properties of File Object:

Once open() is successful and file object gets created, we can retrieve various details related to that file using its associated properties.

1. name: name of the opened file.

2. mode: mode in which the file was opened.

3. closed: Boolean value that indicates whether the file is closed or not.

4. readable(): a method (called with parentheses) that returns a Boolean value indicating whether the file can be read or not.
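
The properties listed above can be inspected as in the following sketch. It assumes a scratch file named demo.txt (a hypothetical name) created in a temporary folder, so that it does not disturb any existing file; writable() is a companion method of readable() that is not discussed in the text.

```python
import os
import tempfile

# Create a scratch file in a temporary folder (the name 'demo.txt' is hypothetical).
folder = tempfile.mkdtemp()
path = os.path.join(folder, "demo.txt")

f = open(path, "w")
details_open = (f.mode, f.closed, f.readable(), f.writable())
print("Name     :", f.name)
print("Mode     :", f.mode)        # 'w'
print("Closed   :", f.closed)      # False, the file is still open
print("Readable :", f.readable())  # False, 'w' mode is write-only
f.close()
print("Closed after close():", f.closed)  # True
```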


Reading from a file


Python provides various methods for reading data from a file. We can read character data from text file by using the following read methods:

a) read(): To read the entire data from the file; starts reading from the cursor up to the end of the file.

b) read(n): To read ‘n’ characters from the file, starting from the cursor; if the file holds fewer than ‘n’ characters, it will read until the end of the file.

c) readline(): To read only one line from the file; starts reading from the cursor up to, and including, the end-of-line character.

d) readlines(): To read all lines from the file into a list; starts reading from the cursor up to the end of the file and returns a list of lines.

Let us understand these methods with the help of suitable examples using a text file ‘test.txt’.

Example1:  read() by reading the entire data from a file (test.txt)

 f=open('test.txt','r')
 data=f.read()
 print(data)
 f.close()

Example2:  read(n) by reading the first 10 characters from a file (test.txt)

  f=open('test.txt','r')
  data=f.read(10)
  print(data)
  f.close()


Example3:  to read data line by line using readline() method

  f=open('test.txt','r')
  line1=f.readline()
  print(line1,end='')    # each line already ends with '\n'
  line2=f.readline()
  print(line2,end='')
  f.close()

Example4:  to read all the lines from the file into a list.

  f=open('test.txt','r')
  lines=f.readlines()
  for line in lines:
     print(line,end='')
  f.close()

Example 5:  to read the first 2 characters, and then the remaining lines into a list

  f=open('test.txt','r')
  print(f.read(2))
  print(f.readlines())
  print(f.read(3))
  print ("Remaining data")
  f.close()
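
The way the cursor moves between successive read calls can be checked with a small self-contained sketch. The file name sample.txt and its three lines are assumptions made for the demonstration; the file is created in a temporary folder first so that the example does not depend on an existing test.txt.

```python
import os
import tempfile

# Build a small three-line sample file (contents are made up for illustration).
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
f = open(path, "w")
f.write("Hello user\nyou are working with\npython files\n")
f.close()

f = open(path, "r")
first_five = f.read(5)        # 'Hello'
rest_of_line = f.readline()   # ' user\n' (continues from the cursor)
remaining = f.readlines()     # list of the lines that are left
at_end = f.read()             # '' because the cursor is already at the end
f.close()

print(repr(first_five))
print(repr(rest_of_line))
print(remaining)
print(repr(at_end))
```

Each call starts where the previous one stopped, which is why the final read() returns an empty string.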

Writing to File

  We can write character data into a file in Python by using the following two methods:

1.    write(string)

2.    writelines(sequence of lines)


1.    write(): the write() method takes a string (as parameter) and writes it to the file. For storing data with an end-of-line character, we have to add the ‘\n’ character to the end of the string. Notice the addition of ‘\n’ at the end of every string written to test2.txt in the example below. As the argument to the function has to be a string, a numeric value has to be converted to a string before it is stored.

 

Syntax:

         Fileobject.write(string)


Example: 

#Program to write data to the file

f = open("test2.txt", "w")
f.write("We are writing \n")
f.write("data to a\n")
f.write("text file\n")
print("Data written to the file successfully")
f.close()

    Output:

     Data written to the file successfully

    Contents of the file test2.txt:

     We are writing
     data to a
     text file



#Program to write numeric data to a file.

f = open("newtest.txt", "w")
x = 100
f.write("Hello World \n")
f.write(str(x))   #Numeric value is converted into string
f.close()

This program prints nothing on the screen; the prompt (>>>) simply returns with a blinking cursor, which indicates that the write operation completed without an error.

Note: While writing data to a file using the write() method, we must provide the line separator (‘\n’, the newline character); otherwise, all the data will be written on a single line.


2.    writelines(): the write() method writes one string at a time; it cannot be used for writing a list, tuple, etc., into a file. A sequence of strings (such as a list or tuple of strings) can be written to the file using the writelines() method.

Syntax:
Fileobject.writelines(sequence)


So, whenever we have to write a sequence of strings, we use writelines() instead of calling write() for each string.


#Program to illustrate writelines() method
#for writing list into the file
f = open("test4.txt", "w")
subjects = ["Computer Science\n", "Physics\n", "Chemistry\n", "Maths"]   #avoid naming a variable 'list'
f.writelines(subjects)
print("List of lines written to the file successfully")
f.close()


  Output:

     List of lines written to the file successfully


To see output open test4.txt file in notepad.

 Note:

 When a file is opened for reading (‘r’) or writing (‘w’), the cursor starts from the beginning of the file; in append (‘a’) mode, it starts at the end.


Also to be noted here is that the writelines() method does not add any EOL character to the end of each string. We have to do it ourselves. So, to resolve this problem, we have used the ‘\n’ newline character (in the program) at the end of each list item or string.
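
When the items of a list do not already end with ‘\n’, the separator can be added while writing. The sketch below is one possible way of doing this, not the only one; the file name subjects.txt is hypothetical and the file is created in a temporary folder.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "subjects.txt")  # hypothetical file name
subjects = ["Computer Science", "Physics", "Chemistry", "Maths"]

f = open(path, "w")
f.write("Subjects: " + str(len(subjects)) + "\n")   # numbers must be converted with str()
# writelines() adds no separators, so append '\n' to every item first.
f.writelines(subject + "\n" for subject in subjects)
f.close()

f = open(path)
contents = f.read()
f.close()
print(contents)
```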

You must have noticed that till now we have used close() method in all the programs to close the file in the end. 
In case you don’t  want to close your file explicitly using close(), there is an alternative statement which can be used in the program, i.e., ‘with’ statement which we will discuss now:

With statement

The with statement is used together with open(). We can use this statement to group file operation statements within a block. Using with ensures that all the resources allocated to the file object get deallocated automatically once we leave the block, so the file is closed for us. Even if an exception occurs inside the block, we are not required to close the file explicitly. Its syntax is:


   Syntax:
    with open(filename, mode) as fileobject:
        file manipulation statements

  Example


#Program to illustrate ‘with’ statement
with open("test1.txt", "w") as f:
    f.write("Python\n")
    f.write("is an easy\n")
    f.write("language\n")
    f.write("to work with\n")
    print("Is file closed:", f.closed)    #inside the block
print("Is file closed:", f.closed)        #after the block

OUTPUT:
Is file closed: False
Is file closed: True

Contents of the file test1.txt:
Python
is an easy
language
to work with


APPENDING TO FILE


Append means ‘to add to’; so if we want to add more data to a file which already has some data in it, we will be appending data. In such a case, use the access mode ‘a’, which means:


 ‘open for writing, and if the file exists, then append data to the end of the file’.


In Python, we can use the ‘a’ mode to open an output file in append mode. This means that:

  • If the file already exists, it will not be erased. If the file does not exist, it will be created.
  • When data is written to the file, it will be written at the end of the file’s current contents.

      Syntax:

     <file_object> = open(<filename>, 'a')


Here, ‘a’ stands for append mode, which allows us to add data to the existing data instead of overwriting the file.


For example:

>>> f = open("test1.txt", "a")


#Program to add data to existing data in the file

f = open("test.txt", "a")         #opening file in append mode
f.write("Simple syntax of the language\n")
f.write("makes Python programs easy to read and write")
print("More data appended to the file")
f.close()


Output:

More data appended to the file


Contents of the text file “test.txt”:

Hello user
you are working with
python
files
Simple syntax of the language
makes Python programs easy to read and write
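
The difference between ‘w’ and ‘a’ can be verified with a short sketch. It assumes a scratch file called notes.txt (a hypothetical name) in a temporary folder, and uses tell() only to show where the cursor sits when the file is opened for appending.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "notes.txt")  # hypothetical file name

f = open(path, "w")          # 'w' creates the file (or erases an existing one)
f.write("first line\n")
f.close()

f = open(path, "a")          # 'a' keeps the old contents
start_position = f.tell()    # cursor already sits at the end of the file
f.write("second line\n")
f.close()

f = open(path)
lines = f.readlines()
f.close()
print(start_position)
print(lines)
```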


BINARY FILE OPERATIONS



If we wish to write a structure such as a list or dictionary to a file and read it subsequently, we need to use the Python module pickle.

Pickling refers to the process of converting the structure to a byte stream before writing it to the file.

While reading the contents of the file, a reverse process called unpickling is used to convert the byte stream back to the original structure.

We know that the methods provided in Python for writing/reading a text file work with string parameters. So when we want to work on a binary file, conversion of data at the time of reading as well as writing is required.

The pickle module can be used to store almost any kind of Python object in a file, as it allows us to store Python objects with their structure. So for storing data in binary format, we will use the pickle module.

First we need to import the module.

It provides two methods for the purpose: dump() and load().

For creation of a binary file we will use pickle.dump() to write the object to a file which is opened in binary access mode.


Syntax of dump() method is:

pickle.dump(object, fileobject)


#Program to write list sequence in a binary file

import pickle

def foperation():
    list1 = [10, 20, 30, 40, 100]
    f = open('list.data', 'wb')   #'b' in access mode represents binary file
    pickle.dump(list1, f)         #writing contents to binary file
    print("List added to the binary file")
    f.close()

foperation()



#Program to write dictionary to a binary file

import pickle

dict1 = {'Python': 90, 'Java': 95, 'C++': 85}
f = open('Bin_file.dat', 'wb')
pickle.dump(dict1, f)
f.close()



Once data is stored using dump(), it can then be read back. For reading data from a file, we have to use pickle.load() to read the object from the pickle file.

Syntax of load() is:

object = pickle.load(fileobject)


Note: each call to load() reads back one object written by one call to dump(), so we need to call load() once for every dump() that was made.


#Program to read python dictionary contents back from the file

import pickle

f = open('Bin_file.dat', 'rb')
dict1 = pickle.load(f)    #reading data from binary file
f.close()
print(dict1)
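
The pairing of dump() and load() calls can be seen when more than one object is stored in the same file. The sketch below is an illustration under assumptions: the file name objects.dat is hypothetical, and reading until EOFError is one common way to fetch every stored object when their number is not known in advance.

```python
import os
import pickle
import tempfile

path = os.path.join(tempfile.mkdtemp(), "objects.dat")  # hypothetical file name

# Two separate dump() calls store two separate objects in the same file.
f = open(path, "wb")
pickle.dump([10, 20, 30, 40, 100], f)
pickle.dump({"Python": 90, "Java": 95, "C++": 85}, f)
f.close()

# Call load() repeatedly; it raises EOFError once nothing is left to read.
loaded = []
f = open(path, "rb")
while True:
    try:
        loaded.append(pickle.load(f))
    except EOFError:
        break
f.close()

print(loaded)
```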


Most of the files that we see in our computer system are binary files.

Example:

·        Document files: .pdf, .doc, .xls, etc.

·        Image file: .png, .jpg, .gif, .bmp, etc.

·        Video files:  .mp4, .3gp, .mkv, .avi, etc.

·        Audio files: .mp3, .wav, .mka, .aac, etc.

·        Database files: .mdb, .accde, .frm, .sqlite, etc.

·        Archive files: .zip, .rar, .iso, .7z, etc.

·        Executable files : .exe, .dll, .class, etc.


All binary files follow a specific format. We can open some binary files in a normal text editor, but we cannot read the content present inside the file. This is because binary files are encoded in a binary format which can be understood only by a computer or a machine.

In binary files, there is no delimiter to end a line. Since the data is stored directly in binary form, there is no need to translate it. That is why these files are easy to work with and fast.

The four major operations performed using a binary file are:

1.    Inserting/Appending a record in a binary file

2.    Reading records from a binary file

3.    Searching a record in a binary file

4.    Updating a record in a binary file


Assume that we have a “student” file with the fields roll_no, name and marks.

Inserting/Appending a record in a binary file

Inserting or adding (appending) a record into a binary file requires importing the pickle module into your program and then using the dump() method to write to the file.


#Program to insert/append records in the binary file "student"

import pickle

record = []
while True:
    roll_no = int(input("Enter student Roll no. :"))
    name = input("Enter student name:")
    marks = int(input("Enter the marks obtained :"))
    data = [roll_no, name, marks]
    record.append(data)
    choice = input("Wish to enter more records (Y/N) ?:")
    if choice.upper() == 'N':
        break

f = open("student", "wb")    #'wb' creates the file afresh on every run
pickle.dump(record, f)
print("Record Added")
f.close()


#Program to read the records from the binary file "student"

import pickle

f = open("student", "rb")
stud_rec = pickle.load(f)       #To read the object from the opened file
print("Contents of student file are :")
#reading the fields from the file
for R in stud_rec:
    roll_no = R[0]
    name = R[1]
    marks = R[2]
    print(roll_no, name, marks)
f.close()


#Program to search a record in the binary file "student"

import pickle

f = open("student", "rb")
stud_rec = pickle.load(f)       #To read the object from the opened file
found = 0
rno = int(input("Enter the roll number to search:"))
for R in stud_rec:
    if R[0] == rno:
        print("Successful search", R[1], "Found!")
        found = 1
        break

if found == 0:
    print("Sorry, record not found")
f.close()


#Program to update the name of a student in the binary file "student"

import pickle

f = open("student", "rb+")
stud_rec = pickle.load(f)       #To read the object from the opened file
found = 0
rollno = int(input("Enter the roll number to search:"))
for R in stud_rec:
    rno = R[0]
    if rno == rollno:
        print("Current name is", R[1])
        R[1] = input("New Name:")
        found = 1
        break

if found == 1:
    f.seek(0)       #Taking the file pointer to the beginning of the file
    pickle.dump(stud_rec, f)
    f.truncate()    #Discard leftover bytes if the new data is shorter
    print("Name Updated ! ! !")
f.close()
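
The four operations can also be wrapped in functions so that they are reusable and do not depend on keyboard input. The following is a sketch only: the function names (save_records, load_records, search_record, update_name) and the sample records are hypothetical, and the file is created in a temporary folder.

```python
import os
import pickle
import tempfile

def save_records(path, records):
    """Write the whole list of [roll_no, name, marks] records with one dump()."""
    with open(path, "wb") as f:
        pickle.dump(records, f)

def load_records(path):
    """Read the list of records back with one load()."""
    with open(path, "rb") as f:
        return pickle.load(f)

def search_record(path, roll_no):
    """Return the record with the given roll number, or None if absent."""
    for record in load_records(path):
        if record[0] == roll_no:
            return record
    return None

def update_name(path, roll_no, new_name):
    """Change the name in the matching record; return True if it was found."""
    records = load_records(path)
    for record in records:
        if record[0] == roll_no:
            record[1] = new_name
            save_records(path, records)   # rewrite the file with the change
            return True
    return False

# Sample data below is invented purely for the demonstration.
student_file = os.path.join(tempfile.mkdtemp(), "student")
save_records(student_file, [[1, "Asha", 88], [2, "Ravi", 75]])
updated = update_name(student_file, 2, "Ravi Kumar")
print(search_record(student_file, 2))
print(search_record(student_file, 9))
```

Rewriting the whole file on update is simple and safe for small files; it avoids the need for seek() altogether.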



RELATIVE AND ABSOLUTE PATHS


Files are organized into directories (also called “folders”). Every running program has a “current directory”, which is the default directory for most operations.

For example, while opening a file for reading, Python looks for it in the current directory.

The os module provides functions for working with files and directories (“os” stands for “operating system”). os.getcwd() returns the name of the current directory.


>>>import os

>>> cwd=os.getcwd()

>>>print(cwd)

Files are stored in the current folder/directory by default.

The os module of Python provides various methods to work with files and folders/directories.

For using these functions, we have to import the os module in our program.

cwd stands for “current working directory”.


A string like cwd that identifies a file or directory is called a path.

A relative path starts from the current directory, whereas an absolute path starts from the topmost directory in the file system.

For example, the text file we have created in the previous programs is stored in the current working directory, so its absolute path is that directory followed by the file name.

>>>import os

>>>cwd=os.getcwd()

>>>print (cwd)

C:\user\KVD\AppData\Local\Programs\Python\Python36-32



Alternatively,

>>> f = open("test.txt")

Here, test.txt is the relative file path.


Note: The Python program and external file must be in the same directory, else we will need to enter the entire file path.
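
The relation between the two kinds of path can be sketched with functions from os.path. The file name test.txt is only used to build path strings here; the file itself does not need to exist for this example.

```python
import os

cwd = os.getcwd()                        # absolute path of the current directory
relative_path = "test.txt"               # relative: resolved against cwd
absolute_path = os.path.join(cwd, relative_path)

print(os.path.isabs(relative_path))      # False
print(os.path.isabs(absolute_path))      # True
# abspath() performs the same cwd-based resolution for a relative path.
print(os.path.abspath(relative_path) == absolute_path)
```

Using os.path.join() instead of typing the separator by hand keeps the program independent of whether the system uses ‘\’ or ‘/’.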


STANDARD FILE STREAMS



We use file objects to work with data files; similarly, input/output from standard I/O devices is also performed using standard I/O stream objects. Since we use high-level functions such as input() and print() for performing input/output through the keyboard and monitor, we are not required to explicitly use the I/O stream objects.

The standard streams available in Python are:

  • Standard input stream,
  • Standard output stream, and 
  • Standard error stream.


These standard streams are nothing but file objects, which get automatically connected to your program’s standard devices when we start Python. In order to work with the standard I/O streams, we need to import the sys module. The methods available for I/O operations on them include read(), for reading data from the keyboard, and write(), for writing data on the console, i.e., the monitor.

The three standard streams are described as follows:

1. sys.stdin: data read from sys.stdin comes from standard input, typically the keyboard.


2. sys.stdout: data written to sys.stdout typically appears on your screen, and can be linked to the standard input of another program with a pipe.


3. sys.stderr: error messages are written to sys.stderr.



#Program to implement standard streams

import sys

f = open(r"test.txt")
line1 = f.readline()
line2 = f.readline()
line3 = f.readline()
sys.stdout.write(line1)
sys.stdout.write(line2)
sys.stdout.write(line3)
f.close()


The lines containing the method sys.stdout.write() write the respective lines (from the file ‘test.txt’) on the device/file associated with sys.stdout, which is the monitor.
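
Normal output and error messages can be kept apart by writing them to different streams. The sketch below assumes a hypothetical function show_marks and invented sample values; the streams are passed as parameters only so that the function can also be tried with other file-like objects.

```python
import sys

def show_marks(name, marks, out=sys.stdout, err=sys.stderr):
    """Write a normal message to out, or a complaint to err for bad marks."""
    if 0 <= marks <= 100:
        out.write(name + " scored " + str(marks) + "\n")
        return True
    err.write("Invalid marks for " + name + ": " + str(marks) + "\n")
    return False

show_marks("Asha", 88)     # appears on the standard output stream
show_marks("Ravi", 140)    # appears on the standard error stream
```

Unlike print(), write() does not add a newline on its own, so ‘\n’ is appended explicitly.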

#Program to copy the contents of a file to another file

def fileCopy(file1, file2):
    f1 = open(file1, 'r')
    f2 = open(file2, 'w')
    line = f1.readline()
    while line != '':        #readline() returns an empty string at end of file
        f2.write(line)       #each line already carries its own newline
        line = f1.readline()
    f1.close()
    f2.close()

def main():
    fileName1 = input('Enter the source file name: ')
    fileName2 = input('Enter the destination file name: ')
    fileCopy(fileName1, fileName2)

if __name__ == '__main__':
    main()


RANDOM ACCESS IN FILE USING TELL() AND SEEK()



Python allows random access of the data as well, using the built-in methods seek() and tell().

seek() - the seek() method is used to change the position of the file handle (file pointer) to a given specific position. The file pointer is like a cursor, which defines from where the data has to be read or written in the file.
seek() sets the file’s current position at the given offset, counted from a reference point. The reference point is defined by the optional “from_what” argument, which can have any of three values:
0: sets the reference point at the beginning of the file (absolute positioning), which is the default.
1: sets the reference point at the current file position.
2: sets the reference point at the end of the file.
In Python 3, seek() returns the new position of the file pointer.
seek() can be used in two ways:
  • Absolute positioning
  • Relative positioning
Absolute positioning using seek() gives the byte number at which the file pointer has to position itself.

The syntax for seek() is-

f.seek(file_location)       #where f is the file object

For example, f.seek(20) moves the file pointer to the 20th byte in the file, no matter where it currently is.

Relative positioning has two arguments: the offset and the reference point from which it has to be counted.
The syntax for relative positioning is:

f.seek(offset, from_what)          #where f is the file object

For example:
f.seek(-10,1) from current position, move 10 bytes backward
f.seek(10,1) from current position, move 10 bytes forward
f.seek(-20,1) from current position, move 20 bytes backward
f.seek(10,0) from beginning of file, move 10 bytes forward
Note that in Python 3, seeking relative to the current position or the end of the file with a non-zero offset works only when the file is opened in binary mode (for example ‘rb’); text files allow seeking only from the beginning.

tell() - tell() returns the current position of the file read/write pointer within the file. Its syntax is:
f.tell()      #where f is the file object
When you open a file in reading/writing mode, the file pointer rests at the 0th byte.
When you open a file in append mode, the file pointer rests at the end of the file.

#illustrating seek() and tell()
#reading byte by byte
f = open("test.txt")
print("Before reading:", f.tell())
s = f.read()
print("After reading:", f.tell())
f.seek(0)
#brings file pointer to 0th byte so data can
#be written/read from the next byte
print("From the beginning again:", f.tell())
s = f.read(4)
print("First 4 bytes are:", s)
print(f.tell())
s = f.read(3)
print("Next 3 bytes:", s)
print(f.tell())
f.close()
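
Relative positioning from the current position or from the end of the file can be tried on a file opened in binary mode. The sketch below assumes a scratch file named letters.dat (a hypothetical name) holding the 26 capital letters, so that each position is easy to check by eye.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "letters.dat")  # hypothetical file name
f = open(path, "wb")
f.write(b"ABCDEFGHIJKLMNOPQRSTUVWXYZ")   # 26 bytes
f.close()

f = open(path, "rb")          # binary mode allows all three reference points
f.seek(10)                    # absolute: 10 bytes from the beginning
at_10 = f.read(3)             # b'KLM', cursor is now at byte 13
f.seek(-5, 1)                 # relative: 5 bytes back from the current position
after_back = f.tell()         # 8
f.seek(-4, 2)                 # relative: 4 bytes before the end of the file
last_four = f.read()          # b'WXYZ'
end_position = f.tell()       # 26
f.close()

print(at_10, after_back, last_four, end_position)
```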



INTRODUCTION TO CSV


CSV is a simple flat file in a human-readable format which is extensively used to store tabular data, such as a spreadsheet or database. A CSV file stores tabular data (numbers and text) in plain text.

Files in the CSV format can be imported to and exported from programs that store data in tables, such as Microsoft Excel or OpenOffice Calc.

CSV stands for “comma separated values”. Thus, we can say that a comma separated file is a delimited text file that uses a comma to separate values.

Each line in the file is known as a record. Each record consists of one or more fields, separated by commas (also known as delimiters). The use of the comma as a field separator is the source of the name for this file format.

CSV files are commonly used because they are easy to read and manage, small in size, and fast to process/transfer. Because of these salient features, they are frequently used in software applications, ranging anywhere from online e-commerce stores to mobile apps to desktop tools. For example, Magento, an e-commerce platform, is known for its support of CSV.
Thus, in a nutshell, the several advantages offered by CSV are:
•CSV is faster to handle.
•CSV is smaller in size.
•CSV is easy to generate and import onto a spreadsheet or database.
•CSV is human readable and easy to edit manually.
•CSV is simple to implement and parse.
•CSV is supported by most spreadsheet and database applications.

CSV FILE HANDLING IN PYTHON

For working with CSV files in Python, there is a built-in module called csv. It is used to read and write tabular data in CSV format.
Therefore, to perform read and write operations on a CSV file, we must import the csv module. The csv module can handle CSV files correctly regardless of the operating system on which the files were created.
Along with this module, the open() function is used to open a CSV file and return a file object. We load the module in the usual way using import:

>>> import csv

Like other files in Python, there are two basic operations that can be carried out on a CSV file:
1.Reading from a CSV file.
2.Writing to a CSV file.

Reading from a CSV file is done using the reader object. 

The CSV file is opened as a text file with Python’s built-in open() function, which returns a file object. 
This file object is passed to the reader() function of the csv module, which creates a special type of object (the reader object) to access the CSV file.
The reader object is an iterable that gives us access to each line of the CSV file as a list of fields.
You can also call next() directly on it to read the next line of the CSV file, or
you can loop over it with a for loop to read all the lines of the file.

#Python program to read "student.csv" file contents.
import csv
f=open("student.csv","r") #opens file in read mode
csv_reader=csv.reader(f)  #reader() function
#csv_reader is the csv reader object
#reading the student file record by record
for row in csv_reader:
    print(row)
f.close() #explicitly closing the file
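
The next() call mentioned above is handy for reading the header row separately from the data. The following is a minimal sketch; the file name student_demo.csv and its contents are assumptions, and the file is created first so that the example runs on its own.

```python
import csv
import os
import tempfile

# Hypothetical sample file created only for this illustration
path = os.path.join(tempfile.gettempdir(), "student_demo.csv")
with open(path, "w", newline="") as f:
    f.write("Name,Class\nRohit,XII\nDeep,XII\n")

with open(path, "r", newline="") as f:
    csv_reader = csv.reader(f)
    header = next(csv_reader)      # reads only the first line
    data = list(csv_reader)        # the remaining lines, one list per record

print("Header:", header)
print("Records:", data)
```

Because next() has already consumed the first line, the header does not appear among the records.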


#demonstrate use of with open()
import csv
with open("student.csv","r") as csv_file:
    reader=csv.reader(csv_file)
    rows=[] #list to store the file data
    for rec in reader: #copy data into the list 'rows'
        rows.append(rec)
print(rows)
#the file is closed automatically at the end of the with block


#Python program to count the no. of records present in "student.csv" file
import csv
f=open("student.csv","r")
csv_reader=csv.reader(f) #csv_reader is the csv reader object
c=0
#reading the student file record by record
for row in csv_reader:
    c=c+1
print("No. of records are:", c)
f.close() #explicitly closing the file


Writing to a CSV file

To write a CSV file in Python, we use the csv.writer() function, which is defined in the csv module.
It takes an open file object and returns a special type of object, the writer object, which converts the user's data into a delimited string.
This string is written to the CSV file by the writerow() method.
The writerow() method allows us to write a list of fields to the file as one record.
The fields can be strings or numbers or both. Also, while using writerow(), we do not need to add a newline character to indicate the end of the line;
writerow() does it for you as necessary.

#program to write student data onto a csv file

#importing csv module
import csv
#field names
fields = ['Name', 'Class', 'Year', 'Percent']

#Data rows of csv file
rows = [
['Rohit', 'XII', '2003', '92'],
['Shaurya', 'XI', '2004', '82'],
['Deep', 'XII', '2002', '80'],
['Prerna', 'XI', '2006', '85'],
['Lakshya', 'XII', '2005', '72']
]

# name of csv file
filename = "marks.csv"
#writing to csv file
with open(filename, 'w', newline='') as f:
    #newline='' stops Python from translating line endings;
    #the csv writer itself ends each row with '\r\n' by default
    #creating a csv writer object
    csv_w = csv.writer(f, delimiter=',')
    #writing the fields (the column headings) once
    csv_w.writerow(fields)
    for i in rows:
        #writing the data row-wise
        csv_w.writerow(i)
print("file created")

writerow(fields) writes the fields, which are the column headings, into the file; they have to be written only once.
Using the for loop, the records are traversed one by one from the list rows.
writerow(i) writes the data row-wise inside the for loop, and at the end of the with block the file is closed automatically.
Also, while calling csv.writer(), the delimiter taken is the comma.
We can change the delimiter whenever required by changing the value passed to the delimiter argument.
For example, delimiter='|'. You can use any single character as the delimiter, and if nothing is given, the comma is used by default.
The writerow() method is used to write one row at a time.
We can avoid using the for loop and write all the rows/records in one go.
This can be done by using the writerows() method.
writerows() writes all the rows in one go, so you need not use a for loop and iterations.

Program to write student data onto a csv file using writerows()

#program to write student data onto a csv file

#importing csv module
import csv
#field names
fields = ['Name', 'Class', 'Year', 'Percent']

#Data rows of csv file
rows = [
['Rohit', 'XII', '2003', '92'],
['Shaurya', 'XI', '2004', '82'],
['Deep', 'XII', '2002', '80'],
['Prerna', 'XI', '2006', '85'],
['Lakshya', 'XII', '2005', '72']
]

# name of csv file
filename = "newmarks.csv"

#writing to csv file
with open(filename, 'w', newline='') as f:
    #newline='' stops Python from translating line endings;
    #the csv writer itself ends each row with '\r\n' by default
    #creating a csv writer object
    csv_w = csv.writer(f, delimiter=',')
    #writing the fields (the column headings) once
    csv_w.writerow(fields)
    #writing the rows all at once
    csv_w.writerows(rows)
print("all rows written in one go")
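
To illustrate changing the delimiter, here is a short sketch that writes two rows with '|' as the separator and reads them back. The file name pipe_demo.csv and the sample rows are assumptions for illustration; note that the same delimiter has to be given to csv.reader() when reading.

```python
import csv
import os
import tempfile

# Hypothetical file name and sample rows, used only for this illustration
path = os.path.join(tempfile.gettempdir(), "pipe_demo.csv")
rows = [["Name", "Percent"], ["Rohit", 92]]

# Write the rows using '|' instead of the default comma
with open(path, "w", newline="") as f:
    csv_w = csv.writer(f, delimiter="|")
    csv_w.writerows(rows)

# Look at the raw text that was stored in the file
with open(path, "r", newline="") as f:
    raw_text = f.read()

# Read the records back with the same delimiter
with open(path, "r", newline="") as f:
    read_back = list(csv.reader(f, delimiter="|"))

print(repr(raw_text))
print(read_back)
```

The number 92 comes back as the string '92', since the reader returns every field as text.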



************************
:CBSE BOARD EXAM QUESTIONS:


 Question 1 : Write a statement in Python to perform the following operations:
  • To open a text file "MYPET.TXT" in write mode
  • To open a text file "MYPET.TXT" in read mode
  Solution:
  • f1=open("MYPET.TXT",'w')
  • f2=open("MYPET.TXT",'r')

 Question 2 : 
Write a method in Python to write multiple lines of text contents into a text file "daynote.txt":

      Solution:
    def write1():
        f=open("daynote.txt",'w')
        while True:
            line=input("enter line:")
            f.write(line+"\n")   #add a newline so each entry is a separate line
            choice=input("are there more lines (Y/N):")
            if choice in ('N','n'):
                break
        f.close()
     

Question 3 : 
Write a user-defined function in Python that displays the number of lines starting with 'H' in the file "Para.txt". For example, if the file contains
 
 Whose woods these are I think I know
  His house is in the village though
  He will not see me stopping here
  To watch his woods fill up with snow
  then the line count should be 2.

         Solution:
    def countH():
      f=open("Para.txt", "r")
      lines=0
      L=f.readlines()
      for i in L:
        if i[0]=='H':
            lines=lines+1
      print("No. of lines are:", lines)
      f.close()
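
To try this idea without preparing Para.txt by hand, the sketch below first writes the four sample lines to a temporary file. The function name count_lines_starting_with and the temporary file name are assumptions for this illustration; the function returns the count instead of printing it, and uses startswith() so that a completely empty line causes no error.

```python
import os
import tempfile

def count_lines_starting_with(path, letter):
    """Return how many lines in the file at 'path' begin with 'letter'."""
    count = 0
    with open(path, "r") as f:
        for line in f:
            if line.startswith(letter):
                count += 1
    return count

# Sample text from the question, written to a temporary file
sample = (
    "Whose woods these are I think I know\n"
    "His house is in the village though\n"
    "He will not see me stopping here\n"
    "To watch his woods fill up with snow\n"
)
path = os.path.join(tempfile.gettempdir(), "para_demo.txt")
with open(path, "w") as f:
    f.write(sample)

result = count_lines_starting_with(path, "H")
print("No. of lines are:", result)   # No. of lines are: 2
```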
          

Question 4 : 
Consider a binary file Employee.dat containing details such as empno, Ename, salary (separator ':'). Write a Python function to display details of those employees who are earning between 20000 and 40000 (both values inclusive).

         Solution:
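    def Readfile():
        f=open("Employee.dat","rb")     #binary mode, so each line is read as bytes
        rec=f.readline()
        while rec:
            data=rec.decode().strip().split(':')   #bytes to str, then split into fields
            if 20000<=float(data[2])<=40000:
                print(rec.decode().strip())
            rec=f.readline()
        f.close()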
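
The solution assumes that every record is stored on its own line in the form empno:Ename:salary. Under that assumption, a quick way to test the logic is to create a small file first. The sample records, the temporary file name and the function name employees_in_range below are all made up for illustration.

```python
import os
import tempfile

def employees_in_range(path, low, high):
    """Return the records (as strings) whose salary lies between low and high, inclusive."""
    selected = []
    with open(path, "rb") as f:            # binary mode: each line is bytes
        for raw in f:
            rec = raw.decode().strip()     # convert bytes to str, drop the newline
            if not rec:                    # skip blank lines
                continue
            fields = rec.split(":")
            if low <= float(fields[2]) <= high:
                selected.append(rec)
    return selected

# Made-up sample data written to a temporary file
path = os.path.join(tempfile.gettempdir(), "employee_demo.dat")
with open(path, "wb") as f:
    f.write(b"101:Amit:15000\n102:Neha:20000\n103:Ravi:40000\n104:Sara:45000\n")

result = employees_in_range(path, 20000, 40000)
for rec in result:
    print(rec)
```

Both boundary values, 20000 and 40000, are included because the comparison uses <= on each side.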

       Question 5 : 
       Write a function in Python to count the number of lines in a text file "story.txt" which start with the letter 'A'.
        Solution:
        
    def Countlines():
        f=open("story.txt", "r")
        L=f.readlines()
        count=0
        for i in L:
            if i[0]=='A' or i[0]=='a':
                count=count+1
        print("No. of lines are:", count)
        f.close()



      **************
 






      






