Data File Handling
Data Files
In any software application, we often need to store data permanently so that it can be reused later. In Python, we do this using data files. These files store information related to specific applications and can be categorized into two main types.
1. Types of Data Files
- Text Files
- Binary Files
1.1 Text Files
Text files store data as a sequence of human-readable characters, encoded as ASCII or Unicode (the default encoding depends on your system settings).
Key Features of Text Files:
- Text is stored in a human-readable format.
- Each line is terminated by a special character, known as EOL (End Of Line).
- The EOL character depends on the operating system:
- \n → Linux/Unix (newline)
- \r\n → Windows (carriage return + newline)
- \r → Mac (older systems)
Python handles these characters automatically when reading from or writing to text files.
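As a minimal sketch of this automatic handling, the snippet below writes three lines with three different line endings as raw bytes and then reads them back in text mode. The file name eol_demo.txt and the temporary folder are assumptions made only for this illustration.

```python
import os
import tempfile

# Write three lines, each with a different line ending, as raw bytes.
path = os.path.join(tempfile.mkdtemp(), "eol_demo.txt")
with open(path, "wb") as f:
    f.write(b"unix\nwindows\r\nold mac\r")

# In text mode, Python translates all three endings to "\n" while reading.
with open(path, "r") as f:
    lines = f.readlines()

print(lines)  # ['unix\n', 'windows\n', 'old mac\n']
```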
Types of Text Files:
1. Regular Text Files
- Stores plain text exactly as typed.
- New lines are created using newline characters
- Common extension: .txt
- Example Content:
I am a simple text
2. Delimited Text Files
- Use a delimiter (such as a comma or tab) to separate values.
- Common file types:
- TSV: Tab Separated Values
- CSV: Comma Separated Values
- Extensions: .txt, .csv
- Example Formats:
TSV File (values separated by a tab character):
One Two Three
CSV File:
One,Two,Three
CSV and TSV files are very commonly used for data exchange between software like Excel, databases, and Python programs.
| Type | Example Content |
|---|---|
| Regular Text File | I am a simple text |
| TSV File | I am simple text (words separated by tabs) |
| CSV File | I,am,simple,text |
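To see what a delimiter does in practice, here is a small sketch that splits one TSV line and one CSV line into their values with str.split(). This works only for simple data without quoted fields; the csv module covered later is the safer choice for real files.

```python
tsv_line = "One\tTwo\tThree"   # "\t" is the tab character
csv_line = "One,Two,Three"

# Split each line on its own delimiter.
tsv_fields = tsv_line.split("\t")
csv_fields = csv_line.split(",")

print(tsv_fields)  # ['One', 'Two', 'Three']
print(csv_fields)  # ['One', 'Two', 'Three']
```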
Other Text File Types
- .ini files (used for configuration)
- .rtf files (Rich Text Format)
1.2 Binary Files
Binary files store data in the form of raw bytes - not directly readable as text. These are used when speed, performance, or complex data structures are involved.
Key Features of Binary Files:
- Data is stored in byte format.
- More efficient for storing images, audio, video, and complex Python objects.
- Cannot be read as plain text in a text editor.
Common Binary File Extension: .dat
Technical Terms
- 1 Byte = 8 bits
- 1 Nibble = 4 bits
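The terms above can be illustrated with a short sketch that turns a piece of text into bytes and then looks at the bits and nibbles of the first byte. The sample text "Hi" is chosen only for this example.

```python
data = "Hi".encode("ascii")    # text -> raw bytes
print(list(data))              # [72, 105]

first = data[0]                # one byte, value 72
bits = format(first, "08b")    # the 8 bits of that byte
high_nibble = first >> 4       # upper 4 bits
low_nibble = first & 0x0F      # lower 4 bits

print(bits)                    # 01001000
print(high_nibble, low_nibble) # 4 8
```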
2. Opening and Closing Files in Python
Working with files is an essential part of many Python programs. Whether you want to store, update, or read data from a file, the first step is always the same: open the file. Once you're done, it's equally important to close the file properly.
Let's understand this topic in a simple, beginner-friendly way!
Opening a File in Python
To work with a file in Python, you must open it first using the built-in open() function. This function allows you to open a file in different modes based on what you want to do: read, write, or append data.
Syntax:
file_object = open("filename.txt") # Opens the file in read mode by default
file_object = open("filename.txt", "mode") # Opens the file in the specified mode
Examples:
myfile = open("test.txt") # Opens in default read mode ('r')
myfile1 = open("test.txt", "r") # Opens in read mode
myfile2 = open("test.txt", "w") # Opens in write mode
myfile3 = open("test.txt", "a") # Opens in append mode
file1 = open("E:\\main\\result.txt", "w") # Windows path with double backslashes
Note: A file object is also known as a file handle.
Raw String Example:
f1 = open(r"c:\temp\data.txt", "r")
Here, the r before the string makes it a raw string, which means backslash sequences such as \t or \n are kept as literal characters instead of being interpreted as a tab or a newline. This is useful for Windows file paths!
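The difference is easy to check without opening any file. The sketch below compares a normal string, a raw string, and a string with doubled backslashes; the short path c:\temp is used only for illustration.

```python
normal = "c:\temp"     # "\t" is interpreted as a single tab character
raw = r"c:\temp"       # the backslash and the "t" stay as two characters
escaped = "c:\\temp"   # doubling the backslash gives the same text as the raw string

print(len(normal))     # 6
print(len(raw))        # 7
print(raw == escaped)  # True
```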
File Access Modes in Python
Depending on what you want to do with the file, Python provides several access modes:
| Text Mode | Binary Mode | Description | Notes |
|---|---|---|---|
| 'r' | 'rb' | Read only | File must exist, or you'll get an error |
| 'w' | 'wb' | Write only | Creates a new file if it doesn't exist. If it does, existing data is erased |
| 'a' | 'ab' | Append only | Appends data to file. Creates file if it doesn't exist |
| 'r+' | 'rb+' | Read & Write | File must exist, supports both reading and writing |
| 'w+' | 'wb+' | Write & Read | Creates a new file or truncates existing one |
| 'a+' | 'ab+' | Append & Read | Appends to file if it exists, or creates new one |
Tip
Adding 'b' to a mode (like 'rb', 'wb') opens the file in binary mode, useful for non-text data like images, videos, etc.
Summary Table
| Task | Mode |
|---|---|
| Reading a file | 'r' or 'rb' |
| Writing new data (and removing old content) | 'w' or 'wb' |
| Adding new data to the end of the file | 'a' or 'ab' |
| Reading and writing without removing data | 'r+' or 'rb+' |
| Writing and reading with content replaced | 'w+' or 'wb+' |
| Reading existing data and appending new data | 'a+' or 'ab+' |
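The effect of the most common modes can be checked with a short sketch like the one below. It works on a throwaway file in a temporary folder (the names modes_demo.txt and missing.txt are assumptions for this example) and uses a with block, which closes each file automatically.

```python
import os
import tempfile

folder = tempfile.mkdtemp()
path = os.path.join(folder, "modes_demo.txt")

def contents():
    """Return the whole file as one string."""
    with open(path, "r") as f:
        return f.read()

with open(path, "w") as f:    # 'w' creates the file
    f.write("first\n")
with open(path, "a") as f:    # 'a' adds to the end
    f.write("second\n")
after_append = contents()

with open(path, "r+") as f:   # 'r+' keeps old data; writing starts at position 0
    f.write("FIRST")
after_rplus = contents()

with open(path, "w") as f:    # 'w' on an existing file erases it first
    f.write("new\n")
after_rewrite = contents()

try:
    open(os.path.join(folder, "missing.txt"), "r")   # 'r' needs an existing file
    missing_error = None
except FileNotFoundError as exc:
    missing_error = type(exc).__name__

print(repr(after_append), repr(after_rplus), repr(after_rewrite), missing_error)
```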
Close a File
Once you're done working with a file, it's very important to close it using the close() method. Although Python will usually close files automatically when a program ends, it's a best practice to close them manually. This ensures that all resources are properly freed and no data is lost.
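One way to make sure close() always runs, even if an error occurs while working with the file, is a try/finally block; a with block does the same thing more compactly. This is a minimal sketch, and the file name close_demo.txt is an assumption for the example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "close_demo.txt")

f = open(path, "w")
try:
    f.write("some data\n")
finally:
    f.close()          # runs even if write() raised an error
print(f.closed)        # True

# A with block closes the file automatically when the block ends.
with open(path) as g:
    text = g.read()
print(g.closed)        # True
```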
Syntax:
file_object.close()
Example:
f1 = open("example.txt", "r")
# read or write operations
f1.close() # Closing the file
Text File in Python
What is a Text File?
A text file is a simple file that contains human-readable characters. Each line in a text file ends with a special character called EOL (End of Line). In Python, the default EOL character is \n (newline).
Reading from Text Files in Python
To read data from a text file, Python provides three useful methods:
| S.No | Method | Syntax | Description |
|---|---|---|---|
| 1. | read() | f.read([n]) | Reads the entire file, or up to n characters if n is given (n bytes in binary mode). |
| 2. | readline() | f.readline([n]) | Reads one line at a time, or at most n characters of that line if n is given. |
| 3. | readlines() | f.readlines() | Reads all lines and returns them as a list of strings. |
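The three methods can be compared on a small file whose contents are known. The sketch below first creates such a file (read_demo.txt in a temporary folder, an assumption for this example) and then reads it piece by piece.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "read_demo.txt")
with open(path, "w") as f:
    f.write("Line 1\nLine 2\nLine 3\n")

with open(path) as f:
    first_four = f.read(4)        # first 4 characters
    rest_of_line = f.readline()   # remainder of the first line
    remaining = f.readlines()     # all lines that are left, as a list
    at_eof = f.read()             # nothing left to read

print(repr(first_four))    # 'Line'
print(repr(rest_of_line))  # ' 1\n'
print(remaining)           # ['Line 2\n', 'Line 3\n']
print(repr(at_eof))        # ''
```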
Example 1: Using read()
f1 = open("test.txt")
data = f1.read(15)
print(data)
f1.close()
Output:
12345678Python
Example 2: Using readline()
f1 = open("test.txt")
line = f1.readline(11)
print(line)
f1.close()
Output:
Python is a
Example 3: Using readlines()
f1 = open("test.txt")
lines = f1.readlines()
print(lines)
f1.close()
Output:
['Line 1\n', 'Line 2\n', 'Line 3\n']
Writing to Text Files
Python gives you two main ways to write into a file:
| S.No | Method | Syntax | Description |
|---|---|---|---|
| 1. | write() | f.write(str1) | Writes a single string into the file. |
| 2. | writelines() | f.writelines(list_of_strs) | Writes a list of strings into the file. |
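Here is a short sketch of both methods that runs without keyboard input. The list of names and the file name write_demo.txt are made up for the example. Note that neither method adds a newline for you, so "\n" has to be written explicitly.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "write_demo.txt")
names = ["Asha", "Ravi", "Meena"]   # hypothetical data in place of input()

with open(path, "w") as f:
    count = f.write("Names\n")                    # returns the number of characters written
    f.writelines(name + "\n" for name in names)   # no newline is added automatically

with open(path) as f:
    text = f.read()

print(count)       # 6
print(repr(text))  # 'Names\nAsha\nRavi\nMeena\n'
```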
Example 1: Using write()
f1 = open("test.txt", 'w')
name = input("Enter your name: ")
f1.write(name)
f1.close()
Example 2: Using writelines()
f1 = open("test.txt", 'w')
names = []
for i in range(3):
    name = input("Enter name: ")
    names.append(name + '\n')
f1.writelines(names)
f1.close()
Appending to a File
If you use 'w' mode, old content will be erased. To add new content without deleting existing data, use append mode ('a'):
f = open("test.txt", 'a')
f.write("New line added\n")
f.close()
You can also use 'r+' or 'a+' for reading and writing at the same time.
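As a rough sketch of 'a+' in action, the snippet below appends a line and then reads the whole file back through the same file object. After writing, the file pointer sits at the end, so it is moved back to the start with seek(0) before reading. The file name append_demo.txt is an assumption for the example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "append_demo.txt")
with open(path, "w") as f:
    f.write("old line\n")

with open(path, "a+") as f:
    f.write("New line added\n")   # always goes to the end of the file
    f.seek(0)                     # move back to the beginning before reading
    text = f.read()

print(repr(text))  # 'old line\nNew line added\n'
```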
Cleaning Data: Using strip()
Sometimes, you may want to remove extra spaces or newline characters.
Useful Functions to Remove Whitespace
| Function | What it does |
|---|---|
| strip() | Removes whitespace (or the characters you pass as an argument) from both ends of the string |
| lstrip() | Removes them from the left side only (leading whitespace) |
| rstrip() | Removes them from the right side only (trailing whitespace like \n) |
Let's Understand with Examples
1. Removing the newline character \n from a file line
f = open("poem.txt", 'r')
line = f.readline()
line = line.rstrip('\n') # Removes the newline character at the end
f.close()
Why use rstrip('\n')?
Because when you use readline(), it includes the \n at the end of the line. We remove it to avoid extra blank lines in output.
2. Removing leading spaces from a line
f = open("poem.txt", 'r')
line = f.readline()
line = line.lstrip() # Removes spaces from the beginning of the line
f.close()
Use case: If your text has unnecessary spaces before the actual content, lstrip() helps to clean it.
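The three functions can be compared side by side on a single string. In this sketch the sample line is made up; it has leading spaces, one trailing space, and a newline at the end.

```python
line = "   Twinkle twinkle little star \n"

both = line.strip()              # whitespace removed from both ends
left = line.lstrip()             # only leading whitespace removed
right = line.rstrip()            # only trailing whitespace removed
newline_only = line.rstrip("\n") # only the trailing newline removed
custom = "xxhixx".strip("x")     # strip can also remove other characters

print(repr(both))          # 'Twinkle twinkle little star'
print(repr(left))          # 'Twinkle twinkle little star \n'
print(repr(right))         # '   Twinkle twinkle little star'
print(repr(newline_only))  # '   Twinkle twinkle little star '
print(repr(custom))        # 'hi'
```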
Understanding the Role of File Pointer in File Handling
In Python, when we open a file for reading or writing, a file pointer is automatically created.
What is a File Pointer?
- A file pointer is like a marker or cursor.
- It tells where the next read or write operation will happen in the file.
Why is the File Pointer Important?
- Whenever we:
- Read data from a file
- Write data into a file
...Python uses the file pointer to know where to start the operation.
After the operation:
- The action happens from the file pointer's position.
- The file pointer then moves ahead automatically by the number of bytes read or written.
Let's Understand with an Example
f = open("Mark.txt", "r")
- This command opens the file in read mode.
- The file pointer is placed at the beginning of the file.
Suppose Mark.txt contains the following text:
python programming Language
Example 1: Reading 1 Byte
ch = f.read(1)
This reads 1 byte from the file, starting at the current pointer position.
Now:
- ch will store: 'p'
- The file pointer moves forward by 1 byte and now points to 'y'
Example 2: Reading 2 More Bytes
str1 = f.read(2)
This reads the next 2 bytes.
Now:
- str1 will store: 'yt'
- The file pointer moves forward again, now pointing to 'h'
File content: p y t h o n p r o g r a m
- Initial pointer position: 0 (pointing to 'p')
- After f.read(1): 1 (pointing to 'y')
- After f.read(2): 3 (pointing to 'h')
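The walk-through above can be reproduced with a short sketch. It creates its own copy of Mark.txt in a temporary folder (an assumption, so that the example runs anywhere) and then performs the same reads.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "Mark.txt")
with open(path, "w") as f:
    f.write("python programming Language")

with open(path, "r") as f:
    ch = f.read(1)         # pointer moves from 'p' to 'y'
    str1 = f.read(2)       # pointer moves on to 'h'
    next_char = f.read(1)  # reads the character the pointer was on
    rest = f.read()        # everything from the pointer to the end

print(ch, str1, next_char)  # p yt h
print(rest)                 # on programming Language
```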
Standard Input, Output, and Error Streams in Python
- Standard Input - to take input from the keyboard
- Standard Output - to show normal results on the screen
- Standard Error - to show error messages
| Stream | Meaning | Default Device |
|---|---|---|
| stdin | Standard input stream | Keyboard |
| stdout | Standard output stream | Monitor/Display |
| stderr | Standard error output stream | Monitor (for errors) |
Module Used: sys
import sys
Functions from sys Module
| Function | Description |
|---|---|
| sys.stdin.read() | Reads input from keyboard |
| sys.stdout.write() | Sends output to the display (like print()) |
| sys.stderr.write() | Sends error message to screen |
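A small sketch of how sys.stdout.write() differs from print(): it writes exactly the text it is given and does not add a newline. To make the result checkable, the output is temporarily captured with contextlib.redirect_stdout and io.StringIO, which are used here only for the demonstration.

```python
import contextlib
import io
import sys

buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):   # capture standard output for inspection
    sys.stdout.write("no newline added")
    sys.stdout.write("\n")                 # the newline must be written explicitly
    print("print adds one")

captured = buffer.getvalue()
print(repr(captured))  # 'no newline added\nprint adds one\n'

# Error messages go to a separate stream.
sys.stderr.write("this goes to the error stream\n")
```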
Example:
import sys
f = open("test.txt")
line1 = f.readline()
line2 = f.readline()
sys.stdout.write(line1)
sys.stdout.write(line2)
sys.stderr.write("No Errors occurred\n")
f.close()
Understanding Absolute and Relative Paths
In Python (and general computing), file paths help you locate and access files or folders. There are two main types of paths:
1. Absolute Path
An absolute path gives the complete location of a file or folder starting from the root directory.
It shows the full hierarchy — all the folders and subfolders — needed to reach a file.
Format:
DriveName:\Folder1\Folder2\...\Filename.Extension
Examples:
- E:\PROJECT\a\test.txt - File named test.txt inside folder a in drive E.
- D:\SALES - Folder named SALES in drive D.
Key Points:
- The \ symbol represents folder levels and separates directories.
- An absolute path always starts from the root folder or drive.
- The full name of a file or folder including its location is called a pathname.
2. Relative Path
A relative path gives the location of a file relative to the current working directory (the folder you're currently in).
You can use:
- . (single dot) → refers to current folder
- .. (double dots) → refers to parent folder (one level up)
Examples:
- .\Two.doc → File Two.doc in the current folder
- ..\two.txt → File two.txt in the parent folder
- ..\project\report.dat → File report.dat inside project folder, which is in the parent folder
Relative paths are shorter and more flexible, especially when your code or project is being moved or shared with others.
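The two kinds of path can be told apart in code. This sketch uses ntpath, the Windows flavour of os.path that can be imported on any operating system, so the Windows-style examples above behave the same everywhere. The current folder E:\PROJECT\a is an assumption for the example.

```python
import ntpath   # Windows version of os.path, importable on any OS
import os

absolute = r"E:\PROJECT\a\test.txt"
relative = r"..\two.txt"

print(ntpath.isabs(absolute))  # True
print(ntpath.isabs(relative))  # False

# Resolve the relative path against an assumed current folder.
current_folder = r"E:\PROJECT\a"
resolved = ntpath.normpath(ntpath.join(current_folder, relative))
print(resolved)                # E:\PROJECT\two.txt

# On your own machine, these show the real working directory and
# the absolute form of a relative file name.
print(os.getcwd())
print(os.path.abspath("test.txt"))
```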
Working with Binary Files in Python
Binary files are different from regular text files. They store data in the form of bytes, making them ideal for saving complex data like images, audio, and even Python objects.
What is a Binary File?
A binary file stores data as a stream of bytes — not as plain text.
Key Features:
- Not human-readable (you will see strange symbols if you try).
- Meant for storing non-textual or complex data types.
- Requires specific software or Python modules to read/write.
- Usually smaller in size than text files.
- Good for storing Python objects like dictionaries, lists, etc.
Pickling and Unpickling (Serialization and Deserialization)
To store Python objects in a binary file, we use pickling.
To read them back, we use unpickling.
- Pickling = Python Object → Bytes (for writing)
- Unpickling = Bytes → Python Object (for reading)
How to Use Pickle in Python?
Step-by-Step:
- Import the module: import pickle
- Open a file in binary mode ('wb' for write, 'rb' for read).
- Write or read objects using pickle.dump() or pickle.load().
- Close the file.
Creating & Closing a Binary File
file = open("student.dat", "wb+") # 'wb+' = write + read binary
file.close()
Writing Data to a Binary File (Pickling)
import pickle
emp1 = {'empno': 1201, 'Name': 'Zoya', 'Age': 25}
empfile = open('Emp.dat', 'wb')
pickle.dump(emp1, empfile) # write object into file
print("File Created")
empfile.close()
This stores the dictionary emp1 into a binary file named Emp.dat.
Reading Data from a Binary File (Unpickling)
import pickle
empfile = open("Emp.dat", "rb")
try:
    while True:
        emp = pickle.load(empfile)
        print(emp)
except EOFError:
    empfile.close()
This reads all objects from the binary file until end of file (EOF).
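Putting writing and reading together, the following sketch pickles two records into one file and loads them back until EOFError signals the end. The second employee record and the temporary folder are assumptions added for the example. Keep in mind that pickle files should only be loaded from sources you trust.

```python
import os
import pickle
import tempfile

path = os.path.join(tempfile.mkdtemp(), "Emp.dat")
employees = [
    {'empno': 1201, 'Name': 'Zoya', 'Age': 25},
    {'empno': 1202, 'Name': 'Amit', 'Age': 31},   # hypothetical second record
]

# Pickling: one dump() call per record.
with open(path, "wb") as f:
    for emp in employees:
        pickle.dump(emp, f)

# Unpickling: keep loading until the end of the file is reached.
loaded = []
with open(path, "rb") as f:
    while True:
        try:
            loaded.append(pickle.load(f))
        except EOFError:
            break

print(loaded)
```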
Updating a Binary File
Updating means modifying the data inside the file.
3-Step Process:
- Search for the record you want to update.
- Edit the object in memory.
- Write it back into the file — usually in a temporary file, and then replace the old one.
| Action | Function/Method | File Mode |
|---|---|---|
| Write (object → file) | pickle.dump(obj, file) | 'wb' or 'wb+' |
| Read (file → object) | pickle.load(file) | 'rb' or 'rb+' |
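The three-step update process described above could be sketched as follows. Every record is copied to a temporary file, the matching one is changed on the way, and the temporary file then replaces the original. The function names update_age and read_all, the file names, and the sample records are all hypothetical.

```python
import os
import pickle
import tempfile

folder = tempfile.mkdtemp()
path = os.path.join(folder, "Emp.dat")
temp_path = os.path.join(folder, "Emp.tmp")

with open(path, "wb") as f:
    pickle.dump({'empno': 1201, 'Name': 'Zoya', 'Age': 25}, f)
    pickle.dump({'empno': 1202, 'Name': 'Amit', 'Age': 31}, f)

def read_all(file_path):
    """Return every pickled record in the file as a list."""
    records = []
    with open(file_path, "rb") as f:
        while True:
            try:
                records.append(pickle.load(f))
            except EOFError:
                return records

def update_age(empno, new_age):
    """Copy all records to a temporary file, changing the matching one."""
    found = False
    with open(path, "rb") as src, open(temp_path, "wb") as dst:
        while True:
            try:
                emp = pickle.load(src)      # step 1: search
            except EOFError:
                break
            if emp['empno'] == empno:
                emp['Age'] = new_age        # step 2: edit in memory
                found = True
            pickle.dump(emp, dst)           # step 3: write back
    os.replace(temp_path, path)             # swap the new file in
    return found

updated = update_age(1202, 32)
records = read_all(path)
print(updated, records)
```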
Random Access in Files
Random Access allows you to move the file pointer to any position in a file to read or write data. This is helpful when you want to jump to a specific part of the file, instead of reading everything from start to end.
Functions Used:
- tell() → To know the current position of the file pointer.
- seek() → To move the file pointer to a particular location.
tell() Function
Returns the current position of the file pointer in bytes.
Syntax:
file_object.tell()
Example:
fh = open("text.txt", "r")
print("Initially file pointer is at:", fh.tell())
print("3 bytes read are:", fh.read(3))
print("Now file pointer is at:", fh.tell())
Sample Output:
Initially file pointer is at: 0
3 bytes read are: Mar
Now file pointer is at: 3
seek() Function
Moves the file pointer to a specific position in the file.
Syntax:
file_object.seek(offset, [mode])
Parameters:
- offset → number of bytes to move
- mode → from where to start (optional, default is 0)
Mode values:
- 0 - from the beginning of the file (default)
- 1 - from the current position
- 2 - from the end of the file
Example:
fh = open("text.txt", "rb") # binary mode: text mode only allows seeks measured from the beginning
# Moves to 30th byte from the beginning
fh.seek(30)
# Moves 30 bytes forward from current position
fh.seek(30, 1)
# Moves to 30 bytes before the end of the file
fh.seek(-30, 2)
fh.close()
Important Notes:
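- You cannot move backward past the beginning of the file (BOF); seeking to a negative position raises an error.
- Seeking forward past the end of the file (EOF) does not raise an error, but a read from there returns nothing.
- In text mode, seeking relative to the current position or the end works only with an offset of 0, so open the file in binary mode ('rb') for modes 1 and 2.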
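A minimal sketch of tell() and seek() on a file with known contents is shown below. The file seek_demo.dat holds the ten bytes 0123456789 (an assumption for the example), so each position is easy to follow. The file is opened in binary mode so that all three seek modes can be used.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "seek_demo.dat")
with open(path, "wb") as f:
    f.write(b"0123456789")

with open(path, "rb") as fh:
    start = fh.tell()        # 0: pointer starts at the beginning
    first = fh.read(3)       # b'012'
    after_read = fh.tell()   # 3
    fh.seek(2, 1)            # 2 bytes forward from the current position -> 5
    middle = fh.read(1)      # b'5'
    fh.seek(-2, 2)           # 2 bytes before the end -> 8
    tail = fh.read()         # b'89'
    fh.seek(0)               # back to the beginning
    again = fh.read(1)       # b'0'

print(start, first, after_read, middle, tail, again)
```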
Working with CSV Files in Python
What is a CSV File?
CSV stands for Comma Separated Values. It is a text file where:
- Data is stored in tabular format (like rows and columns in Excel).
- Each value is separated by a comma.
- Each line in the file represents a record.
- Each record can have one or more fields (columns).
Why Use CSV Files?
- Easy to create and understand
- Used for exporting and importing data from spreadsheets or databases
- Can store large amounts of data
- Supported by many applications like Excel, MySQL, etc.
Python's csv Module
Python has a built-in module called csv that helps in reading and writing CSV files easily.
Opening and Closing CSV Files
Open a CSV file like a normal text file:
file1 = open("student.csv", "w") # write mode
file1 = open("student.csv", "r") # read mode
CSV files are created automatically when opened in w, w+, a, or a+ mode if they don't already exist:
- Modes w and w+ overwrite existing files
- Modes a and a+ append data to existing files
newline='' Argument
When opening a CSV file, always use:
file1 = open("student.csv", "w", newline='')
This prevents unwanted extra blank lines in Windows systems.
Different OS store newline characters differently:
| Symbol | Meaning | OS |
|---|---|---|
| \r | Carriage Return | macOS (old) |
| \n | Line Feed | UNIX/Linux |
| \r\n | Carriage Return + LF | Windows |
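The reason behind newline='' can be seen by looking at the raw bytes a csv writer produces. By default, csv.writer ends every row with \r\n on its own. With newline='' Python leaves that ending untouched; without it, Windows would translate the \n again and produce \r\r\n, which shows up as blank lines. The file name newline_demo.csv is an assumption for this sketch.

```python
import csv
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "newline_demo.csv")

with open(path, "w", newline="") as f:
    csv.writer(f).writerow(["Roll", "Name"])

# Read the file as bytes to see exactly what was stored.
with open(path, "rb") as f:
    raw = f.read()

print(raw)  # b'Roll,Name\r\n'
```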
Writing to CSV Files
Steps:
1. Import csv module
2. Open file with newline=''
3. Create writer object
writer = csv.writer(file_handle)
(or use a different delimiter like '|')
writer = csv.writer(file_handle, delimiter='|')
4. Prepare data (list or tuple)
sturec = (11, 'Neelam', 79.0)
5. Use:
- writer.writerow(data) → writes one row
- writer.writerows(list_of_rows) → writes multiple rows
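The steps above can be sketched with writerows() and a custom delimiter. The second student record and the file name student_pipe.csv are made up for the example; the file is read back as plain text at the end only to show what was stored.

```python
import csv
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "student_pipe.csv")
rows = [
    (11, 'Neelam', 79.0),
    (12, 'Anu', 85.0),      # hypothetical second record
]

with open(path, "w", newline='') as f:
    writer = csv.writer(f, delimiter='|')
    writer.writerow(["Roll", "Name", "Marks"])   # header: one row
    writer.writerows(rows)                       # data: many rows at once

with open(path, newline='') as f:
    text = f.read()

print(repr(text))
# 'Roll|Name|Marks\r\n11|Neelam|79.0\r\n12|Anu|85.0\r\n'
```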
Example:
import csv
file = open("student.csv", "w", newline='')
writer = csv.writer(file)
writer.writerow(["Roll", "Name", "Marks"])
writer.writerow([1, "Anu", 85])
writer.writerow([2, "Ravi", 92])
file.close()
Reading from CSV Files
Steps:
1. import csv
2. Open file in "r" mode
3. Create reader object
reader = csv.reader(file_handle)
(or use delimiter if needed)
reader = csv.reader(file_handle, delimiter="|")
4. Use for loop to fetch each row
5. Close the file
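A point worth knowing when following these steps is that csv.reader returns every field as a string. The sketch below writes a small file first (student_demo.csv in a temporary folder, an assumption for the example), then skips the header row with next() and converts the numeric fields back to integers.

```python
import csv
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "student_demo.csv")
with open(path, "w", newline='') as f:
    writer = csv.writer(f)
    writer.writerow(["Roll", "Name", "Marks"])
    writer.writerow([1, "Anu", 85])
    writer.writerow([2, "Ravi", 92])

with open(path, "r", newline='') as f:
    reader = csv.reader(f)
    header = next(reader)   # first row holds the column names
    # Every field arrives as a string, so convert the numbers explicitly.
    records = [(int(roll), name, int(marks)) for roll, name, marks in reader]

print(header)   # ['Roll', 'Name', 'Marks']
print(records)  # [(1, 'Anu', 85), (2, 'Ravi', 92)]
```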
Example:
import csv
file = open("student.csv", "r", newline='')
reader = csv.reader(file)
for row in reader:
    print(row)
file.close()
Using with open (Recommended)
A with block automatically closes the file when the block ends.
Example:
import csv
with open("student.csv", "r", newline='') as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)
Practice Questions
- Write a Python program to create a text file and write your name and age into it.
- Write a Python program to read the contents of a text file and display them on the screen.
- Write a Python program to count the number of lines in a text file.
- Write a Python program to create a binary file to store student records (roll number, name, marks) using pickle.
- Write a Python program to read and display all records from a binary file created in the previous question.
- Write a Python program to create a CSV file with student data and then read it back.
- Write a Python program to append new data to an existing text file.
- Write a Python program to copy the contents of one file to another file.