Remove empty and duplicate lines in a file
- tinyytopic.com
- on Feb 16, 2023
How to remove empty and duplicate lines in a file?
The objective of this Python function is to read the contents of a text file, remove any empty lines and any duplicate lines, and then overwrite the file with the updated content that no longer contains empty lines or duplicate lines.
Ready-to-use Python function to remove empty (blank) lines and duplicate lines from a file:
def remove_empty_duplicate_lines_text_file(txtFile):
    # Remove empty lines and duplicate lines from the file
    output = ''
    lines_seen = []
    with open(txtFile, 'r', encoding="utf-8") as file:
        for line in file:
            # Keep the line only if it is not blank and has not been seen before
            if not line.isspace() and line.replace("\n", "") not in lines_seen:
                output += line
                lines_seen.append(line.replace("\n", ""))
    # Remove the trailing newline, if any (endswith() is safe when output is empty)
    if output.endswith('\n'):
        output = output[:-1]
    # Overwrite the file with the cleaned content
    file = open(txtFile, 'w+', encoding="utf-8")
    file.write(output)
    file.close()
Call the function from your main code, for example:
remove_empty_duplicate_lines_text_file("texts to remove blank and duplicate lines.txt")
The function prints nothing; it rewrites the file in place, so open the text file after executing the code to see the result.
Text file content before executing the function:
import requests
# Define headers
# Define the API endpoint
url = "https://api.kite.trade/instruments"
# Define headers
# Define headers
Text file content after executing the function:
import requests
# Define headers
# Define the API endpoint
url = "https://api.kite.trade/instruments"
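The before/after example above can be reproduced without touching a real file. The sketch below is a minimal demonstration under stated assumptions: it repeats the function's logic (using endswith() so that an empty result cannot raise an error), writes the sample content to a temporary file, and reads the cleaned result back. The positions of the blank lines in the sample and the use of the tempfile module are assumptions made for illustration, not part of the original example.

```python
import os
import tempfile


def remove_empty_duplicate_lines_text_file(txtFile):
    # Mirrors the article's function; endswith() guards against an empty result
    output = ''
    lines_seen = []
    with open(txtFile, 'r', encoding="utf-8") as file:
        for line in file:
            if not line.isspace() and line.replace("\n", "") not in lines_seen:
                output += line
                lines_seen.append(line.replace("\n", ""))
    if output.endswith('\n'):
        output = output[:-1]
    with open(txtFile, 'w', encoding="utf-8") as file:
        file.write(output)


# Sample content; where the blank lines sit is an assumption
sample = (
    "import requests\n"
    "\n"
    "# Define headers\n"
    "# Define the API endpoint\n"
    'url = "https://api.kite.trade/instruments"\n'
    "\n"
    "# Define headers\n"
    "# Define headers\n"
)

# Write the sample to a temporary file so that no real file is overwritten
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write(sample)
    tmp_path = tmp.name

remove_empty_duplicate_lines_text_file(tmp_path)

# Read the cleaned content back, then delete the temporary file
with open(tmp_path, encoding="utf-8") as f:
    cleaned = f.read()
os.remove(tmp_path)

print(cleaned)
```

The printed text should match the "after" content shown above: four lines, with no blank lines and only the first "# Define headers" comment kept.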
How does the function work?
The Python function remove_empty_duplicate_lines_text_file takes one argument, txtFile, which is a string representing the path to a text file. The function’s goal is to read the contents of the file, remove any empty lines and any duplicate lines, and then overwrite the file with the updated content that no longer contains empty lines or duplicate lines.
Here is a breakdown of how the function works:
- The function initializes an empty string called output to hold the updated content of the text file.
- The function also initializes an empty list called lines_seen to keep track of the lines that have already been seen.
- The with statement opens the text file specified in txtFile for reading and creates a file object called file.
- The for loop iterates over each line in the file.
- The first condition of the if statement checks that the line does not consist only of whitespace, using the isspace() method of the string.
- The second condition checks that the line, with the newline character removed using the replace() method, is not in the lines_seen list. If the line is not blank and has not already been seen, it is appended to the output string and added to the lines_seen list.
- After the loop has finished, the function checks whether output ends with a newline character and, if so, removes it from the output string.
- The function reopens the file using the open() function with the 'w+' mode, which allows writing and reading and truncates the existing content. The encoding is specified as 'utf-8'.
- The updated content, which is stored in the output string, is written to the file object using the write() method.
- Finally, the function closes the file object with the close() method.
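The steps above check each line against a list, which means every lookup scans all lines seen so far. For larger files, a set is a common alternative for the membership test. The sketch below is one possible variant, not the article's function: the name unique_non_blank_lines and the sample input are hypothetical, and it works on any iterable of lines rather than on a file path.

```python
def unique_non_blank_lines(lines):
    """Return the non-blank lines in first-seen order, without duplicates."""
    seen = set()  # set membership tests take constant time on average
    kept = []     # a list preserves the original order of the kept lines
    for line in lines:
        text = line.rstrip("\n")
        # text.strip() is empty (falsy) for blank or whitespace-only lines
        if text.strip() and text not in seen:
            seen.add(text)
            kept.append(text)
    return kept


# Hypothetical input: duplicates, a blank line, and a whitespace-only line
raw = ["a\n", "\n", "b\n", "a\n", "   \n", "b"]
result = unique_non_blank_lines(raw)
print("\n".join(result))
```

Joining the kept lines with "\n" also avoids the separate step of trimming a trailing newline, because no newline is added after the last line.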
In summary, this function takes a text file, reads its content, removes any empty lines and duplicate lines, and overwrites the file with the updated content that does not contain any empty lines or duplicate lines.