Remove empty and duplicate lines in a file
- tinyytopic.com
- on Feb 16, 2023
How to remove empty and duplicate lines in a file?
This Python function reads a text file, removes empty lines and duplicate lines (keeping the first occurrence of each line), and then overwrites the file with the cleaned content.
Ready-to-use Python function to remove empty or blank lines and duplicate lines in a file:
def remove_empty_duplicate_lines_text_file(txtFile):
    # Remove empty lines and duplicate lines from the file
    output = ''
    lines_seen = []
    with open(txtFile, 'r', encoding="utf-8") as file:
        for line in file:
            # Keep the line only if it is not blank and has not been seen before
            if not line.isspace() and line.replace("\n", "") not in lines_seen:
                output += line
                lines_seen.append(line.replace("\n", ""))
    # Remove the trailing newline if there is one (endswith is safe when output is empty)
    if output.endswith('\n'):
        output = output[:-1]
    file = open(txtFile, 'w+', encoding="utf-8")
    file.write(output)
    file.close()
Call the function from your main code, for example:
remove_empty_duplicate_lines_text_file("texts to remove blank and duplicate lines.txt")
The function prints nothing; open the text file after running the code to see the result.
Text file content before executing the function:
import requests
# Define headers
# Define the API endpoint
url = "https://api.kite.trade/instruments"
# Define headers
# Define headers
Text file content after executing the function:
import requests
# Define headers
# Define the API endpoint
url = "https://api.kite.trade/instruments"
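To check the before and after listings above without touching a real file, the function can be run against a throwaway copy. The following is a minimal sketch, assuming the sample content shown above; the positions of the blank lines, the temporary file name sample.txt, and the variable names before and after are assumptions made for illustration.

```python
import os
import tempfile


def remove_empty_duplicate_lines_text_file(txtFile):
    # Same logic as the article's function, with a guard for an empty result
    output = ''
    lines_seen = []
    with open(txtFile, 'r', encoding="utf-8") as file:
        for line in file:
            if not line.isspace() and line.replace("\n", "") not in lines_seen:
                output += line
                lines_seen.append(line.replace("\n", ""))
    if output.endswith('\n'):
        output = output[:-1]
    file = open(txtFile, 'w+', encoding="utf-8")
    file.write(output)
    file.close()


# Sample content from the article; where the blank lines sit is assumed.
before = (
    "import requests\n"
    "\n"
    "# Define headers\n"
    "# Define the API endpoint\n"
    'url = "https://api.kite.trade/instruments"\n'
    "\n"
    "# Define headers\n"
    "# Define headers\n"
)

# Work on a temporary file so no real data is overwritten
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write(before)

    remove_empty_duplicate_lines_text_file(path)

    with open(path, "r", encoding="utf-8") as f:
        after = f.read()

print(after)
```

Because the function overwrites its input in place, trying it on a copy first is a reasonable precaution.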
How does the function work?
The Python function remove_empty_duplicate_lines_text_file takes one argument, txtFile, which is a string representing the path to a text file. The function’s goal is to read the contents of the file, remove any empty lines and any duplicate lines, and then overwrite the file with the updated content.
Here is a breakdown of how the function works:
- The function initializes an empty string called output to hold the updated content of the text file.
- The function also initializes an empty list called lines_seen to keep track of the lines that have already been seen.
- The with statement opens the text file specified in txtFile for reading and creates a file object called file.
- The for loop iterates over each line in the file.
- The if statement tests two conditions joined by and. The first condition uses the isspace() method of the string to check that the line does not consist only of whitespace.
- The second condition checks that the line, with the newline character removed using the replace() method, is not in the lines_seen list. If the line is not blank and has not already been seen, it is appended to the output string and added to the lines_seen list.
- After the loop has finished, the function checks whether output ends with a newline character and, if so, removes it from the output string.
- The function reopens the file using the open() function with the 'w+' mode (write and read), which truncates the existing content. The encoding is specified as 'utf-8'.
- The updated content of the file, which is stored in the output string, is written to the file object using the write() method.
- Finally, the function closes the file object with the close() method.
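The filtering steps above can also be sketched without any file handling. This is not the article's function but a variant: it swaps the lines_seen list for a set, which makes the "already seen" check faster on large inputs, and it works on any iterable of lines. The name dedupe_non_empty_lines and its variables are hypothetical, chosen only for this illustration.

```python
def dedupe_non_empty_lines(lines):
    """Return the text with blank and repeated lines dropped, order preserved."""
    seen = set()   # a set gives fast membership checks compared with a list
    kept = []
    for line in lines:
        text = line.rstrip("\n")   # compare lines without their newline
        if not text.strip():
            continue               # skip empty or whitespace-only lines
        if text in seen:
            continue               # skip lines that already appeared
        seen.add(text)
        kept.append(text)
    # Joining with "\n" leaves no trailing newline, like the original function
    return "\n".join(kept)


sample = ["a\n", "\n", "b\n", "a\n", "   \n", "b"]
result = dedupe_non_empty_lines(sample)
print(result)
```

Keeping the filtering separate from the file reading and writing also makes the logic easy to test on plain lists of strings.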
In summary, this function takes a text file, reads its content, removes any empty lines and duplicate lines, and overwrites the file with the updated content that does not contain any empty lines or duplicate lines.