Build a Command-Line Text Analyzer in Python - Step by Step

July 20, 2025 · 6 min read

Founder @ Dewiride

Build a command-line text analyzer in Python that reads a text file, counts lines and words, displays the most frequent words, and detects long lines.

Introduction

Whether you're a developer, writer, or student, analyzing text files can help uncover patterns and insights fast. Instead of doing it manually, why not build your own command-line text analyzer in Python?

In this guide, we'll build a tool that:

Reads a text file
Counts lines and words
Displays frequent words
Detects long lines

Let's dive in.

Prerequisites

Python installed on your machine (3.6+ recommended).
Create a new folder for the project files. I'll name it command-line-text-analyzer.
Open the newly created folder in Visual Studio Code (VS Code).
Create a new file named text-analyzer.py.

Project Structure in VS Code

Step 1: Accept Command-Line Arguments

To analyze any file from the terminal, we need to accept its path and an optional max line length. We'll use argparse for this.

text-analyzer.py
import argparse

def main():
    parser = argparse.ArgumentParser(description="Command-line Text Analyzer")
    parser.add_argument("file", help="Path to the text file")
    parser.add_argument("--length", type=int, default=80, help="Max line length to check for long lines")
    args = parser.parse_args()

    print(f"Analyzing file: {args.file}")
    print(f"Long line threshold: {args.length} characters")

if __name__ == "__main__":
    main()

Explanation:

argparse.ArgumentParser() lets us define command-line arguments.
file is required; --length is optional (defaults to 80).
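To see how the two arguments behave without touching the terminal, note that parse_args() also accepts an explicit list of strings. The sketch below is illustrative only: notes.txt is a placeholder name, and the file does not need to exist because argparse only parses the strings here.

```python
import argparse

# Same parser definition as in Step 1.
parser = argparse.ArgumentParser(description="Command-line Text Analyzer")
parser.add_argument("file", help="Path to the text file")
parser.add_argument("--length", type=int, default=80,
                    help="Max line length to check for long lines")

# Passing a list to parse_args() simulates what the shell would supply.
default_args = parser.parse_args(["notes.txt"])                    # placeholder name
custom_args = parser.parse_args(["notes.txt", "--length", "100"])

print(default_args.file, default_args.length)   # notes.txt 80
print(custom_args.length)                       # 100
```

Because of type=int, args.length arrives as an integer, so it can be compared with len() results directly.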

Run this with:

Terminal
python text-analyzer.py dummy_text_file.txt --length 100

Create some sample text files for testing. I am using dummy_text_file.txt, which contains some random text, and long_story.txt, which contains multiple paragraphs.
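If you don't have a test file at hand, a few lines of Python can generate one. This is a rough sketch, not part of the analyzer: the helper name make_sample_file and the sample sentences are made up for illustration. The demo writes into a temporary folder so nothing is left behind; in the project folder you would pass "dummy_text_file.txt" instead.

```python
import tempfile
from pathlib import Path

def make_sample_file(path, long_line_length=120):
    """Write a small test file: two short lines plus one deliberately long line."""
    lines = [
        "Python is fun.",
        "Python is also practical!",
        "x" * long_line_length,  # long enough to exceed the default threshold of 80
    ]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)

# Demo in a temporary directory (hypothetical location, removed afterwards).
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "dummy_text_file.txt"
    line_total = make_sample_file(sample)
    written = sample.read_text(encoding="utf-8").splitlines()

print(line_total)        # 3
print(len(written[-1]))  # 120
```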

Sample Text Files


Step 2: Read the File Safely

Let’s read the file contents into memory and handle missing file errors gracefully.

text-analyzer.py
def read_file(filepath):
    try:
        with open(filepath, 'r', encoding='utf-8') as f:
            return f.readlines()
    except FileNotFoundError:
        print(f"Error: File not found - {filepath}")
        return []

Explanation:

f.readlines() reads every line into a list, keeping each line's trailing newline.
It uses UTF-8 encoding for compatibility.
If the file doesn’t exist, it prints an error and returns an empty list.
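The two code paths can be tried out with a throwaway file. This is a small sketch under assumed names: sample.txt and no_such_file.txt are hypothetical, and the temporary directory only exists for the demo.

```python
import os
import tempfile

def read_file(filepath):
    """Return the file's lines, or an empty list if the file is missing."""
    try:
        with open(filepath, 'r', encoding='utf-8') as f:
            return f.readlines()
    except FileNotFoundError:
        print(f"Error: File not found - {filepath}")
        return []

with tempfile.TemporaryDirectory() as tmp:
    existing = os.path.join(tmp, "sample.txt")        # hypothetical file name
    with open(existing, "w", encoding="utf-8") as f:
        f.write("first line\nsecond line\n")

    found = read_file(existing)                                  # normal path
    missing = read_file(os.path.join(tmp, "no_such_file.txt"))   # error path

print(found)    # ['first line\n', 'second line\n']
print(missing)  # []
```

Note that each returned line still ends with "\n", which matters later when measuring line length.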

Update main() to use it:

text-analyzer.py
lines = read_file(args.file)
if not lines:
    return

Step 3: Count Lines

This one’s easy! Just use len() on the list of lines.

text-analyzer.py
def count_lines(lines):
    return len(lines)

In main():

text-analyzer.py
line_count = count_lines(lines)
print(f"Total lines: {line_count}")

Step 4: Count Words

We want to count all words, stripping punctuation so “hello!” becomes just “hello”.

text-analyzer.py
import string

def count_words(lines):
    words = []
    for line in lines:
        clean_line = line.translate(str.maketrans('', '', string.punctuation))
        words.extend(clean_line.strip().split())
    return len(words), words

Explanation:

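str.maketrans('', '', string.punctuation) builds a translation table that deletes every ASCII punctuation character; line.translate() applies it to the line.
split() breaks the line into words on whitespace.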
We return:
- total word count
- list of all words (for frequency analysis)
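Here is a quick illustration of what count_words returns for a couple of made-up lines (the sample sentences are assumptions, chosen to show how punctuation inside words is handled):

```python
import string

def count_words(lines):
    words = []
    for line in lines:
        # Delete ASCII punctuation, then split on whitespace.
        clean_line = line.translate(str.maketrans('', '', string.punctuation))
        words.extend(clean_line.strip().split())
    return len(words), words

sample_lines = ["Hello, world!\n", "It's a well-known test.\n"]
total, words = count_words(sample_lines)

print(total)  # 6
print(words)  # ['Hello', 'world', 'Its', 'a', 'wellknown', 'test']
```

Apostrophes and hyphens count as punctuation too, so "It's" becomes "Its" and "well-known" becomes "wellknown". For a simple frequency count that is usually acceptable, but it is worth knowing.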

Step 5: Analyze Word Frequency

We’ll use collections.Counter to tally up word counts.

text-analyzer.py
from collections import Counter

def word_frequencies(words):
    return Counter(word.lower() for word in words)

Explanation:

Converts all words to lowercase (so “Python” and “python” are the same).
Returns a dictionary-like object with counts.

To show the top 10 words:

text-analyzer.py
freqs = word_frequencies(words)
for word, count in freqs.most_common(10):
    print(f"{word}: {count}")
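A small, self-contained run shows how Counter and most_common() behave. The word list is invented for the example:

```python
from collections import Counter

def word_frequencies(words):
    # Lowercase first so "Python", "python" and "PYTHON" share one count.
    return Counter(word.lower() for word in words)

words = ["Python", "python", "is", "fun", "PYTHON", "is"]
freqs = word_frequencies(words)

print(freqs["python"])       # 3
print(freqs.most_common(2))  # [('python', 3), ('is', 2)]
print(freqs["missing"])      # 0
```

A Counter returns 0 for a word it has never seen instead of raising KeyError, which is convenient when looking up arbitrary words.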

Step 6: Detect Long Lines

Now let’s find lines that are longer than the user-specified threshold.

text-analyzer.py
def detect_long_lines(lines, max_length=80):
    # readlines() keeps the trailing newline, so strip it before measuring.
    return [i + 1 for i, line in enumerate(lines) if len(line.rstrip("\n")) > max_length]

Explanation:

We return line numbers (1-based) for each line that’s too long.
enumerate() gives us both index and content.
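A quick check of the long-line logic on an in-memory list. This sketch measures each line without its trailing newline (readlines() keeps it), so a line of exactly 80 visible characters is not flagged at a threshold of 80. The sample lines are made up for illustration.

```python
def detect_long_lines(lines, max_length=80):
    # Measure each line without its trailing newline.
    return [i + 1 for i, line in enumerate(lines)
            if len(line.rstrip("\n")) > max_length]

sample_lines = [
    "short line\n",
    "x" * 80 + "\n",  # exactly 80 visible characters
    "y" * 81 + "\n",  # 81 visible characters
]

# Raw len() counts the newline, so the 80-character line measures 81.
raw_lengths = [len(line) for line in sample_lines]

print(raw_lengths)                          # [11, 81, 82]
print(detect_long_lines(sample_lines, 80))  # [3]
```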

Step 7: Display Everything Nicely

Let’s wrap up by printing all results cleanly.

text-analyzer.py
def display_results(file_path, lines, word_count, freqs, long_lines):
    print(f"\nAnalysis of: {file_path}")
    print(f"Total Lines: {len(lines)}")
    print(f"Total Words: {word_count}")
    
    print("\nTop 10 Frequent Words:")
    for word, count in freqs.most_common(10):
        print(f"  {word}: {count}")
    
    if long_lines:
        print(f"\nLines longer than threshold: {len(long_lines)}")
        print("Line numbers:", long_lines)
    else:
        print("\nNo lines exceed the specified length.")

Then update main():

text-analyzer.py
word_count, words = count_words(lines)
freqs = word_frequencies(words)
long_lines = detect_long_lines(lines, args.length)

display_results(args.file, lines, word_count, freqs, long_lines)

Final Code (All Together)

You can now paste everything into the file named text-analyzer.py.

text-analyzer.py
import argparse
from collections import Counter
import string

def read_file(filepath):
    try:
        with open(filepath, 'r', encoding='utf-8') as f:
            return f.readlines()
    except FileNotFoundError:
        print(f"File not found: {filepath}")
        return []

def count_lines(lines):
    return len(lines)

def count_words(lines):
    words = []
    for line in lines:
        line = line.translate(str.maketrans('', '', string.punctuation))
        words.extend(line.strip().split())
    return len(words), words

def word_frequencies(words):
    return Counter(word.lower() for word in words)

def detect_long_lines(lines, max_length=80):
    # readlines() keeps the trailing newline, so strip it before measuring.
    return [i + 1 for i, line in enumerate(lines) if len(line.rstrip("\n")) > max_length]

def display_results(file_path, lines, word_count, freqs, long_lines):
    print(f"\nAnalysis of: {file_path}")
    print(f"Total Lines: {len(lines)}")
    print(f"Total Words: {word_count}")
    print("\nTop 10 Frequent Words:")
    for word, count in freqs.most_common(10):
        print(f"  {word}: {count}")
    if long_lines:
        print(f"\nLines longer than threshold: {len(long_lines)}")
        print("Line numbers:", long_lines)
    else:
        print("\nNo lines exceed the specified length.")

def main():
    parser = argparse.ArgumentParser(description="Command-line Text Analyzer")
    parser.add_argument("file", help="Path to the text file")
    parser.add_argument("--length", type=int, default=80, help="Max line length to check for long lines")
    args = parser.parse_args()

    lines = read_file(args.file)
    if not lines:
        return

    line_count = count_lines(lines)
    word_count, words = count_words(lines)
    freqs = word_frequencies(words)
    long_lines = detect_long_lines(lines, args.length)

    display_results(args.file, lines, word_count, freqs, long_lines)

if __name__ == "__main__":
    main()

Testing the Analyzer

Terminal
python text-analyzer.py dummy_text_file.txt
python text-analyzer.py dummy_text_file.txt --length 100
python text-analyzer.py long_story.txt
python text-analyzer.py long_story.txt --length 50

Sample Output (screenshots 1-4)

Conclusion

Congratulations! You’ve built a simple yet powerful command-line text analyzer in Python. This tool can help you quickly analyze text files, count words, and detect long lines.
