Skip to content

a. What You Need To Know

Key Knowledge - KK 3.1.6

Characteristics of data sources (plain text (TXT), delimited (CSV) and XML files), including:

  • Structure
  • Reasons for use

Files let a program save information to be retrieved later, which is why they are used in many software solutions.

Text files vs binary files

Key Point

Text files store data as readable plain text

Binary files store data in binary form (e.g. images/sound) and are not easily readable

In Software Development, you focus on text files.


Plain text (TXT) files

Structure

Key Point

A plain text (TXT) file contains characters of readable data and is read as character/string data types.

In practice, it's usually:

  • One value per line, or
  • A simple pattern (e.g. key=value)

Reasons for use

TXT files are commonly used for:

  • configuration settings
  • storing small amounts of data in simple programs

Note

Even though they can be opened by humans, they're typically designed for fast processing by programs, so they often lack headings/comments that help humans.


Delimited files (CSV)

Structure

Key Point

A delimited file stores values separated by a programmer-selected character called a delimiter. Common delimiters include commas, tabs and colons.

If the delimiter is a comma → it's a CSV (comma-separated values) file

Delimited files allow storage of two-dimensional arrays in a structured, readable format (rows/columns).

Reasons for use

CSV is useful when:

  • Data is tabular (rows/columns)
  • You want a simple format that is easy to move between tools (e.g. spreadsheets ↔ programs)
  • Data structure is consistent (same fields each row)

Examiner-style advantages

Examiner-style advantages commonly credited include:

  • Smaller file size → faster transmission
  • Easier to set up and program

XML files

Structure

Key Point

XML uses tags to describe and separate data, and is structured like a tree:

  • a root element
  • parent elements
  • child elements

XML may also include a prolog (before the document contents) containing details like XML version and character encoding.

Reasons for use

XML is useful when:

  • Data is hierarchical (nested)
  • You need self-describing data (tags explain what fields mean)
  • You expect the structure might evolve (more flexible than strict row/column)

Examiner-style advantages

Examiner-style advantages commonly credited include:

  • Extensible (can accommodate structural changes)
  • Often easier for humans and computers to read
  • Can handle commas/line breaks without breaking field meaning (compared to naive CSV parsing)

CSV vs XML (the comparison you must be able to write)

Important

A high-scoring comparison typically:

  • describes how each is structured, and
  • explains why one is better for the scenario (not just "XML is better")

Examiners reward explicit structure statements like:

  • CSV uses delimiters to separate fields
  • XML uses tags that describe what each field contains