OCR GCSE Computer Science (J277) - Topic 1.2.4
Quick Recall - File Sizes:
Think about different types of files. Which generally take up more storage space?
By the end of this lesson, you should be able to:
Digital files, especially images, audio, and video, can be very large. Compression is the process of reducing the size of a data file: a method or protocol for using fewer bits to represent the original information.
There are many reasons to compress data, including saving storage space and reducing the time it takes to transfer files over networks (like the internet). Download and buffering times decrease, because smaller files need fewer packets and so transmit faster. Traffic over the internet is reduced, with less chance of collisions or transmission errors. Data allowances do not run out as quickly, which saves money. Voice can be transmitted fast enough to keep up with speech in a video. Less money needs to be spent on data storage (e.g. hard drives, cloud storage), and data is faster to back up.
There are two main types of compression: lossy compression (which removes some data permanently) and lossless compression (which allows the original data to be perfectly reconstructed).
Lossy compression reduces file size by removing certain redundant information from the file. The eliminated data is unrecoverable, so the file is recreated without the omitted data. File sizes are much smaller, but there will be some loss of quality.
Lossless compression means every bit of the original data can be recovered from the compressed file: an uncompressed image will be the same as the original, with no loss of data. It works by looking for patterns in the data, and its compressed files are larger than lossy ones. Run-length encoding is one example.
This lesson explores how these methods work and when they are used.
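As a concrete illustration of a lossless method that looks for patterns, here is a minimal Python sketch of run-length encoding. The function names `rle_encode` and `rle_decode` and the sample row of pixels are made up for this illustration; real file formats store the runs in their own ways.

```python
def rle_encode(data):
    """Summarise a string as a list of (character, run_length) pairs."""
    runs = []
    for value in data:
        if runs and runs[-1][0] == value:
            # Same character as the previous one: extend the current run.
            runs[-1] = (value, runs[-1][1] + 1)
        else:
            # Different character: start a new run of length 1.
            runs.append((value, 1))
    return runs


def rle_decode(runs):
    """Rebuild the original string exactly from the (character, count) pairs."""
    return "".join(value * count for value, count in runs)


# Hypothetical row of pixels: W = white, B = black.
row = "WWWWWBBBWWWW"
encoded = rle_encode(row)
print(encoded)
print(rle_decode(encoded) == row)  # lossless: decoding gives back the original
```

Twelve characters become three pairs here because the row contains long runs. Data with few repeats gains little from this method and can even end up larger.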
Describe ways we could reduce the file size of a digital image using lossy compression techniques. [3 marks]
(Max 3 marks)
For each use case below, select whether lossy compression or lossless compression would be more appropriate and briefly explain why.
Use Case | Lossy / Lossless | Reason |
---|---|---|
A music track for streaming | | |
A text document (e.g., essay) | | |
A video for online streaming | | |
An image for a website | | |
A Python program file | | |
A database file | | |
A simple lossless technique uses a dictionary (or index) to replace repeated words/phrases with shorter codes. Consider this text:
Hickory, dickory, dock,
The mouse ran up the clock.
The clock struck one,
The mouse ran down.
Hickory, dickory, dock.
Using the dictionary provided, observe the simulation of Line 1 being decoded, then enter the codes for Line 2.
Code | Text | Code | Text |
---|---|---|---|
0 | End of Line | 8 | ran_ |
1 | H | 9 | up_ |
2 | ickory,_ | 10 | the_ |
3 | d | 11 | cl |
4 | ock | 12 | . |
5 | , | 13 | _struck_ |
6 | The_ | 14 | one, |
7 | mouse_ | 15 | down. |
Note: '_' represents a space character.
Now, consider the entire compressed text sequence (all 5 lines).
1 2 3 2 3 4 5 0 6 7 8 9 10 11 4 12 0 6 11 4 13 14 0 6 7 8 15 0 1 2 3 2 3 4 12
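The dictionary lookup can be sketched in a few lines of Python. This is only an illustration: the dictionary is copied from the table above, the `_` placeholder is written as a real space, and code 0 (End of Line) is assumed to stand for a newline character. The name `decode` is chosen for this example.

```python
# Dictionary copied from the table; "_" in the table is a space here.
# Assumption: code 0 (End of Line) is stored as a newline character.
DICTIONARY = {
    0: "\n", 1: "H", 2: "ickory, ", 3: "d", 4: "ock", 5: ",",
    6: "The ", 7: "mouse ", 8: "ran ", 9: "up ", 10: "the ",
    11: "cl", 12: ".", 13: " struck ", 14: "one,", 15: "down.",
}


def decode(codes):
    """Replace each code with its dictionary text and join the pieces."""
    return "".join(DICTIONARY[code] for code in codes)


line_1 = [1, 2, 3, 2, 3, 4, 5]   # codes for Line 1 (without the End of Line code)
line_5 = [1, 2, 3, 2, 3, 4, 12]  # codes for Line 5

print(decode(line_1))  # Hickory, dickory, dock,
print(decode(line_5))  # Hickory, dickory, dock.
```

Because every code maps back to exactly one piece of text, decoding rebuilds the original characters with nothing lost, which is what makes the technique lossless.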
Percentage reduction in file size = ((Original Size - Compressed Size) / Original Size) * 100
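The formula above can be tried out on the rhyme with a short Python sketch. The sizes depend on assumptions that the lesson does not fix: here each original character is taken to use 8 bits, each code is taken to use 4 bits (enough for the 16 dictionary entries), and the space needed to store the dictionary itself is ignored. The function name `percentage_reduction` is chosen for this example.

```python
def percentage_reduction(original_size, compressed_size):
    """Apply ((Original Size - Compressed Size) / Original Size) * 100."""
    return (original_size - compressed_size) / original_size * 100


rhyme = (
    "Hickory, dickory, dock,\n"
    "The mouse ran up the clock.\n"
    "The clock struck one,\n"
    "The mouse ran down.\n"
    "Hickory, dickory, dock."
)

BITS_PER_CHARACTER = 8   # assumption: one byte per character
BITS_PER_CODE = 4        # assumption: 16 dictionary entries fit in 4 bits
NUMBER_OF_CODES = 35     # number of codes in the compressed sequence above

original_bits = len(rhyme) * BITS_PER_CHARACTER
compressed_bits = NUMBER_OF_CODES * BITS_PER_CODE

print(original_bits, compressed_bits)
print(round(percentage_reduction(original_bits, compressed_bits), 1))
```

Changing either assumption (for example, storing each code in a whole byte, or counting the dictionary as part of the compressed file) changes the result, so state your assumptions when you answer.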
Click the link below to open a separate page where you can calculate download times for different audio file qualities and connection speeds, and consider the advantages of audio compression.
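The kind of calculation involved can be sketched in Python as follows. The file sizes and connection speed are hypothetical values picked for this example, and the sketch assumes the connection runs at its full stated speed with no overheads. Note that file sizes are usually given in bytes while connection speeds are given in bits per second, so the size is multiplied by 8 first.

```python
def download_time_seconds(file_size_megabytes, speed_megabits_per_second):
    """Estimate download time, ignoring network overheads."""
    file_size_megabits = file_size_megabytes * 8  # 8 bits in every byte
    return file_size_megabits / speed_megabits_per_second


# Hypothetical audio track on a 10 Mbps connection.
uncompressed = download_time_seconds(40, 10)  # 40 MB uncompressed
compressed = download_time_seconds(4, 10)     # 4 MB after compression

print(uncompressed)  # 32.0 seconds
print(compressed)    # 3.2 seconds
```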
Open Download Time Calculator Task
Compression is essential for the modern internet and digital media.
Myth 1: "Lossless compression doesn't make files much smaller."
Reality: While lossless compression typically achieves lower compression ratios than lossy, it can still significantly reduce file size, especially for data with repeating patterns (like text or simple graphics). The reduction depends heavily on the data itself.
Myth 2: "Lossy compression completely ruins the quality of images/audio."
Reality: Good lossy compression algorithms are designed to remove data that humans are least likely to notice. This is known as perceptual music shaping: removing inaudible sounds in order to make a file smaller, e.g. noises at frequencies that humans cannot hear, or quiet sounds that cannot be heard over louder sounds. While high levels of compression *can* cause noticeable artifacts, moderate levels often provide a good balance between file size and perceived quality.
Myth 3: "You should always use lossless compression for everything."
Reality: Lossless is essential when *every single bit* matters (e.g., program code, text documents, medical scans). However, for media like photos, music, and video intended for viewing/listening, the much smaller file sizes offered by lossy compression are often more practical for storage and transmission, even with a slight, often imperceptible, quality reduction.
Myth 4: "Compressing a file multiple times with lossy compression is okay."
Reality: Re-compressing an already lossy-compressed file (e.g., saving a JPEG as a JPEG again with quality settings) causes further data loss and quality degradation each time. It's best to work with original or losslessly compressed files if further editing and saving are needed.
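The trade-off behind these myths can be illustrated with a deliberately simple Python sketch. It is not how JPEG or MP3 work: it just reduces the number of bits stored per value, which is one basic lossy idea. The sample values and the names `reduce_bit_depth` and `total_error` are made up for this illustration.

```python
def reduce_bit_depth(values, bits):
    """Keep only the top `bits` bits of each 8-bit value; the rest is discarded."""
    shift = 8 - bits
    return [(value >> shift) << shift for value in values]


def total_error(original, approximated):
    """Add up how far each value has moved from the original."""
    return sum(abs(a - b) for a, b in zip(original, approximated))


original = [200, 201, 203, 198, 57, 60, 61, 255]  # hypothetical 8-bit samples

once = reduce_bit_depth(original, 4)   # first lossy pass: 4 bits per value
twice = reduce_bit_depth(once, 2)      # second, harsher pass on the lossy copy

print(once)
print(twice)
print(total_error(original, once), total_error(original, twice))
```

After the first pass the values are close to the originals but cannot be restored exactly, and the second pass moves them further away. Starting again from the original (or a lossless copy) avoids stacking these losses.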
1. Describe 3 reasons for compressing data. [3 marks]
(1 mark for each valid reason, max 3 marks)
2. How does Run Length Encoding (RLE) work for sound data compression? [2 marks]
(RLE: a lossless compression technique that summarises consecutive patterns of the same data. It works well with image and sound data where data could be repeated many times.)
(Max 2 marks)
3. Explain why lossy compression cannot be used for program code. [2 marks]
(Max 2 marks)
Another common lossless compression technique is Huffman Coding. It assigns shorter binary codes to more frequent characters/symbols and longer codes to less frequent ones.
Research Task: How does Huffman Coding create its variable-length codes? Why is it considered a lossless technique?
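As a starting point for the research task, here is a minimal Python sketch of one common way to build Huffman codes: repeatedly merge the two least frequent groups of symbols. The function names and the sample text are chosen for this example, and real compressors store and transmit the code table in more compact ways.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Return a {symbol: bit_string} table built from symbol frequencies."""
    frequencies = Counter(text)
    # Each heap entry: (frequency, tie_breaker, {symbol: code_so_far}).
    # The tie_breaker keeps Python from ever comparing the dictionaries.
    heap = [(freq, i, {symbol: ""})
            for i, (symbol, freq) in enumerate(frequencies.items())]
    heapq.heapify(heap)
    if len(heap) == 1:
        # Only one distinct symbol: give it a single-bit code.
        return {symbol: "0" for symbol in heap[0][2]}
    next_id = len(heap)
    while len(heap) > 1:
        # Take the two least frequent groups and merge them,
        # prefixing one group's codes with 0 and the other's with 1.
        freq_a, _, group_a = heapq.heappop(heap)
        freq_b, _, group_b = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in group_a.items()}
        merged.update({s: "1" + code for s, code in group_b.items()})
        heapq.heappush(heap, (freq_a + freq_b, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(text, codes):
    return "".join(codes[symbol] for symbol in text)


def decode(bits, codes):
    """Read bits until they match a code, output its symbol, then repeat."""
    lookup = {code: symbol for symbol, code in codes.items()}
    output, current = [], ""
    for bit in bits:
        current += bit
        if current in lookup:
            output.append(lookup[current])
            current = ""
    return "".join(output)


text = "hickory dickory dock"
codes = huffman_codes(text)
bits = encode(text, codes)
print(len(bits), "bits instead of", len(text) * 8)
print(decode(bits, codes) == text)  # the exact original text comes back
```

No code is the start of another code, so the decoder always knows where one symbol ends and the next begins. Since decoding returns every original character, nothing is lost.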
Video compression is complex, often using lossy techniques. It involves compressing individual frames (intra-frame) and also only storing the differences between consecutive frames (inter-frame).
Research Task: What is a video codec (e.g., H.264, H.265/HEVC, AV1)? Briefly explain the difference between intra-frame and inter-frame compression in video.
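The idea of storing only the differences between consecutive frames can be sketched in Python. This is a toy illustration, not how H.264 or any real codec is implemented: each frame is a short list of hypothetical pixel values, the first frame is stored in full, and every later frame is stored as a list of (position, new value) changes. Unlike real video compression, this sketch loses no data.

```python
def encode_frames(frames):
    """Store the first frame in full, then only the pixels that change."""
    key_frame = list(frames[0])
    deltas = []
    previous = frames[0]
    for frame in frames[1:]:
        changes = [(position, new)
                   for position, (old, new) in enumerate(zip(previous, frame))
                   if old != new]
        deltas.append(changes)
        previous = frame
    return key_frame, deltas


def decode_frames(key_frame, deltas):
    """Rebuild every frame by applying each list of changes in turn."""
    frames = [list(key_frame)]
    for changes in deltas:
        frame = list(frames[-1])
        for position, value in changes:
            frame[position] = value
        frames.append(frame)
    return frames


# Hypothetical 6-pixel frames: a bright pixel (9) moves one place, then stays still.
frames = [
    [0, 0, 9, 0, 0, 0],
    [0, 0, 0, 9, 0, 0],
    [0, 0, 0, 9, 0, 0],
]
key_frame, deltas = encode_frames(frames)
print(deltas)  # the third frame needs no changes at all
print(decode_frames(key_frame, deltas) == frames)
```

When little changes between frames, the lists of changes are far shorter than full frames, which is why inter-frame compression saves so much space for typical video.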
You now understand the basics of lossy and lossless data compression!
This concludes the main topics for Memory & Storage. You might want to review: