Whether directly or through our portable devices and computers, we transfer and use huge amounts of data every day. The size of that data is measured in multiples of bytes, such as gigabytes or terabytes. But how much information can you really store in one terabyte, and what is a byte in the first place? We took a look at what bytes really are and how the different multiples of bytes are measured.
What Is a Byte?
A byte is a unit of digital information and the smallest addressable amount of data in most computer architectures. Each byte consists of 8 bits, and it is easier for a computer system to address a whole batch of bits than to give every single bit its own address. One byte can take 256 different values, which is enough to represent any basic character used in programming, and that makes it the building block of any functioning piece of software. The symbol for a byte is an uppercase “B”, not to be confused with the lowercase “b” that represents a bit.
Both units are still in use, for different purposes. In terms of usable information (i.e. stored data), a byte is the smallest amount of information that can be used individually. Splitting it any further would just give 8 smaller units that have to be put back together anyway to form meaningful information. This is why data storage and memory units usually use bytes to measure their capacity or a device’s ability to process, send, and receive data.
Meanwhile, data transfer speeds between devices or systems are usually measured in bits per second. In this context, it doesn’t matter what size batch of data a computer can process. What matters is the exact number of zeroes and ones transferred over a cable, network, or wireless connection. Rounding that number to the nearest batch of eight bits would introduce an unnecessary error, so the transfer speeds of computer interfaces are usually stated in this more precise unit.
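Because storage is quoted in bytes and link speeds in bits per second, estimating a transfer time means converting between the two. The following is a minimal sketch of that conversion; the file size and link speed are hypothetical values chosen for illustration, and real transfers are slower because of protocol overhead that this sketch ignores.

```python
BITS_PER_BYTE = 8


def bits_to_bytes_per_second(link_bits_per_second: float) -> float:
    """Convert a link speed in bits per second to bytes per second."""
    return link_bits_per_second / BITS_PER_BYTE


def transfer_seconds(size_bytes: int, link_bits_per_second: float) -> float:
    """Idealized time to move size_bytes over a link rated in bits per second."""
    return size_bytes * BITS_PER_BYTE / link_bits_per_second


# Hypothetical example values, not taken from the article.
file_size = 1_000_000_000   # a 1 GB file, counted in decimal bytes
link_speed = 100_000_000    # a 100 Mbit/s connection

print(bits_to_bytes_per_second(link_speed))     # 12500000.0 (12.5 MB/s)
print(transfer_seconds(file_size, link_speed))  # 80.0 seconds
```

In other words, a "100 megabit" connection moves at most 12.5 megabytes per second, which is why a download rarely matches the headline number on an internet plan.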
Multiples of Bytes
Let’s talk about the different ways in which thousands and millions of bytes are measured. In almost all systems of measurement, each step of 1000 smaller units has its own prefix. For example, 1000 grams get the prefix kilo-, which gives us 1 kilogram. Below are the prefixes used for bytes and the amounts they represent.
IEC decimal data prefix system
|Kilobyte, kB||1000 bytes||1'000 (10^3) bytes|
|Megabyte, MB||1000 kilobytes||1'000'000 (10^6) bytes|
|Gigabyte, GB||1000 megabytes||1'000'000'000 (10^9) bytes|
|Terabyte, TB||1000 gigabytes||1'000'000'000'000 (10^12) bytes|
|Petabyte, PB||1000 terabytes||1'000'000'000'000'000 (10^15) bytes|
|Exabyte, EB||1000 petabytes||1'000'000'000'000'000'000 (10^18) bytes|
As we can see, each larger multiple of bytes contains 1000 units of the previous, smaller multiple. So now you can just memorize the names of the multiples and we can move on, right? Well, not quite, because there are actually two ways to count massive amounts of data, and only one of them is based on the decimal system. In short, the base of decimal counting is 10, meaning that every ten units are bundled into one unit of the next place. For example, when you count from 0 to 9, the next number “packs” those ten into a single unit of a higher order, i.e. the 1 in 10 means you’ve already counted ten numbers. In the same way, ten 10s become 100, and ten 100s become 1000.
The other method of counting thousands and millions of bytes is based on the binary system, the ones and zeroes that computers work with. The base of this system is 2, meaning that you start counting at 0, go up to 1, and at 2 you already have to carry over to the next place, i.e. the number two in binary is written 10. Since computers work with this counting system, it made sense, back when huge amounts of data became a thing, to use multiples based on binary rather than decimal numbers. So the power of two closest to 10^3 (1000) was chosen to keep the difference between the two systems small, and it was conveniently 2^10 (1024). This allowed computer specialists to describe storage and memory device specs with a handy 10000000000 (1024 in binary) instead of the clunky 1111101000 (1000 in binary).
Unfortunately, hardware manufacturers didn’t catch on to this counting system and kept using the decimal system, which didn’t need a super-complex explanation like the one you just read. The dispute between manufacturers and engineers caused the International Electrotechnical Commission (IEC) to intervene and offer a unified system for both sides to work with. Essentially, in 1998 they introduced a separate set of binary multiples with special, cutesy binary prefixes, to sit alongside the decimal prefixes that manufacturers and engineers had each been using in their own way.
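The comparison between 1000 and 1024 can be checked directly. This short sketch uses Python's built-in `bin` to print the binary forms quoted above and to show how far apart the two "thousands" are; the variable names are only illustrative.

```python
decimal_step = 10 ** 3   # the decimal "thousand"
binary_step = 2 ** 10    # the nearest power of two

# The number two already needs a second binary digit.
print(bin(2))             # 0b10

# 1000 is an awkward bit pattern, while 1024 is a 1 followed by ten zeroes.
print(bin(decimal_step))  # 0b1111101000
print(bin(binary_step))   # 0b10000000000

# Relative gap between the two steps, in percent.
gap_percent = (binary_step - decimal_step) / decimal_step * 100
print(round(gap_percent, 1))  # 2.4
```

The gap is only 2.4% at the kilo level, but it compounds with every further prefix, which is why it becomes noticeable for large drives.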
IEC binary data prefix system
|Kibibyte, KiB||1024 bytes||1'024 (2^10) bytes|
|Mebibyte, MiB||1024 kibibytes||1'048'576 (2^20) bytes|
|Gibibyte, GiB||1024 mebibytes||1'073'741'824 (2^30) bytes|
|Tebibyte, TiB||1024 gibibytes||1'099'511'627'776 (2^40) bytes|
|Pebibyte, PiB||1024 tebibytes||1'125'899'906'842'624 (2^50) bytes|
|Exbibyte, EiB||1024 pebibytes||1'152'921'504'606'846'976 (2^60) bytes|
If you’ve never heard of a mebibyte before, don’t worry, you’re not alone. This prefix system didn’t catch on, at least not in consumer electronics, and both manufacturers and engineers just kept using the decimal prefixes in their own way. This is why, for example, your hard drive’s capacity often shows up on your computer as less than advertised. The computer counts the drive’s capacity in binary multiples but still labels the result with decimal prefixes, making it seem like there’s less capacity overall. The gap between the two systems makes one genuinely decimal terabyte equal to around 0.91 “computer terabytes” (which should really be called tebibytes). The technically wrong (according to the IEC) prefix system that many computers use these days is called JEDEC. You’ll most likely see the JEDEC system on most computers you encounter, since most of them run Windows. Mac and Linux seem to show decimal prefixes correctly, i.e. with decimal multiples of bytes, so a 1 TB drive will show up as around 1000 GB.
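The 0.91 figure follows from dividing a decimal terabyte by a binary one. Here is a minimal sketch of that arithmetic, assuming a drive advertised as exactly 10^12 bytes; the function name is hypothetical, and a real drive also loses some space to formatting, which is not modeled here.

```python
TEBIBYTE = 2 ** 40   # what a binary-counting computer calls "1 TB"
GIBIBYTE = 2 ** 30   # what a binary-counting computer calls "1 GB"


def shown_capacity(advertised_bytes: int, binary_unit: int) -> float:
    """Capacity as reported by a system that counts in binary multiples."""
    return advertised_bytes / binary_unit


advertised = 10 ** 12  # a drive sold as "1 TB" in decimal units

print(round(shown_capacity(advertised, TEBIBYTE), 2))  # 0.91
print(round(shown_capacity(advertised, GIBIBYTE), 1))  # 931.3
```

So a drive sold as 1 TB would be reported as roughly 931 "GB" by a system that counts in binary multiples, even though no bytes are missing.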
JEDEC data prefix system
How Big are Gigabytes, Terabytes, and Petabytes, really?
You can finally take a break from trying to wrap your head around all the math, prefixes, and weird counting, because now we’ll look at what the different multiples of bytes actually amount to. One final technicality: since they’re the units that most computers use these days, we’ll be looking at the decimal-but-actually-binary JEDEC units (1 TB = 1024 GB etc.).
The most used unit of data as of right now, the gigabyte lets us easily gauge the size of most forms of data we use:
- One DVD movie takes up around 1-4 GB, an HD movie around 4-8 GB, a Blu-ray movie around 20-25 GB, and a 4K movie around 30-35 GB
- A modern triple-A game title, like GTA V or The Witcher 3, can take up to ~60 GB of storage space
- Most computers come with 4-16 GB of RAM, which represents how much information they can work with at one time
- Today, an average USB flash drive holds 16-128 GB of data, with some models, like the Patriot Supersonic Mega, reaching up to 512 GB
- 1 GB can hold around 200 high-quality, 3-minute MP3 files.
- 1 GB is literally a truckload of data: a small truck filled with printed files would contain around 1 GB of information.
The terabyte is a unit of information that has gained a lot of popularity in the last few years, courtesy of the hard drive industry’s constant progress and innovation:
- Most hard drives and solid-state drives nowadays come in 1-4 TB capacities, with the biggest drive as of June 2017 being the Samsung PM1633a SSD at 16 TB of storage
- If you wanted to store just 4 TB of information on 3.5″ 1.44 MB floppy disks and then stacked them, the tower of 2.9 million disks would be almost as high as Mount Everest
- Filling up a one-terabyte drive might be easier than you’d think: it would hold only around 20 large games or around 30 average-length 4K movies.
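The floppy disk count in the list above can be reproduced with a few lines of arithmetic. This sketch assumes the binary-valued units used in this section (1 TB = 2^40 bytes) and reads "1.44 MB" as 1.44 × 2^20 bytes, which is an interpretation chosen because it matches the 2.9 million figure; the height of the resulting tower is not computed, since it depends on a disk thickness the text does not state.

```python
import math

TERABYTE = 2 ** 40             # binary-valued "TB", as used in this section
FLOPPY_BYTES = 1.44 * 2 ** 20  # assumption: "1.44 MB" read as 1.44 x 2^20 bytes


def floppies_needed(data_bytes: int) -> int:
    """Number of whole floppy disks required to hold data_bytes."""
    return math.ceil(data_bytes / FLOPPY_BYTES)


disks = floppies_needed(4 * TERABYTE)
print(disks)                         # 2912712
print(round(disks / 1_000_000, 1))   # 2.9 (million disks)
```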
Petabytes are rarely used as of right now, since no amounts of data that big show up in our everyday lives. However, many tech companies that handle huge amounts of data work with petabytes daily. Here are some examples to help you understand just how big one petabyte is:
- Firstly, to put it into perspective, imagine a thousand 1 TB drives in one place. Such a pile of drives would hold just one petabyte of data
- The human brain has been estimated to hold the equivalent of 2.5-3.8 PB of data
- The fastest supercomputer in the world as of 2017, the Sunway TaihuLight, can hold roughly 1.31 PB of data in memory at once and can store up to 20 PB of data
- As of 2013, Netflix used 3.14 PB of storage space for all their movies, while Facebook, if a 2014 report is anything to go by, held around 300 PB of data. These are very rough estimates, though, since every second people add huge amounts of data to their servers.
After seeing how much data a petabyte can hold, you’d think there’s no need to go any higher. Well, some things in this world can already be estimated and measured in exabytes, each of which equals around a million terabytes. Here are a couple of estimates that make use of exabytes:
- All words ever spoken by the human race are estimated to take up just around 5 exabytes
- It would take only 24 grams of DNA to store all of those words, because of how densely information can be stored in DNA, although researchers are still working on ways of reading and writing that data reasonably fast
- Putting one exabyte of data onto 3.5″ floppy disks would let you build six 240’000-mile-high towers, each of which would reach the Moon
- An entire exabyte of data was estimated to be created daily on the internet way back in 2012. Today, by the time you’d finish calculating an estimate, it would probably already be wrong.
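To see how the larger units relate to each other, the ladder of multiples can be written as powers of 1024. This is a small sketch following the binary-valued convention used in this section (1 TB = 1024 GB and so on); the `UNITS` list and `unit_bytes` helper are hypothetical names introduced only for illustration.

```python
# Unit names in ascending order; each step is 1024 times the previous one.
UNITS = ["KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def unit_bytes(unit: str) -> int:
    """Size of one unit in bytes under the binary-valued convention."""
    return 1024 ** (UNITS.index(unit) + 1)


# Terabytes in a petabyte and in an exabyte.
print(unit_bytes("PB") // unit_bytes("TB"))  # 1024
print(unit_bytes("EB") // unit_bytes("TB"))  # 1048576
```

The second result is where "around a million terabytes" per exabyte comes from.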
Above exabytes are zettabytes (1024 exabytes) and yottabytes (1024 zettabytes), neither of which is really used to measure anything right now (aside from ungodly high floppy disk towers, of course). Since estimating anything in such huge values right now would be like measuring your height in light-years, we’ll just wait until our technology gets even remotely close to producing such insane amounts of data.