# PDF pages per gigabyte?



## dstebbins

When I say "Gigabyte," I mean 1,000,000,000 bytes, not 1,073,741,824 bytes.  The gigabyte I'm thinking of will show up on a computer as 0.93 gigabytes.

Provided that each individual PDF page is as full as it could possibly be (e.g. there are no blank parts on any page to signify the end of chapters or whatnot), how many pages of PDF document can be stored inside one gigabyte of storage, such as a 1GB flash drive?
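
For reference, the two gigabyte definitions in this post can be checked with a few lines of Python. This is only a sketch: the names `displayed_size` and `pages_that_fit` are made up for illustration, and filesystem overhead on a real flash drive is ignored.

```python
DECIMAL_GB = 1_000_000_000   # the gigabyte meant in the post above
BINARY_GB = 1024 ** 3        # 1,073,741,824 bytes, what many operating systems count in

def displayed_size(capacity_bytes):
    """Capacity as an OS that counts in binary gigabytes would report it."""
    return capacity_bytes / BINARY_GB

def pages_that_fit(bytes_per_page, capacity_bytes=DECIMAL_GB):
    """Whole pages that fit, assuming every page costs the same number of bytes."""
    return capacity_bytes // bytes_per_page

print(f"{displayed_size(DECIMAL_GB):.2f}")  # 0.93
```

The open question in the rest of the thread is what value to plug in for `bytes_per_page`.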


----------



## Cromewell

I have no idea. I have an 877-page PDF that's 6.8MB, if you'll take anecdotal evidence.
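
Scaling that one data point up to a full gigabyte might look like the sketch below. It assumes "6.8MB" means 6,800,000 bytes (decimal megabytes) and that every other document has the same average page size, which is unlikely to hold in general.

```python
CAPACITY = 1_000_000_000   # one decimal gigabyte, as defined in the first post
PDF_BYTES = 6_800_000      # assumption: 6.8MB read as decimal megabytes
PDF_PAGES = 877

bytes_per_page = PDF_BYTES / PDF_PAGES
# Integer arithmetic avoids floating-point rounding in the page count.
pages_per_gb = CAPACITY * PDF_PAGES // PDF_BYTES

print(round(bytes_per_page))  # 7754
print(pages_per_gb)           # 128970
```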


----------



## Geoff

I don't believe it's possible to give one number, as PDFs have different levels of compression.  You can convert a 10MB color PDF into one that takes up, say, 200KB without any really noticeable difference.


----------



## tlarkin

It depends on whether it is all text or includes pictures, and on the size and quality of those pictures.


----------



## dstebbins

Geoff said:


> I don't believe it's possible to give one number, as PDFs have different levels of compression.  You can convert a 10MB color PDF into one that takes up, say, 200KB without any really noticeable difference.



So, are you saying that we could store all of the US federal government's archived records on a single 4GB flash drive?

See, per the Freedom of Information Act, the government has to keep records of almost all of its proceedings.  For example (and, btw, I'm making this example up), Officer James Smith of St. Louis, MO arrested Donald Jones for possession of 1.5oz of marijuana, and he did so at the address of 1515 Pioneer Rd, St. Louis, MO 63102 at 7:14PM on January 5, 1990.  The court must keep records that Donald Jones was convicted of possession of 1.5oz of marijuana on March 8, 1990 and sentenced to 18 months in prison.  The prison must keep records that Donald Jones was released on parole on June 9, 1991.

In the year 3000, anyone can look that up, and it should still be there, per the FOIA.

Every level of government must keep record of every single thing that they do like that.

Are you saying that we could compress all that information into a single flash drive?


----------



## Cromewell

He is saying that two comparable PDFs (i.e. same number of pages, roughly the same content) may have very different sizes. If you were to gather up all the data you can access through the Freedom of Information Act and convert it all to PDF, I'm confident that it would not fit on a 4GB flash drive.


----------



## dstebbins

Cromewell said:


> He is saying that two comparable PDFs (i.e. same number of pages, roughly the same content) may have very different sizes. If you were to gather up all the data you can access through the Freedom of Information Act and convert it all to PDF, I'm confident that it would not fit on a 4GB flash drive.



So, what's the general ratio?  How much can a single megabyte of PDF document text be compressed into?


----------



## PohTayToez

I don't think that there is a limit.  You can insert pictures of any resolution into a PDF.  You would be able to put in a 100GB image, and all the data would still be there because you would be able to keep zooming in on the PDF to see the fine detail, but if you printed it out it obviously wouldn't show the same level of detail.


----------



## dstebbins

PohTayToez said:


> I don't think that there is a limit.  You can insert pictures of any resolution into a PDF.  You would be able to put in a 100GB image, and all the data would still be there because you would be able to keep zooming in on the PDF to see the fine detail, but if you printed it out it obviously wouldn't show the same level of detail.



That seems to conflict with your (and, by "your," I mean "you guys," as in, plural) claim earlier.

If there is no limit, then, theoretically, you can compress all of the government's records (state, federal, and even all of the thousands of local governments, combined) into a single kilobyte, without any reduction in quality.


----------



## PohTayToez

Sorry, I should have clarified.  I'm talking about the original question, and I'm referring to there being no limit to the size of a single page in a PDF.  

As for the limit to compressing a single page of a PDF, that definitely varies. Text compresses differently than images, and some images can be compressed more than others, so there isn't just one answer. It could vary greatly depending on what is in the PDF file.


----------



## dstebbins

Well, I need this for a business plan.  I want to store most of my records electronically to save trees.  That, and a flash drive, even without compression, can hold a LOT more information than the same amount of money spent on computer paper and printer ink.

So, I don't have any specific examples right now.  I need to pitch this to investors.  So, just take the one that is hardest to compress and calculate with that, for the purposes of being generous.


----------



## Geoff

If you are looking for text only, then it's easier to compare, but once you start having photos and graphics in the document, it makes it virtually impossible to come up with a standard file size per page.

I will show you a few comparisons later this morning.


----------



## dstebbins

Geoff said:


> If you are looking for text only, then it's easier to compare, but once you start having photos and graphics in the document, it makes it virtually impossible to come up with a standard file size per page.
> 
> I will show you a few comparisons later this morning.


If you're talking about compression, don't.

Compression makes the estimates too generous.

Here, let me give you something to work with.

My computer is currently running at 1440 x 900 pixel resolution; the highest my computer can go.

Take a screen shot on that resolution, do NOT compress it, and let's use that as the standard.
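
The size of that uncompressed screenshot can be worked out directly. The sketch below assumes 24-bit color (3 bytes per pixel) and counts raw pixel data only, with no file header or PDF wrapper; both are assumptions, since the post doesn't state a color depth or file format. `raw_image_bytes` is a name made up for illustration.

```python
def raw_image_bytes(width, height, bytes_per_pixel=3):
    """Raw pixel data for an uncompressed bitmap (no header, no padding)."""
    return width * height * bytes_per_pixel

shot = raw_image_bytes(1440, 900)        # assumes 24-bit color
print(shot)                              # 3888000
print(1_000_000_000 // shot)             # 257 screenshots per decimal gigabyte

shot_32bit = raw_image_bytes(1440, 900, bytes_per_pixel=4)
print(1_000_000_000 // shot_32bit)       # 192 if the screen uses 32-bit color
```

So this standard works out to roughly 3.9MB per "page", which is far larger than a typical page of PDF text.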


----------



## Geoff

dstebbins said:


> If you're talking about compression, don't.
> 
> Compression makes the estimates too generous.
> 
> Here, let me give you something to work with.
> 
> My computer is currently running at 1440 x 900 pixel resolution; the highest my computer can go.
> 
> Take a screen shot on that resolution, do NOT compress it, and let's use that as the standard.


I really don't know what you mean by that. Taking a screen shot at 1440x900 and importing it into a PDF is not going to give you the same file size as a full page of text.

Here's an example: I downloaded a 176-page product manual that takes up 5.7MB, which is an average of 33.16KB per page.  However, what I am trying to say is that you can compress this mostly-text manual to take up less space, so there isn't a set amount of space that one page of a PDF takes up. Every document is different.

I took a 14-page text-only PDF that was originally 157KB (11.21KB per page), and compressed it even further to 133KB (9.5KB per page).
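
The per-page averages above can be reproduced as follows. The 33.16KB figure only comes out if 1MB is taken as 1024KB, so that convention is assumed here; `kb_per_page` is a helper name invented for this sketch.

```python
KB_PER_MB = 1024  # assumption: binary units, which reproduces the 33.16KB figure

def kb_per_page(size_kb, pages):
    """Average size of one page in KB."""
    return size_kb / pages

manual = kb_per_page(5.7 * KB_PER_MB, 176)
before = kb_per_page(157, 14)
after = kb_per_page(133, 14)

print(round(manual, 2))   # 33.16
print(round(before, 2))   # 11.21
print(round(after, 2))    # 9.5
# Space saved by recompressing the 14-page file, as a percentage.
print(round((157 - 133) / 157 * 100, 1))  # 15.3
```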


----------



## tlarkin

Text compresses extremely well.  For example, I just compressed a 14GB dump of a MySQL database into a zip file, and it went from 14GB to 800MB, which is better than 17:1.  However, that file is 100% text.
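
A quick way to see why text-like data shrinks so much while other data barely shrinks at all is to run both through Python's standard `zlib` module, which implements the same DEFLATE algorithm that zip files commonly use. This is a sketch: the repeated SQL line is an artificial stand-in for a database dump and compresses far better than a real dump would.

```python
import os
import zlib

def compression_ratio(data: bytes) -> float:
    """Original size divided by compressed size (higher means better compression)."""
    return len(data) / len(zlib.compress(data, 9))

# Made-up, highly repetitive text standing in for a SQL dump.
text = ("INSERT INTO users (id, name) VALUES (1, 'example');\n" * 2000).encode("ascii")
# Random bytes of the same length: effectively incompressible.
noise = os.urandom(len(text))

print(f"text:  {compression_ratio(text):.1f}:1")
print(f"noise: {compression_ratio(noise):.2f}:1")
```

The random data comes out at about 1:1, which is roughly what happens when you zip content that is already compressed, such as JPEG images embedded in a PDF.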


----------



## dstebbins

Geoff said:


> I really don't know what you mean by that. Taking a screen shot at 1440x900 and importing it into a PDF is not going to give you the same file size as a full page of text.
> 
> Here's an example: I downloaded a 176-page product manual that takes up 5.7MB, which is an average of 33.16KB per page.  However, what I am trying to say is that you can compress this mostly-text manual to take up less space, so there isn't a set amount of space that one page of a PDF takes up. Every document is different.
> 
> I took a 14-page text-only PDF that was originally 157KB (11.21KB per page), and compressed it even further to 133KB (9.5KB per page).



Ok, so, would 64kB per page be a generous enough estimate to put on a business plan, considering that I _have_ to use hypotheticals?


----------



## Geoff

dstebbins said:


> Ok, so, would 64kB per page be a generous enough estimate to put on a business plan, considering that I _have_ to use hypotheticals?


I would use the 33KB estimate, as I didn't compress that.  If you want to be sure, download some text-only PDFs from various websites, get the average size per page, and use that.
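
As a rough sketch of that approach, the snippet below turns a per-page estimate into pages per gigabyte and averages per-page sizes over a set of sample files. It assumes 1KB = 1024 bytes and the 1,000,000,000-byte gigabyte from the first post. The sample list reuses the three documents mentioned in this thread as placeholder data; in practice you would substitute your own measurements.

```python
GIGABYTE = 1_000_000_000   # decimal gigabyte, as defined in the first post
KB = 1024                  # assumption: binary kilobytes

def pages_per_gb(kb_per_page):
    """Whole pages that fit in one gigabyte at a fixed per-page size."""
    return GIGABYTE // int(kb_per_page * KB)

def average_kb_per_page(samples):
    """Mean of each document's KB-per-page, given (size_kb, pages) pairs."""
    per_page = [size_kb / pages for size_kb, pages in samples]
    return sum(per_page) / len(per_page)

print(pages_per_gb(33))   # 29592
print(pages_per_gb(64))   # 15258

# Placeholder samples taken from this thread: (size in KB, page count).
samples = [(6.8 * 1024, 877), (5.7 * 1024, 176), (157, 14)]
print(round(average_kb_per_page(samples), 1))
```

The 64KB figure gives the more cautious number for a business plan, since it assumes each page costs about twice as much space as the uncompressed manual did.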


----------

