Wednesday, December 03, 2008

Practical Tips for Document Word Count

As a freelance translator, you will receive files for translation in many different formats. Since you are generally paid by the number of words in the source text, you will usually be expected to produce an invoice that states the word count of the original and the amount due, calculated by multiplying the number of words by the rate per word.
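As a rough illustration of that invoice arithmetic, here is a minimal Python sketch. The function name `invoice_total`, the sample job size and the sample rate are all made up for this example, and rounding to two decimal places is an assumption about how you bill.

```python
from decimal import Decimal, ROUND_HALF_UP


def invoice_total(word_count: int, rate_per_word: str) -> Decimal:
    """Return the amount due: source words times the per-word rate.

    The rate is passed as a string so Decimal can represent it exactly.
    Rounding to two decimal places is an assumption of this sketch.
    """
    if word_count < 0:
        raise ValueError("word_count must not be negative")
    total = Decimal(word_count) * Decimal(rate_per_word)
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Hypothetical job: 2,350 source words at 0.08 per word.
print(invoice_total(2350, "0.08"))  # prints 188.00
```

Using `Decimal` rather than floating-point numbers avoids small rounding surprises in money amounts.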

The problem arises when the file sent to you is a PDF document. If the PDF was not created from a scan, it is possible to convert it into a text file and count the words. Many PDF documents, however, are scans, and for these it is not possible to get an exact word count the way you can with a .DOC file or a text file.
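Once a non-scanned PDF has been converted to plain text, the counting itself is simple. The sketch below assumes the text has already been extracted by whatever conversion tool you use; the function names are illustrative, and the definition of a "word" here (anything separated by whitespace) is an assumption that may differ slightly from what a word processor reports.

```python
def count_words(text: str) -> int:
    """Count whitespace-separated tokens in a string."""
    return len(text.split())


def count_words_in_file(path: str, encoding: str = "utf-8") -> int:
    """Count words in a plain-text file, reading it line by line."""
    with open(path, encoding=encoding) as handle:
        return sum(count_words(line) for line in handle)


# Example with a short string instead of a real file.
sample = "The quick brown fox\njumps over the lazy dog."
print(count_words(sample))  # prints 9
```

Because different tools treat hyphens, numbers and punctuation differently, it is worth telling the client which counting method you used.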

Here are a few practical tips that will help you if you are faced with such a situation.

1. The easiest method is to ask your client for an editable version of the file. In many cases, however, this is not possible because the document is not available in that form.
2. Another simple alternative is to inform the client, before taking on the project, that you will charge by the hour because an exact word count cannot be obtained.
3. A practical way to get an approximate word count is to count the words in a few randomly chosen lines, work out the average number of words per line, and multiply it by the number of lines in the document.
4. You can also inform the client that you will use the word count of the finished translation, since it cannot be determined from the source. However, many clients may not accept this.
5. If you do not want to rely on any of the methods listed above, you will need to invest in special software. Optical character recognition (OCR) programs such as OmniPage Pro, Readiris Pro and ABBYY FineReader convert scans into editable text, and word-count tools such as PractiCount and AnyCount can then count the words.
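The estimate described in tip 3 can be sketched in a few lines of Python. This is only an illustration: the function names, the sample figures and the choice to round to the nearest whole word are assumptions, and the accuracy of the result depends on how representative the sampled lines are.

```python
import random
from statistics import mean


def pick_sample_lines(total_lines: int, sample_size: int, seed=None) -> list:
    """Choose which line numbers (1-based) to count by hand, at random."""
    size = min(sample_size, total_lines)
    rng = random.Random(seed)
    return sorted(rng.sample(range(1, total_lines + 1), size))


def estimate_word_count(sample_line_counts, total_lines: int) -> int:
    """Estimate total words: average words per sampled line times line count."""
    if not sample_line_counts:
        raise ValueError("count the words in at least one line")
    if total_lines < 0:
        raise ValueError("total_lines must not be negative")
    return round(mean(sample_line_counts) * total_lines)


# Hypothetical document: 420 lines, five lines counted by hand.
lines_to_count = pick_sample_lines(420, 5)
hand_counts = [11, 9, 12, 10, 8]  # words found on the sampled lines
print(estimate_word_count(hand_counts, 420))  # prints 4200
```

Sampling more lines, and skipping headings or half-empty lines at the end of paragraphs, should bring the estimate closer to the real figure.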

These simple, practical methods should help you arrive at a workable word count for the PDF files you receive in the future.