Character Coding Schemes

You need to be familiar with character coding schemes in order to select an appropriate string routine from the many available. A Pascal string can be treated as an array of characters, and this enables you to write code giving the same results as the built-in routines.
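As a minimal sketch of the array idea, the program below indexes a string one character at a time to find the first occurrence of a character, then compares the result with the built-in Pos function. The function name FirstPosition and the sample word are our own illustrative choices, not part of any library.

```pascal
program StringAsArray;
  {$APPTYPE CONSOLE}
var
  Sample : string;

//Hypothetical helper: returns the index of the first occurrence of
//Target in Source, or 0 if it is absent (the same convention as Pos).
function FirstPosition(Target : char; const Source : string) : integer;
var
  Index : integer;
begin
  FirstPosition := 0;
  for Index := 1 to length(Source) do
    if Source[Index] = Target then
      begin
        FirstPosition := Index;
        break;
      end;
end;

begin
  Sample := 'character';
  writeln('Hand-written: ', FirstPosition('r', Sample));
  writeln('Built-in Pos: ', pos('r', Sample));
  readln;
end.
```

Both lines should report position 4, because string indexing in Pascal starts at 1.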

The original ASCII character coding scheme used 7 bits to represent characters 0 to 127. These codes have survived in most later schemes. Extended ASCII (sometimes called ANSI) uses 8 bits, enabling 256 characters to be represented. The following program outputs the printable characters (shown below). Be prepared for variability in 'ANSI' character sets.

program ShowPrintableChars;
  {$APPTYPE CONSOLE}
uses
  SysUtils;
var
  Count : integer;
begin
  writeln;
  for Count := 1 to 6 do
    begin
      write(Count : 5, ': ', chr(Count));
    end;
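  //Codes 7 to 13 are control characters (bell, backspace, tab, line feed and so on), so we skip them.
  for Count := 14 to 254 do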
    begin
      if Count MOD 8 = 0 then
        writeln;
      write(Count : 5, ': ', chr(Count));
    end;
  readln;
end.

A screenshot of some of the output from a Raspberry Pi (with a Debian Linux operating system) follows a screenshot of the output from a PC (with a Windows OS).

Output from program ShowPrintableChars

Output on Windows

Output on Linux

Note how Linux displays the hex ordinal values of some of the characters.

For languages that require more than 256 characters there are multi-byte character sets (MBCS ANSI). These require more complicated processing because the number of bytes needed to represent a character varies. On computers using Windows operating systems, your character set depends on the country setting (locale). When you use a string it will usually be an ansiString, even if you have declared it simply as a string. An ansiString can hold about 2^31 characters, which is likely to be sufficient for your needs! Many of the Pascal identifiers for routines have the prefix 'ansi'. Do not be put off because you thought you were using ASCII; use them with confidence!
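To illustrate, here is a short sketch that calls two of the 'ansi' routines from the SysUtils unit on ordinary ASCII text. The program name and sample strings are our own, and the mode directive is an assumption that only matters if you compile with Free Pascal.

```pascal
program AnsiRoutines;
  {$APPTYPE CONSOLE}
  //Assumption: Free Pascal users want Delphi-compatible strings.
  {$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
uses
  SysUtils;
var
  First, Second : string;
begin
  First := 'Pascal';
  Second := 'PASCAL';
  //AnsiCompareText ignores case and returns 0 when the strings match.
  if AnsiCompareText(First, Second) = 0 then
    writeln(First, ' and ', Second, ' match when case is ignored.');
  //AnsiPos returns the position of a substring, or 0 if it is absent.
  writeln('Position of "cal": ', AnsiPos('cal', First));
  readln;
end.
```

With these strings the comparison succeeds and the reported position is 4. Results for characters outside the ASCII range may depend on your locale.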

If you include the length of a string in your declaration, e.g. var Surname : string[15];, the string will be a shortString. However, you can combine these with ansiStrings in expressions and the Pascal compiler will handle the differences behind the scenes. Do remember that you must include the length when the string is part of a record to be used in a file of records.
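The following sketch shows why the length matters: each record must occupy a fixed number of bytes so that it can be written to and read from a file of records. The record type TPerson, its fields and the file name People.dat are hypothetical names chosen for illustration. The mode directive is an assumption that makes assignFile and closeFile available under Free Pascal.

```pascal
program SaveSurnames;
  {$APPTYPE CONSOLE}
  //Assumption: Free Pascal needs a Delphi-compatible mode for assignFile.
  {$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
type
  //The declared length gives every record the same size on disk.
  TPerson = record
    Surname : string[15];
    Age : integer;
  end;
var
  Person : TPerson;
  PeopleFile : file of TPerson;
begin
  //Write one record to a new file.
  assignFile(PeopleFile, 'People.dat');
  rewrite(PeopleFile);
  Person.Surname := 'Smith';
  Person.Age := 42;
  write(PeopleFile, Person);
  closeFile(PeopleFile);

  //Reopen the file and read the record back.
  reset(PeopleFile);
  read(PeopleFile, Person);
  closeFile(PeopleFile);
  writeln(Person.Surname, ' ', Person.Age);
  readln;
end.
```

If you change the field to a plain string, the compiler should reject the file declaration, because an ansiString does not have a fixed size.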

Unicode is gaining popularity because it can represent far more characters than an 8-bit scheme: a single 16-bit value can take 2^16 (65,536) different values. A 16-bit character is called WideChar in Pascal and the characters are combined to form a WideString. Avoid routines with the prefix 'wide' in your console applications; we understand that it is not easy to use Unicode in console programs.
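A quick way to see the difference in storage is to ask the compiler for the size of each character type. This small sketch uses only the built-in SizeOf function and the types AnsiChar and WideChar; the program name is our own.

```pascal
program CharSizes;
  {$APPTYPE CONSOLE}
begin
  //SizeOf reports the number of bytes occupied by a type.
  writeln('AnsiChar occupies ', SizeOf(AnsiChar), ' byte(s).');
  writeln('WideChar occupies ', SizeOf(WideChar), ' byte(s).');
  readln;
end.
```

We would expect 1 byte for AnsiChar and 2 bytes for WideChar.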

Routines in other languages that Pascal needs to use are likely to require strings in the PChar format. The P stands for 'pointer' and PChar is a pointer to the location of the string in memory. There is a function named PChar that can be used to typecast a string from type ansiString to type PChar. However, it is more straightforward to declare as type PChar the string you will be using as an argument. We demonstrate this in program RunApplication. For your convenience, the program creates a test HTML file in your program folder before using the ShellExecute function to open it with your browser. If you want to run an application without supplying a filename to open, you need to assign the application's pathname to the variable Filename. See our code (commented out) which opens Internet Explorer on our computers.

program RunApplication;
  {$APPTYPE CONSOLE}
uses
  SysUtils, ShellAPI, Windows;
var
  Filename : PChar;
  Handle : HWND;
  
procedure SaveHTML;
var
  HTMLFile : textFile;
begin
  assignFile(HTMLFile, 'Test.html');
  rewrite(HTMLFile);
  writeln(HTMLFile, '<html>');
  writeln(HTMLFile, '<body>');
  writeln(HTMLFile, '<h1>Test passed!</h1>');
  writeln(HTMLFile, '</body>');
  writeln(HTMLFile, '</html>');
  closeFile(HTMLFile);
end;

begin
  SaveHTML;
  Filename := 'Test.html';
  //You could also edit the following line according to your system.
  //FileName:= 'c:\Program Files\Internet Explorer\Iexplore.exe';
  ShellExecute(Handle, 'open', Filename, nil, nil, SW_SHOW);
end.  
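The alternative mentioned earlier, typecasting an existing string with PChar, can be sketched as follows. This is an illustrative example only: the variable names are our own, and we use the StrLen function from SysUtils (rather than a Windows routine) so that the sketch does not depend on the operating system. The mode directive is an assumption for Free Pascal users.

```pascal
program TypecastToPChar;
  {$APPTYPE CONSOLE}
  //Assumption: Free Pascal users want Delphi-compatible strings.
  {$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
uses
  SysUtils;
var
  Title : string;
  TitlePointer : PChar;
begin
  Title := 'Test.html';
  //The typecast gives a pointer to the characters held in Title,
  //so Title should not be changed while TitlePointer is in use.
  TitlePointer := PChar(Title);
  //StrLen counts the characters before the terminating null.
  writeln('Length seen through the pointer: ', StrLen(TitlePointer));
  readln;
end.
```

The reported length should be 9, the same value that length(Title) gives.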

Experimenting with Characters

  1. Use some of the symbols to output mathematical equations, e.g.
    6² = 36, 56 ÷ 2 = 28, √64 = 8, sin(30°) = ½

  2. Write a program to convert each lower case letter in a string entered by the user to upper case.
  3. Write a program to write a few lines of text to a .txt file, then use the ShellExecute function to open it.