Penn State Computing with Foreign Symbols

Tips for Developing Non-English Web Sites

This section discusses tips and strategies for reading and developing non-English Web sites.

Developing Properly Encoded Web Sites

These methods are recommended for any language encoded in Latin-1, such as Spanish, French, German, and Italian, as well as for other major world languages, including Chinese, Japanese, Korean, and Russian.

1. Declare the Encoding
2. Declare the Language
3. Encoding, Fonts, Recommended Browsers by Language
4. HTML Special Entity Codes (Latin-1 only; see the encoding chart to check.)
5. Tips for FrontPage (Windows)
6. Tips for Dreamweaver
7. Export Text from International Word Processors
8. Text Alignment

Workarounds

Sometimes, especially when you are working with a language with relatively few speakers, you may need to use alternate methods to deliver content.

1. PDF Files
2. Using Image Files
3. About Using the <FONT FACE> Tag
4. ASCII Substitution

©Penn State University, 2001, 2002. This Web page is maintained by Elizabeth J. Pyatt (ejp10@psu.edu) for the Center for Education Technology Services. Last Modified: January 2, 2002. This publication is available in alternate media upon request.
http://cac.psu.edu/ets/presentations/international/web/tips/

Declare the Encoding

If you create a Web site, it is good practice to declare the encoding. Properly encoded Web pages declare the encoding to a browser through a meta tag in the header. Some examples are given below. If you are not sure which encoding to declare, refer to the encoding by language chart or check which encoding is declared in other Web sites written in the language.
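To show where the declaration sits in a whole page, here is a minimal sketch of a complete HTML 4.01 document. The title and body text are placeholders invented for illustration, and ISO-8859-1 is only one possible choice; the file itself must also be saved in the encoding that the meta tag names.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
  "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
  <!-- Declare the encoding early in the head, before any non-ASCII text such as the title -->
  <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
  <!-- Placeholder title (hypothetical) -->
  <title>Página de ejemplo</title>
</head>
<body>
  <!-- Placeholder content containing Latin-1 accented characters -->
  <p>¿Dónde está la biblioteca?</p>
</body>
</html>
```

Placing the meta tag ahead of the title matters because the browser needs to know the encoding before it reaches the first accented character.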
Sample Encoding Declarations

Template
<head>
<meta http-equiv="Content-Type" content="text/html; charset=???">
...
</head>

Declare Latin-1 (English & Western Europe)
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
...
</head>

NOTE: It IS good practice to declare the encoding even for an English Web site. One function of this tag is to "reset" the browser back to Latin-1 and ensure proper font settings. A browser that is not reset to Latin-1 may display unusual font effects after it leaves a non-Latin-1 site.

Declare Windows-1252 (default in FrontPage)
<head>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
...
</head>

NOTE: FrontPage actually encodes English Web sites not in "ISO-8859-1" (Latin-1), but in the very similar "Windows-1252". In most cases the results will be the same, but there are occasional differences between the characters specified by Windows-1252 and by Latin-1: Windows-1252 places printable characters such as curly quotation marks in a range that Latin-1 reserves for control codes.

Declare Unicode (UTF-8 version)
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
...
</head>

If no encoding is declared, then the browser uses the default setting, which in the U.S. is typically Latin-1.

Last Modified: March 9, 2002.

Declare the Language

The lang attribute can be used to declare the language of a Web page or a portion of a Web page.
This is meant to assist search engine spiders, page formatting, and screen reader technology.

NOTE: You must also declare the encoding in addition to the language. The language and its script are independent.

Page Language

The official W3C recommendation is to declare the primary language of each Web page with a lang attribute in the <html> tag. Codes are ISO-639 codes. For instance:

Template
<html lang="??"> ... </html>

English (U.S.)
<html lang="en-US"> ... </html>

English (U.K./Great Britain)
<html lang="en-GB"> ... </html>

Spanish
<html lang="es"> ... </html>

Masai
<html lang="mas"> ... </html>

Switching Languages

If you switch languages within one page, you can add the lang attribute to other tags such as <p>, <h1>, and <span>. For example:

Text
This sentence is in English.
Esta frase es en español. (Spanish)
Mae'r frawddeg hon yn cymraeg. (Welsh)

Code
<p>This sentence is in English.</p>
<p lang="es">Esta frase es en español.</p> (Spanish)
<p lang="cy">Mae'r frawddeg hon yn cymraeg.</p> (Welsh)

Language Codes

Language codes are primarily taken from the list of ISO-639 language codes. This list has recently been expanded from an older two-letter set to a three-letter set (e.g. "eng" for English). If a language has both a three-letter code (e.g. "eng") and a two-letter code ("en"), then use the two-letter code. If there is only a three-letter code (e.g. "mas" for Masai), then use that code. Note that language codes are in lower case.

Languages can also have an optional regional code (usually an ISO-3166 country code) if more information about dialect is needed. Note that country codes are in all caps.
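The points above (a page-level language, a regional code, switching languages inside the page, and a separate encoding declaration) can be combined in one page. The following is only a sketch: the sample sentences, the title, and the choice of UTF-8 and of the "es-MX" regional code are assumptions made for illustration.

```html
<!-- Page language: English with the U.S. regional code -->
<html lang="en-US">
<head>
  <!-- The encoding is declared separately from the language -->
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  <title>Language tagging example</title>
</head>
<body>
  <!-- No lang attribute here, so the paragraph inherits en-US from the html tag -->
  <p>This paragraph is in U.S. English.</p>

  <!-- Lower-case language code plus upper-case country code -->
  <p lang="es-MX">Este párrafo está en español de México.</p>

  <!-- A single word in another language, marked with span -->
  <p>The Welsh word <span lang="cy">croeso</span> means "welcome".</p>
</body>
</html>
```

Any element without its own lang attribute takes the language of its nearest tagged ancestor, so only the passages that differ from the page language need to be marked.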
XHTML

In XHTML, the language is declared in the <html> tag with both the lang and xml:lang attributes, as follows:

<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">

Links

W3C Recommendations
● http://www.w3.org/International/O-HTML-tags.html

ISO-639 Language Codes
● http://www.loc.gov/standards/iso639-2/langcodes.html (Full List)
● http://babel.alis.com/langues/iso639.en.htm (2-Letter only)

ISO-3166 Country Codes
● http://www.din.de/gremien/nas/nabd/iso3166ma/codlstp1/en_listp1.html

Last Modified: January 2, 2002.

Encoding & Fonts by Language

Below is a list of common scripts with their encodings and recommended fonts.

Scripts: Roman = English Alphabet, Cyrillic = Russian Alphabet
Font Needed: N/A = no special fonts needed, <Language Kit> = generated by Mac Language Kit
Best Browsers: Although recommendations often refer to Netscape or Internet Explorer, other browsers such as Mozilla or Opera could be usable.

Languages Taught at Penn State

Arabic
  Font Needed: Win: Arial Unicode, Arabic Transparent, Tahoma, etc. Mac: AB Cairo, AB Geeza, AB Baghdad, AB Nadeema, etc.
  Encodings: Windows-1256, ISO-8859-6, Others
  Best Browsers: Win: Internet Explorer 5, Netscape 6. Mac: Netscape 6, iCab.

Chinese (Traditional)
  Font Needed: Win: MS Hei, Arial Unicode, etc. Mac: Taipei, Hei, etc.
  Encodings: Big5, EUC-TW, Others
  Best Browsers: Win: Recent Netscape or Internet Explorer. Mac: Most recent browsers plus Language Kits. If you cannot install Language Kits, then use Netscape 4.7.

Chinese (Simplified)
  Font Needed: Win: MS Song, Arial Unicode, etc. Mac: Beijing, Song, etc.
  Encodings: GB2312, GBK, Others
  Best Browsers: Win: Recent Netscape or Internet Explorer. Mac: Most recent browsers plus Language Kits. If you cannot install Language Kits, then use Netscape 4.7.

French
  Font Needed: N/A
  Encodings: ISO-8859-1
  Best Browsers: Most Browsers

German
  Font Needed: N/A
  Encodings: ISO-8859-1
  Best Browsers: Most Browsers

Greek (Modern)
  Font Needed: Win: Recent versions of Times New Roman, Arial, Tahoma, Comic Sans, Arial Unicode. Mac: Language Kits must be installed.
  Encodings: ISO-8859-7, Windows-1253
  Best Browsers: Win: Recent Netscape or Internet Explorer. Mac: Internet Explorer 5, Netscape 6.
  NOTES:
  1. Mac computers manually "draw" Greek characters. Not all characters may be rendered accurately.
  2. There is not wide support for certain Ancient Greek accents. Read any Ancient Greek Web site carefully for additional instructions.

Hebrew
  Font Needed: Win: Arial Unicode, David, David Transparent, Fixed Miriam Transparent. Mac: Arial Hebrew, Corsiva Hebrew, HB Arial, etc.
  Encodings: ISO-8859-8, Windows-1255
  Best Browsers: Win: Internet Explorer 5, Netscape 6. Mac: Netscape 6, iCab.

Italian
  Font Needed: N/A
  Encodings: ISO-8859-1
  Best Browsers: Most Browsers

Japanese
  Font Needed: Win: Arial Unicode, MS Gothic. Mac: Osaka and Osaka Å..., etc.
  Encodings: Shift_JIS, EUC-JP, Others
  Best Browsers: Win: Recent versions of Netscape or Internet Explorer. Mac: Most recent browsers plus Language Kits. If you cannot install Language Kits, then use Netscape 4.7.

Korean
  Font Needed: Win: GulimChe, Arial Unicode. Mac: Seoul, etc.
  Encodings: EUC-KR
  Best Browsers: Win: Recent Netscape or Internet Explorer. Mac: Most recent browsers plus Language Kits. If you cannot install Language Kits, then use Netscape 4.7.

Latin
  Font Needed: N/A
  Encodings: ISO-8859-1
  Best Browsers: All Browsers
  NOTES: There is not wide support for Latin long vowel marks. Most Web sites use Latin without long marks. If you need long marks, try encoding the page in ISO-8859-13 or Windows-1257. See the tips section on developing non-Latin-1 Web sites for more details.

Old English/Icelandic
  Font Needed: Win: N/A. Mac: Language Kit needs to be installed.
  Encodings: ISO-8859-1
  Best Browsers: Win: Recent Netscape or Internet Explorer. Mac: Internet Explorer 5, Netscape 6 (ð and þ may be displayed inconsistently).

Portuguese
  Font Needed: N/A
  Encodings: ISO-8859-1
  Best Browsers: Most Browsers

Russian
  Font Needed: Win: Recent versions of Times New Roman, Arial, Tahoma, Arial Unicode. Mac: Geneva CY, Times CY, Helvetica CY, Monaco CY, PrimaProj, Latinskij, etc.
  Encodings: Windows-1251, KOI-8
  Best Browsers: Win: Most Browsers. Mac: Most recent browsers plus Language Kits. If you cannot install Language Kits, then use Netscape 4.7.

Spanish
  Font Needed: N/A
  Encodings: ISO-8859-1
  Best Browsers: Most Browsers

Swahili
  Font Needed: N/A
  Encodings: ISO-8859-1
  Best Browsers: All Browsers

Other Languages

This list contains other languages not necessarily taught at Penn State. It is not complete, by any means. If anyone has a question about a particular language, please send an e-mail to Elizabeth Pyatt (ejp10@psu.edu).

Cyrillic
  Font Needed: Win: Recent versions of Times New Roman, Arial, Tahoma, Arial Unicode. Mac: Geneva CY, Times CY, Helvetica CY, Monaco CY, PrimaProj, Latinskij, etc.
  Encodings: Windows-1251, KOI-8-R, KOI-8-U (Ukrainian), Others
  Best Browsers: Win: Most Browsers. Mac: Most recent browsers plus Language Kits. If you cannot install Language Kits, then use Netscape 4.7.

Central Europe
  Font Needed: Win: Recent versions of Times New Roman, Arial, Tahoma, Arial Unicode. Mac: Geneva CE, Times CE, Helvetica CE, etc.
  Encodings: ISO-8859-2 (Latin 2), Windows-1250, Others
  Best Browsers: Win: Most Browsers. Mac: Most recent browsers plus Language Kits. If you cannot install Language Kits, then use Netscape 4.7.
  LANGUAGES: Includes Croatian (Serbo-Croatian in Roman alphabet), Czech, Hungarian, Polish, Romanian, Slovak, Slovenian.

Lithuanian and Latvian (Baltic)
  Font Needed: Win: Recent versions of Times New Roman, Arial, Tahoma, Arial Unicode. Mac: Language Kits should be installed.
  Encodings: Windows-1257, ISO-8859-13 (Latin 7), ISO-8859-4 (Latin 4)
  Best Browsers: Win: Internet Explorer 5, Netscape 4.7, Netscape 6. Mac: Internet Explorer 5, Netscape 6.

Thai
  Font Needed: Win: Arial Unicode, AngsanaNew, Tahoma. Mac: Third-party font.
  Encodings: TIS-620
  Best Browsers: Win: Internet Explorer 5, Netscape 6. Mac: Netscape 6.

Turkish
  Font Needed: Win: Recent versions of Times New Roman, Arial, Tahoma, Arial Unicode. Mac: <Language Kit>
  Encodings: ISO-8859-9, Windows-1254
  Best Browsers: Most Browsers

India & Sri Lanka
  Font Needed: See each Web site
  Encodings: N/A
  Best Browsers: Most Browsers
  NOTES: Many Web sites from India and Sri Lanka provide free fonts to download. Check each Web site for instructions.

Western Europe, etc.
  Font Needed: N/A
  Encodings: ISO-8859-1
  Best Browsers: Most Browsers
  LANGUAGES:
  Europe: Albanian, Basque, Catalan, Danish, Dutch, English, Faroese, Finnish, French, Gaelic (Scots), Galician, German, Icelandic, Irish, Italian, Norwegian, Portuguese, Spanish, Swedish. Some Baltic languages are not included.
  Elsewhere: Swahili and many Bantu languages, Hawaiian and many Polynesian languages, many Native American languages, Afrikaans. Does NOT include Vietnamese.

Last Modified: March 9, 2002.
Using FrontPage (PC)

The Technique

Many developers use Microsoft FrontPage on a PC in conjunction with the Windows keyboard utilities to create non-English Web sites. This is an effective tool, but care must be taken that the pages remain usable on computers other than PCs.

Tools Needed

Users need to have fonts compliant with the encoding.

Developers need a recent version of Microsoft FrontPage, the relevant fonts installed, and the Windows keyboard for that language or script installed and activated.

Configuring FrontPage

NOTE: These instructions are for the Windows version of FrontPage.

1. Follow the instructions for activating a Windows keyboard through the Regional Options Control Panel.
2. Open a new document in FrontPage.
3. Follow the instructions for switching Windows keyboards or "Input Locales".
4. You should be able to type in the foreign script in FrontPage.

When to Use It

This is best used for extended passages of scripts such as Cyrillic, Chinese, Japanese, Korean, Arabic, or Hebrew which are widely supported in browser preferences.

Potential Pitfalls

1. The HTML code must be inspected for extraneous or vendor-specific tags and modified accordingly. In particular, stray <FONT FACE> tags or style-sheet commands could make a file incompatible with certain browsers and platforms. Whenever possible, avoid using any <FONT FACE> tags or specifying fonts through a style sheet. Let the browser match the font with the encoding.
2. The output for some scripts, such as Arabic, may not be correct. In those cases, another method is recommended.
3. For U.S. audiences, it is best to provide instructions to users on how to configure their browsers.
4. Unfortunately, some scripts may be so undersupported that there may not be a viable encoding system available. In these cases another option should be used.
5. FrontPage will declare the encoding with the appropriate Microsoft Windows encoding scheme in a meta tag.

<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=windows-1251"> (Cyrillic Windows)
</HEAD>

In most cases, that is a preferred encoding, but there may be occasional exceptions depending on language or script.

NOTE: FrontPage actually encodes English Web sites not in "ISO-8859-1" (Latin-1), but in the very similar "Windows-1252". In most cases the results will be the same, but there may be occasional differences between the characters specified by Windows-1252 and by Latin-1.

Last Modified: March 9, 2002.

Using Dreamweaver

The Technique

If you wish to use Dreamweaver, it is suggested that you first export text from an international word processor into an HTML file, then modify it in Dreamweaver.

Tools Needed

Users need to have fonts compliant with the encoding.

Developers need a recent version of Dreamweaver and the relevant fonts installed.

Configuring Dreamweaver

You should also configure Dreamweaver to work with a non-English HTML file.

1. Open Dreamweaver, then under the Edit menu, choose Preferences to open the Preferences window.
2. In the Category menu to the left, select Fonts/Encoding.
3. In the Font Settings menu to the right, choose an appropriate script (e.g. "Cyrillic"). Be careful not to choose Default Encoding.
4. Select an appropriate font which matches that script from the Proportional Font, Fixed, and HTML Inspector pull-down menus. Click OK to close the window.
5. Open a document which is encoded in a non-English script. The characters should appear in that script, even in the HTML Source window.

When to Use It

This is best used for extended passages of scripts such as Cyrillic, Chinese, Japanese, Korean, Arabic, or Hebrew which are widely supported in browser preferences. Dreamweaver can be beneficial if, for some reason, you wish to avoid a Windows-encoded file.

Potential Pitfalls

1. Make sure an encoding is declared in a meta tag such as the one listed below.

<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=utf-8"> (Unicode)
</HEAD>

2. The HTML code should be inspected for extraneous or vendor-specific tags and modified accordingly. In particular, stray <FONT FACE> tags or style-sheet commands could make a file incompatible with certain browsers and platforms. Whenever possible, avoid using any <FONT FACE> tags or specifying fonts through a style sheet. Let the browser match the font with the encoding.
3. For U.S. audiences, it is best to provide instructions to users on how to configure their browsers.
4. For languages whose encoding systems are not widely supported by browsers, the text editor and Dreamweaver can still be used to develop the Web page, but you will need to take extra steps to provide information on recommended browsers and fonts.
5. Unfortunately, some scripts may be so undersupported that there may not be a viable encoding system or text editor available. In these cases another option should be used.

Last Modified: March 9, 2002.

Export HTML Text from an International Word Processor

The Technique

The ideal way to develop a non-Roman site is to encode text in a standard character encoding system for the given script. The easiest way to do that is to purchase a text editor or word processing program designed for that script. The encoded text can then be exported as an HTML file.

Tools Needed

Users need to have fonts compliant with the encoding.

Developers need a text editor or word processor developed for a specific script which includes an export-to-HTML utility. For instance, the Global Writer international word processor allows export into HTML for many scripts. Other script-specific word processors such as Chinese Star may include a similar export-to-HTML utility. In some cases you will need to select an appropriate encoding for the script.
NOTE: Because of Microsoft formatting issues, export from Microsoft Word is not recommended.

When to Use It

This is best used for extended passages of scripts such as Cyrillic, Chinese, Japanese, Korean, Arabic, or Hebrew which are widely supported in browser preferences.

Potential Pitfalls

1. Make sure the HTML file declares the encoding system at the beginning of the HTML file.

<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ENCODING HERE">
</HEAD>

2. Any HTML code should be inspected for extraneous or vendor-specific tags and modified accordingly. In particular, stray <FONT FACE> tags or style-sheet commands could make a file incompatible with certain browsers and platforms. Whenever possible, avoid using any <FONT FACE> tags or specifying fonts through a style sheet. Here's an example from Swarthmore of how to tweak exported Chinese text in HTML with Claris Homepage.
3. For U.S. audiences, it is best to provide instructions to users on how to configure their browsers.
4. For languages whose encoding systems are not widely supported by browsers, the text editor can still be used to develop the Web page, but you will need to take extra steps to provide information on recommended browsers and fonts.
5. Unfortunately, some scripts may be so undersupported that there may not be a viable encoding system or text editor available. In these cases, other options should be used.

Last Modified: March 9, 2002.
Text Alignment Options

A. Right Aligned Text

Text enclosed in the tags <DIV ALIGN="RIGHT">...</DIV> will be right aligned.

HTML Code
<DIV ALIGN="RIGHT">
<P>Look for me WAAAAAY on the right.</P>
</DIV>

Result
Look for me WAAAAAY on the right.

Web sites geared towards developing Hebrew or Arabic sites discuss other strategies, but they may not work on all browsers. One of these Web sites is www.microsoft.com/globaldev/articles/mideast.asp.

B. Vertical Text

For purposes of parsing textual data and screen reader access, it's best to use horizontal text whenever possible. If you absolutely need vertical text, image files or PDFs are the best alternatives. In the near future, style sheet options, such as those discussed in the proposed ruby system, will allow developers to generate vertical text pages for the Web.

Last Modified: March 9, 2002.

PDF Files

This section discusses tips and strategies for using PDF files to deliver non-English content.

The Technique

PDF (Portable Document Format) files are readable and printable by all users, regardless of what fonts may be installed on their computers. Adobe Acrobat can be used to convert files composed in a word processor such as MS Word to PDF.
Tools Needed

Site users must have Acrobat Reader (http://www.adobe.com/products/acrobat/readstep.html), FREE FROM ADOBE.

NOTE: All Student Computing Lab machines have Acrobat Reader installed.

Developers must own Adobe Acrobat (full version), Acrobat Writer, or Acrobat Distiller, all available from Adobe. This software may be purchased from the Penn State MOC.

When to Use It

PDFs are excellent for longer documents and can preserve a great deal of formatting and graphic information. PDFs may be a good solution for developers with a large archive of foreign-language documents in word-processing format. On the other hand, PDFs are not optimal for short passages. Best of all, you can use any font in a PDF (print or Internet).

Potential Pitfalls

PDF technology is widely used and supported by the Penn State community, but here are a few minor quirks to look out for.

1. Once files are in PDF format, they are difficult to edit. Always keep the original text file on hand, in case you need to make changes.
2. Not all fonts are "licensed" for PDFs. In these cases you need to find a similar font which is licensed.
3. Students with older computers may need to download Acrobat Reader in order to use your files. Many sites warn users of the need for Acrobat Reader, then point them to Adobe's Web site. Adobe will even let Web designers use their download graphic on a Web site.
4. PDF files must be downloaded onto the user's machine. If they are large, they may be slow to download over a modem connection. Many Web sites list file sizes, so users are aware of potentially long download times.
5. Strictly speaking, PDF files do not always meet "accessibility" requirements. Visually impaired students with screen readers may not be able to read PDF files. For some minority languages, this may be a moot point, but for languages like Spanish or French, the issue may be more critical.

Last Modified: March 9, 2002.

Using Image Files

The Technique

Use a graphic image in .gif (GIF) format of the desired text. This also allows for maximum control over the visual appearance of a piece of text. Below are some examples of how to incorporate GIF images.

Buttons

These buttons show "PSU" in Runic, Cyrillic, and Cherokee. Buttons do NOT work. (Author does not guarantee 100% accuracy of transliteration.)

Runic "PSU"
Cyrillic "PSU"
Cherokee "PSU"

GIFs Masquerading as Text

This list combines Latin-1 text with image files. Again, links are non-functional.

● PSU (This is text)
● (Image of Runic PSU)
● (Image of Cyrillic PSU)
● (Image of Cherokee PSU)

This technique is often used in sites with links to translated pages. The non-Roman script images have to be manually colored the same as your link color, with the underline inserted. In addition, care must be taken with the layout to preserve the illusion of "textness", especially across platforms. See below for a not-so-good example.

PSU | | |

Tools Needed

Users only need a graphical browser such as Netscape or Internet Explorer, making images nearly universal.
Developers need a graphics program (Photoshop, Painter, file converters) with which to generate or convert files in the .gif format.

When to Use It

This is best used for small pieces of foreign-language text such as buttons, stray non-Latin-1 glyphs, or short link text. Graphic buttons are often used to point users to properly encoded Web pages or PDF files written in the relevant language.

Potential Pitfalls

1. Remember to include the alt="text" and title="text" attributes when inserting image files into HTML. The ALT attribute is useful for users who have turned off their graphics or who rely on synthesized screen readers.

HTML Code (no ALT attribute)
<img src="../../graphics/IPAschwa.gif">
Result (Graphics Disabled): -IMAGE-

HTML Code (with ALT and TITLE attributes)
<img src="../../graphics/IPAschwa.gif" alt="schwa" title="schwa">
Result (Graphics Disabled): -schwa-

2. Use .gif files instead of .jpg files, which are better suited to photographs.
3. Once files are in .gif format, the ability to edit type is lost. Always keep the original graphics file on hand, in case you need to make changes.
4. Graphics files are not scalable. An in-text graphics file that looks fine on one platform may look out of scale on another. Some minor adjustments may be necessary.
5. Print quality of an image file of a glyph is generally lower than that of a textual equivalent.

Last Modified: March 9, 2002.
Using the <FONT FACE> Tag

The Technique

Use a print font (such as the font Symbol for Greek text) instead of an Internet font to display foreign-language material. The font is specified in HTML with the tags <FONT FACE="(font name)"> </FONT>.

This technique is not well received because of implementation problems, the most important of which is that if users can't download the right font onto their machines, the site is unreadable.

This technique works best for Web sites which can reasonably expect users to return. This would include user groups, news sites, and possibly class sites. For instance, some news organizations offer free print fonts to their users so that the site is usable on all platforms.

If you like, you can view an SIL Encore phonetic transcription test file for Macintosh or an SIL Encore phonetic transcription test file for Windows. Of course, you will not be able to read these files until you download the SIL Encore IPA fonts (www.sil.org/computing/fonts/encore-ipa.html) from the Summer Institute of Linguistics (www.sil.org), available in Mac & PC formats.

Incidentally, because the SIL fonts are designed for PRINT use and not INTERNET use, the glyph-character number mappings are different in the Mac & PC versions of the font. Therefore, I had to create two separate files (a Mac version and a PC version), so that the right glyphs appear in the browser.

Tools Needed

Developers MUST provide information to users on downloading and installing the print font in question (both Mac & PC). Even if the font is the "same" in Windows and Mac, you may have to develop dual versions of the same text.
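As an illustration of the technique and of its main weakness, here is a small sketch using the Symbol font mentioned above. It assumes the usual Symbol layout, in which the Latin letters a, b, and g are drawn as alpha, beta, and gamma; the sentence text is invented for the example, and the second paragraph assumes the page is saved and declared as UTF-8.

```html
<!-- Print-font approach: the file really contains the Latin letters "abg".
     With Symbol installed they are drawn as Greek glyphs; without it,
     or to a search engine or screen reader, they are just "abg". -->
<p>The first three Greek letters: <font face="Symbol">abg</font></p>

<!-- Encoded alternative, for comparison: the file contains real Greek
     characters, so the page must declare a matching encoding (UTF-8 assumed). -->
<p>The first three Greek letters: <span lang="el">αβγ</span></p>
```

The first paragraph depends entirely on the reader's installed fonts, which is why this approach is treated here as a workaround rather than a recommended method.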
NOTE ON CAC LABS: Students can install a font temporarily from a floppy disk on a MACINTOSH ONLY. The font will be available on that single machine until the user logs off. Users CANNOT install fonts on PCs. Faculty can request that fonts be installed on Student Computing Lab machines, but should be prepared to provide information about the course, the font supplier, and licensing.

When to Use It

This could be a way to minimize file size or clean up images for large documents written in an undersupported script such as Cherokee, Ogam, or minority South Asian scripts, or in a non-living language. Generally speaking, the print quality of the glyphs will be better than that of a .gif image file of a glyph. Make sure your audience is willing to download the font (a free font is better). This technique may be used in conjunction with larger PDF files.

Here are some Web sites which offer print fonts:

● www.info.lk/slword/swdowns.htm - Fonts for Sinhala (Sri Lanka) newsgroups
● www.perseus.tufts.edu/Help/fonthelp.html - Perseus Ancient Greek Font support

The Pitfalls

1. If the user can't download and install a font, the site is useless. Provide multiple download links, or distribute the font on disk (for example, to a class), if possible.

2. If you are using characters generated by keystrokes outside the ASCII range (e.g. a Windows ALT key code or Macintosh Option key code), check to be sure the file is readable on both platforms. You may have to develop two files.

3. Search engines and screen readers will treat the Web site as Latin-1 text and read it as such (resulting in a string of nonsense characters). A separate audio file or additional text description may be needed for visually impaired users.

4. If you provide a font that is not yours, you must read the font licensing conditions.
Many can be distributed free for non-commercial use, but there may be additional restrictions.

Dynamic Font Technology

Some Web sites use "dynamic font" technology, in which a specific font is automatically downloaded onto a user's computer. However, Netscape and Internet Explorer implement it differently, and the two implementations are not cross-compatible. Typically these Web sites are viewable in only one browser. Here are some sites which use dynamic fonts:

● Alabama Dictionary Characters (Native American language)
● Australian Phonetic Course (Netscape only)

ASCII Substitution

The Technique

Use an ASCII substitute for a non-Latin-1 glyph. For instance, Welsh Web texts replace "circumflex w" (ŵ) with plain "w" or "w+". Similarly, many Old Irish scholars replace the "amperagus" (⁊) symbol (the Old Irish "&" symbol) with just the number seven (7). ASCII substitutions for phonetic symbols are very common; there is a standardized ASCII substitution key for the IPA phonetic alphabet.

Tools Needed

Developers need a keyboard. All users will be able to see the glyph.

When to Use It

Use this only with a Roman script, in cases where one or two glyphs are missing from ISO-8859-1. It is best used as a last resort when other resources fail.

Potential Pitfalls

1. Search engines and screen readers will not be able to parse it. Put a keymap (e.g. "7" = "amperagus") on the home page explaining your substitutions.

2. You should not use this technique to replace non-ASCII glyphs (e.g.
é, ¢) available through Latin-1 HTML special entity codes.

HTML - Special Entity Codes

This Web page contains lists of the special entity codes needed in HTML to generate special characters such as ñ, ¢, and ÷. Full instructions are in the "Using the Codes" section, followed by lists organized by character type.

NOTE: If you are composing Web pages in an HTML editor such as Dreamweaver or FrontPage, the program may generate the codes based on what is typed in (check the HTML to be sure). See the Accents section for more information on typing or inputting accents through a keyboard on a PC or Macintosh computer.

Contents

1. Using the Codes
2. Letters with Accents - (e.g. ó, ò, ñ)
3. Other Foreign Characters - (e.g. ç, ¿, ß)
4. Currency Symbols - (e.g. ¢, £, ¥)
5. Math Symbols - (e.g. ±, °, ÷)
6. Other Punctuation - (e.g. &, ©, §)
7. Links to Other References

Using the Codes

To input non-English characters into a Web page, HTML employs a series of entity codes enclosed by an & on the left side and a ; (semicolon) on the right.

HTML SPECIAL CHARACTER TEMPLATE: &(code);

For example, the code for ç is "ccedil". To generate French ç in HTML, type the code &ccedil; into your HTML document, as in:

HTML - fran&ccedil;ais
Result - français

Here's another example using &cent; for ¢.

HTML - It cost 5&cent;.
Result - It cost 5¢.

Some characters like œ (#156) are known by a number, not an entity code.
For these characters the template is:

HTML CHARACTER NUMBER TEMPLATE: &#(number);

For example, to input sœur, the French word for sister, use the following code:

HTML - s&#156;ur 'sister'
Result - sœur 'sister'

The number 156 is the character's position in the Windows code page; its Unicode number is 339, so &#339; is the standards-based form of the same code.

Letters with Accents

This list is organized by accent type. To determine the appropriate entity code, match the accent with the vowel. The general template for each accent is given next to the accent name. For instance, &Vcirc; means that all the entity codes for vowels with circumflex accents contain "circ" as part of the code.

Example 1: To input circumflex â in HTML, type in &acirc;.
Example 2: To input circumflex ô in HTML, type in &ocirc;.

Accent (template)      a/A          e/E          i/I          o/O          u/U
Acute (&Vacute;)       á &aacute;   é &eacute;   í &iacute;   ó &oacute;   ú &uacute;
                       Á &Aacute;   É &Eacute;   Í &Iacute;   Ó &Oacute;   Ú &Uacute;
Circumflex (&Vcirc;)   â &acirc;    ê &ecirc;    î &icirc;    ô &ocirc;    û &ucirc;
                       Â &Acirc;    Ê &Ecirc;    Î &Icirc;    Ô &Ocirc;    Û &Ucirc;
Grave (&Vgrave;)       à &agrave;   è &egrave;   ì &igrave;   ò &ograve;   ù &ugrave;
                       À &Agrave;   È &Egrave;   Ì &Igrave;   Ò &Ograve;   Ù &Ugrave;
Umlaut (&Vuml;)        ä &auml;     ë &euml;     ï &iuml;     ö &ouml;     ü &uuml;
                       Ä &Auml;     Ë &Euml;     Ï &Iuml;     Ö &Ouml;     Ü &Uuml;

Tilde (&Vtilde;), which occurs only on a, n, and o:

                       ã &atilde;   ñ &ntilde;   õ &otilde;
                       Ã &Atilde;   Ñ &Ntilde;   Õ &Otilde;

If you are having problems inputting these codes, please review the instructions for using the codes at the top of this Web page.

Other Foreign Characters

Example 1: To generate the upside-down question mark ¿, type &iquest; into the HTML code.
Example 2: To generate the French oe ligature œ, type &#156; into the HTML code.

SYMBOL    CODE                   NOTES
¡         &iexcl;
¿         &iquest;
ç, Ç      &ccedil; &Ccedil;
œ, Œ      &#156; &#140;
ß         &szlig;
ø, Ø      &oslash; &Oslash;
å, Å      &aring; &Aring;
æ, Æ      &aelig; &AElig;
þ, Þ      &thorn; &THORN;
ð, Ð      &eth; &ETH;
« »       &laquo; &raquo;        Spanish-style quotation marks
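
To see several of these codes working together, here is a small sketch of a Latin-1 page. The sample phrases and the page title are invented for illustration, and the meta tag assumes the page is declared as ISO-8859-1; none of this markup comes from the original site.

```html
<html>
<head>
  <!-- Declare the encoding so the browser does not have to guess. -->
  <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
  <title>Entity code sample</title>
</head>
<body>
  <!-- Named entity codes follow the &(code); template. -->
  <p>fran&ccedil;ais</p>
  <p>&iquest;Qu&eacute; tal?</p>
  <p>&laquo;Stra&szlig;e&raquo;</p>
  <!-- The numeric template &#(number); works too: 233 is the number for e with an acute accent. -->
  <p>caf&#233;</p>
</body>
</html>
```

Opened in a browser, the four paragraphs should read français, ¿Qué tal?, «Straße», and café. Because every special character is written as a code, the file itself contains only ASCII and survives being edited on either a Macintosh or a PC.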
Currency Symbols

Example: To generate the cent sign ¢, type &cent; into the HTML code.

SYMBOL    CODE        NOTES
¢         &cent;
£         &pound;     British Pound
¥         &yen;       Japanese Yen
¤         &curren;    Generic currency symbol

Math Symbols

Example: To generate the division sign ÷, type &divide; into the HTML code.

SYMBOL    CODE        NOTES
÷         &divide;
°         &deg;       Degree symbol
¬         &not;       Not symbol
±         &plusmn;
µ         &micro;     Micro

Other Punctuation

Example 1: To generate the "and" symbol (&), type in &amp;.
Example 2: To display the literal string &amp; on a Web page, type &amp;amp;.

SYMBOL           CODE       NOTES
(blank space)    &nbsp;     Inserts a blank space
<                &lt;
>                &gt;
&                &amp;
"                &quot;     Regular quotes are fine, but avoid "Smart Quotes"
©                &copy;
®                &reg;
™                &trade;    Trademark
¶                &para;     Paragraph Symbol
•                &bull;     List Dot
§                &sect;     Section Symbol
–                &ndash;    en-dash
—                &mdash;    em-dash

Links to External Reference Pages

Ian S. Graham (Wiley) - www.wiley.com/legacy/compbooks/graham-quin/html4ed/appa/en_test.html

Webmonkey - hotwired.lycos.com/webmonkey/reference/special_characters/
Avoid the first set of entries on the Webmonkey page ("left single quote" to "trademark sign"); these are not widely supported across browsers.