Convert Typewriter Quotes to Typographic Quotes

Here is a script that is still a work in progress.

I'm posting it now because it already does useful work and should handle most text correctly, though not every case. The languages currently implemented are Afrikaans (af), Albanian (sq), Belarusian (be), English (en), Estonian (et), Finnish (fi), French (fr), German (de), Icelandic (is), Lithuanian (lt), Macedonian (mk), Polish (pl), Russian (ru), Slovak (sk), Slovenian (sl), Spanish (es), Swedish (se), and Ukrainian (uk). As you can see from the script, some languages share the same quotes. Note: for French the script does not use single guillemets; apostrophes become single curly quotes instead.
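The per-language choices can be pictured as a lookup table. The sketch below is only an illustration: the script itself uses an if/elif chain, the names QUOTE_STYLES and quotes_for are hypothetical, and just a few of the listed languages are shown, with values copied from the script.

```python
# Hypothetical lookup table; the posted script uses an if/elif chain instead.
# Each entry is (lead_double, follow_double, lead_single, follow_single).
QUOTE_STYLES = {
    "en": (u"\u201c", u"\u201d", u"\u2018", u"\u2019"),
    "af": (u"\u201c", u"\u201d", u"\u2018", u"\u2019"),  # same as English
    "fr": (u"\u00ab", u"\u00bb", u"\u2018", u"\u2019"),  # no single guillemets
    "pl": (u"\u201e", u"\u201d", u"\u201a", u"\u2019"),
    "fi": (u"\u201d", u"\u201d", u"\u2019", u"\u2019"),
    "se": (u"\u201d", u"\u201d", u"\u2019", u"\u2019"),  # same as Finnish
    "ru": (u"\u00ab", u"\u00bb", u"\u2039", u"\u203a"),
}


def quotes_for(lang):
    """Return the four quote characters for lang, or None if unsupported."""
    return QUOTE_STYLES.get(lang)
```

A table like this makes it easy to see which languages share a style, at the cost of departing from the layout of the posted script.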

The Logic
This is always an important issue, because whatever algorithm you use, there will be exceptions. This example script makes no allowance for a " or ' that you might want to preserve. In English, I don't think writing 3' 6" for 3ft 6in (or 3 feet, 6 inches) would be preferable anyway, at least not when you want typographic quotes. Geographical coordinates might be an exception, but these are uncommon.

The logic here mostly focuses on leading or following spaces, with consideration for the special situations of a quote at the very beginning or end of a frame. There also had to be some adjustments for the beginning or end of a paragraph. I believe I have it successfully modified to handle nested quotes now.

For single quotes there are contractions to deal with. I followed English usage here, but I'm not sure that is accurate for German, let alone other languages. The script will also fail for something like 'twas, and it needs a fix for a quote between a letter and a following comma or period, which is sometimes done in American English.
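The contraction problem can be seen in a small sketch. This is a simplified illustration with hypothetical function names, not the script's exact tests: a single quote between two letters is treated as an apostrophe, and anything else is guessed from the surrounding spaces.

```python
def is_inner_apostrophe(prev_char, next_char):
    """True when a typewriter single quote sits between two letters."""
    return prev_char.isalpha() and next_char.isalpha()


def classify_single(prev_char, next_char):
    """Return 'follow' for an inner apostrophe, else guess from spacing."""
    if is_inner_apostrophe(prev_char, next_char):
        return "follow"  # don't, it's, je t'aime
    if prev_char == " " and next_char != " ":
        return "lead"    # also catches 'twas, where an apostrophe is wanted
    return "follow"
```

Because 'twas looks exactly like the start of a quotation (space before, letter after), spacing alone cannot tell the two apart.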

This first version also will not track through linked frames, so you'll have to do that manually. One could, of course, change the logic to automatically scan the document for all text frames rather than just operate on the currently selected frame, but I had misgivings about this.
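If you did want to scan for all text frames, the first step would be to pick them out of the page item list. The sketch below assumes that scribus.getPageItems() returns tuples whose first two fields are the object name and its type, as the script's item[0] and item[1] tests imply, and that type 4 means a text frame. The function name and the sample data are invented for illustration.

```python
TEXT_FRAME = 4  # object type the script tests for (item[1] != 4)


def text_frame_names(pageitems):
    """Return the names of all text frames in a getPageItems-style list."""
    return [item[0] for item in pageitems if item[1] == TEXT_FRAME]


# Sample data standing in for scribus.getPageItems(); the names are invented.
sample_items = [("Text1", 4, 0), ("Image2", 2, 1), ("Text3", 4, 2)]
frames = text_frame_names(sample_items)
```

Inside Scribus you would pass the real page item list and then run the quote conversion on each returned name in turn.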

The logic goes something like this:
 * First we check to see if we're on the last character, so that getText doesn't fail on getting the character following the current one, and we can artificially set a nextchar value.
 * After we get the current character we have to check that it isn't a non-printing character that might have a length of 0 or greater than 1. If it is, we skip over it.
 * A double quote that is the first character will be a lead_double.
 * A double quote preceded by a space, or not followed by a space, will be a lead_double. If you accidentally put double quotes between letters, I'm not sure what's right, but this is something like the situation with a following period or comma.
 * Everything else will be follow_double.
 * The logic for single quotes is similar, except that a single quote between non-spaces will be a follow_single.
 * I think I have fixed following periods and commas now. The main logic is that in a sequence "'. or "', the middle single quote can only sensibly be a following quote, with similar logic for '". and '",. There could always be exceptions, but they are not likely. Similarly, x'. and x', follow the same logic (x being any other character).
 * Beginning-of-paragraph quotes are still giving trouble. The compromise at this point is that double quotes come out OK and single quotes don't, but the bonus is that a paragraph beginning with 'Twas comes out OK!
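The rules in this list can be tried out on a plain Python string, without Scribus. The sketch below is a simplified illustration under stated assumptions: the function name curl_quotes is hypothetical, the defaults are the English quotes, and frame or paragraph boundaries are not modeled beyond treating the end of the string as a space.

```python
DQ = chr(34)  # typewriter double quote
SQ = chr(39)  # typewriter single quote


def curl_quotes(text, lead_double=u"\u201c", follow_double=u"\u201d",
                lead_single=u"\u2018", follow_single=u"\u2019"):
    """Replace typewriter quotes in text using the spacing rules above."""
    out = []
    prevchar = " "
    for c, char in enumerate(text):
        # Past the last character, pretend the next one is a space.
        nextchar = text[c + 1] if c + 1 < len(text) else " "
        new = char
        if char == DQ:
            if c == 0:
                new = lead_double
            elif prevchar in ".,?!":
                new = follow_double
            elif prevchar == SQ and nextchar not in " ,.":
                new = lead_double
            elif nextchar in ".,":
                new = follow_double
            elif prevchar == " " or (nextchar != " " and nextchar != SQ):
                new = lead_double
            else:
                new = follow_double
        elif char == SQ:
            if c == 0:
                new = lead_single
            elif prevchar in ".,?!":
                new = follow_single
            elif prevchar == DQ and nextchar not in " ,.":
                new = lead_single
            elif prevchar != " " and prevchar != DQ and nextchar != " ":
                new = follow_single  # between non-spaces: an apostrophe
            elif prevchar == " " or (nextchar != " " and nextchar != DQ):
                new = lead_single
            else:
                new = follow_single
        out.append(new)
        prevchar = char  # compare against the original character
    return u"".join(out)
```

Working on a string makes it easy to try awkward inputs before running anything against a real document.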

You may notice the tests ord(char) == 34 and ord(char) == 39. These are the ASCII codes for the double and single (typewriter) quotes respectively. I felt it would be visually messy to put the quote character itself inside quotes, and that doing so would increase the likelihood of typographical errors in the script.
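The same idea can be made a little more readable with named constants. This is a minimal sketch; the constant and function names are hypothetical and do not appear in the script.

```python
DOUBLE_QUOTE = 34  # ASCII code of the typewriter double quote
SINGLE_QUOTE = 39  # ASCII code of the typewriter single quote


def is_typewriter_quote(char):
    """True if char is a one-character string holding a typewriter quote."""
    return len(char) == 1 and ord(char) in (DOUBLE_QUOTE, SINGLE_QUOTE)
```

The length check comes first because ord() only accepts a single character, which is also why the script skips anything whose length is not 1.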

Note: For your particular language you may find that replacing apostrophes with some form of curly quote doesn't work. In that case you can always change lead_single and follow_single so that they "replace" the apostrophe with itself:

    lead_single = "'"
    follow_single = "'"

You could always delete the section for analyzing your doc for apostrophes, but then you lose the capability for other languages.

quotes.py
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# File: quotes.py - changes typewriter quotes to typographic quotes
# © 2010.08.28 Gregory Pittman
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
"""
USAGE

You must have a document open, and a text frame selected.
There will be a valueDialog asking for your language for the quotes;
the default is 'en', but change the default to suit your needs.
Detected errors shut down the script with an appropriate message.

"""
import sys  # needed for the sys.exit() calls below

import scribus

if scribus.haveDoc():
    c = 0
    lang = scribus.valueDialog("Choose language",
        'Language: en, de, fr, pl, se, fi, ru, af, sq,\n be, uk, es, lt, mk, is, sk, sl, and et\n are current choices',
        'en')
    if (lang == 'en'):
        lead_double = u"\u201c"
        follow_double = u"\u201d"
        lead_single = u"\u2018"
        follow_single = u"\u2019"
    elif (lang == 'de'):
        lead_double = u"\u201e"
        follow_double = u"\u201c"
        # Swapped from the original, which had the low quote as the closing mark.
        lead_single = u"\u201a"
        follow_single = u"\u2019"
    elif (lang == 'fr'):
        lead_double = u"\u00ab"
        follow_double = u"\u00bb"
        lead_single = u"\u2018"
        follow_single = u"\u2019" # am hoping this will cover contractions like je t'aime
    elif (lang == 'pl'):
        lead_double = u"\u201e"
        follow_double = u"\u201d"
        lead_single = u"\u201a"
        follow_single = u"\u2019"
    elif ((lang == 'se') or (lang == 'fi')):
        lead_double = u"\u201d"
        follow_double = u"\u201d"
        lead_single = u"\u2019"
        follow_single = u"\u2019"
    elif (lang == 'af'):
        lead_double = u"\u201c"
        follow_double = u"\u201d"
        lead_single = u"\u2018"
        follow_single = u"\u2019"
    elif (lang == 'sq'):
        lead_double = u"\u201e"
        follow_double = u"\u201c"
        lead_single = u"\u2018"
        follow_single = u"\u2019"
    # The original repeated a separate, unreachable 'uk' branch after this one.
    elif ((lang == 'be') or (lang == 'uk') or (lang == 'ru')):
        lead_double = u"\u00ab"
        follow_double = u"\u00bb"
        lead_single = u"\u2039"
        follow_single = u"\u203a"
    elif (lang == 'es'):
        lead_double = u"\u00ab"
        follow_double = u"\u00bb"
        # The original assigned follow_double twice and never set follow_single;
        # the single quotes here are assumed to match English.
        lead_single = u"\u2018"
        follow_single = u"\u2019"
    elif ((lang == 'lt') or (lang == 'mk') or (lang == 'is') or (lang == 'sk') or (lang == 'sl') or (lang == 'et')):
        lead_double = u"\u201e"
        follow_double = u"\u201c"
        # Swapped from the original, as for German.
        lead_single = u"\u201a"
        follow_single = u"\u2019"
    else:
        scribus.messageBox('Language Error', 'You need to choose an available language', icon=0, button1=1)
        sys.exit(2)
else:
    scribus.messageBox('Usage Error', 'You need a Document open', icon=0, button1=1)
    sys.exit(2)

if scribus.selectionCount() == 0:
    scribus.messageBox('Scribus - Usage Error',
        "There is no object selected.\nPlease select a text frame and try again.",
        scribus.ICON_WARNING, scribus.BUTTON_OK)
    sys.exit(2)
if scribus.selectionCount() > 1:
    scribus.messageBox('Scribus - Usage Error',
        "You have more than one object selected.\nPlease select one text frame and try again.",
        scribus.ICON_WARNING, scribus.BUTTON_OK)
    sys.exit(2)
textbox = scribus.getSelectedObject()
pageitems = scribus.getPageItems()
boxcount = 1
for item in pageitems:
    if (item[0] == textbox):
        if (item[1] != 4):
            scribus.messageBox('Scribus - Usage Error', "This is not a textframe. Try again.",
                scribus.ICON_WARNING, scribus.BUTTON_OK)
            sys.exit(2)
contents = scribus.getTextLength(textbox)
# Defined up front in case the frame is empty or its first character is skipped.
prevchar = ' '
char = ''
while c <= (contents - 1):
    # On the last character there is no following one to read, so fake a space.
    if ((c + 1) > contents - 1):
        nextchar = ' '
    else:
        scribus.selectText(c+1, 1, textbox)
        nextchar = scribus.getText(textbox)
    scribus.selectText(c, 1, textbox)
    char = scribus.getText(textbox)
    # Skip non-printing characters, whose length may be 0 or more than 1.
    if (len(char) != 1):
        c += 1
        continue
    # Double quotes (ASCII 34)
    if ((ord(char) == 34) and (c == 0)):
        scribus.deleteText(textbox)
        scribus.insertText(lead_double, c, textbox)
    elif (ord(char) == 34):
        if ((prevchar == '.') or (prevchar == ',') or (prevchar == '?') or (prevchar == '!')):
            scribus.deleteText(textbox)
            scribus.insertText(follow_double, c, textbox)
        elif ((ord(prevchar) == 39) and ((nextchar != ' ') and (nextchar != ',') and (nextchar != '.'))):
            scribus.deleteText(textbox)
            scribus.insertText(lead_double, c, textbox)
        elif ((nextchar == '.') or (nextchar == ',')):
            scribus.deleteText(textbox)
            scribus.insertText(follow_double, c, textbox)
        # nextchar is compared with chr() rather than ord(), since ord() fails
        # if nextchar is not exactly one character long.
        elif ((prevchar == ' ') or ((nextchar != ' ') and (nextchar != chr(39)))):
            scribus.deleteText(textbox)
            scribus.insertText(lead_double, c, textbox)
        else:
            scribus.deleteText(textbox)
            scribus.insertText(follow_double, c, textbox)
    # Single quotes (ASCII 39)
    if ((ord(char) == 39) and (c == 0)):
        scribus.deleteText(textbox)
        scribus.insertText(lead_single, c, textbox)
    elif (ord(char) == 39):
        if ((prevchar == '.') or (prevchar == ',') or (prevchar == '?') or (prevchar == '!')):
            scribus.deleteText(textbox)
            scribus.insertText(follow_single, c, textbox)
        elif ((ord(prevchar) == 34) and ((nextchar != ' ') and (nextchar != ',') and (nextchar != '.'))):
            scribus.deleteText(textbox)
            scribus.insertText(lead_single, c, textbox)
        elif ((prevchar != ' ') and (ord(prevchar) != 34) and (nextchar != ' ')):
            scribus.deleteText(textbox)
            scribus.insertText(follow_single, c, textbox)
        elif ((prevchar == ' ') or ((nextchar != ' ') and (nextchar != chr(34)))):
            scribus.deleteText(textbox)
            scribus.insertText(lead_single, c, textbox)
        else:
            scribus.deleteText(textbox)
            scribus.insertText(follow_single, c, textbox)
    c += 1
    prevchar = char

scribus.setRedraw(1)
scribus.docChanged(1)
endmessage = 'Successfully ran script\n Last character read was '+str(char) # Change this message to your liking
scribus.messageBox("Finished", endmessage,icon=0,button1=1)