Text and Text Manipulation

I recently finished a script that creates a variable, sometimes large, number of text frames, and along the way I learned quite a bit about how to use the various commands for styles and text. Some of this is not found in the online manual, and would be difficult to include there. What I'd like to do here is save others the time it took me to find out these important aspects of scripting.

The Setting
The particular project I was involved with took color data from a file, created colors from that data, then made a document which displayed those colors with an informational label underneath each one. Here is a small detail from the eventually created pages:



This is the appearance in Scribus, so in the actual document these text frames will have no color to the border. The top line in the text frame is the name of the color, then the column of numbers underneath represents the L, a, and b values that created the above color. The input came from a plain text file, where entries look like this:

HLC 010 50 60	50	59,1	10,4
HLC 010 50 70	50	68,9	12,2

To the left we see the color name, represented by 3 letters followed by 7 digits with some intervening spaces. Although not visible here, there are tabs separating the name from the 3 following numbers and the numbers from each other. This helps as we try to parse these lines with Python. Notice that the numbers use a comma as a decimal separator, something else we will need to contend with, since to create a color from these Python will need to have floating point numbers.

Parsing the Color Data
The script opens the file that we have identified with a fileDialog and reads it line by line. Each line is read as a string, which is then split using a tab ('\t') as the separator and assigned as a list to the variable content. We have previously created empty lists: colors[], L[], a[], and b[]. The color name remains a string and is appended to colors[]. The L value is the item of content[] which is 3rd from the end, the a value is 2nd from the end, and b is the last item. From the re module that was imported at the beginning of the script (re contains regex operations) we use the re.sub method to change the commas to periods. The result is of course still a string, so we must then convert each one to a float value, which we append to the appropriate list.
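The parsing steps just described can be sketched roughly as follows. This is a minimal reconstruction, not the original listing: the function name parse_color_line is made up for illustration, taking the name from the first field is an assumption, and the two sample lines stand in for the file chosen with fileDialog.

```python
import re


def parse_color_line(line):
    """Split one tab-separated line into (name, L, a, b)."""
    content = line.rstrip("\r\n").split("\t")
    # Assumption: the color name is the first tab-separated field.
    name = content[0]
    # L, a, b are the last three fields; decimal commas become periods
    # before the strings are converted to floats.
    L_val, a_val, b_val = (
        float(re.sub(",", ".", field)) for field in content[-3:]
    )
    return name, L_val, a_val, b_val


colors, L, a, b = [], [], [], []

# Stand-in for the lines read from the chosen file.
sample_lines = [
    "HLC 010 50 60\t50\t59,1\t10,4\n",
    "HLC 010 50 70\t50\t68,9\t12,2\n",
]

for line in sample_lines:
    name, L_val, a_val, b_val = parse_color_line(line)
    colors.append(name)
    L.append(L_val)
    a.append(a_val)
    b.append(b_val)
```

In the real script the loop would run over the opened file object instead of sample_lines.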

Creating the Colors
This is one of the simplest parts of the script: loop through the color data lists and define each color with a single Scribus command. It's worth mentioning that this command is only available as of Scribus version 1.5.4svn.
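A minimal sketch of that loop, assuming the command in question is defineColorLab (name, L, a, b), which is the Lab color command I know of in the 1.5.x Scripter. The helper name create_lab_colors is hypothetical, and the scribus module is passed in as sc so the function can also be tried outside Scribus.

```python
def create_lab_colors(sc, colors, L, a, b):
    """Define one Lab color per entry; `sc` is the scribus module.

    Assumes sc.defineColorLab(name, L, a, b) is available
    (Scribus 1.5.4svn or later, per the text above).
    """
    for name, L_val, a_val, b_val in zip(colors, L, a, b):
        sc.defineColorLab(name, L_val, a_val, b_val)
    return len(colors)
```

Inside Scribus this would be called as create_lab_colors(scribus, colors, L, a, b) after import scribus.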

Now the Document
Once we have the colors, the next step is to display them along with information about them. The document was to be A4 paper, with units in millimeters. Looping through the colors list creates the color patches. The variables xpos and ypos begin at 25 and 30, after which the grid is laid out arithmetically, stepping 25 mm in the X direction along a row and 35 mm in the Y direction to start the next row.
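The grid arithmetic can be sketched like this. The starting point (25, 30) and the 25 mm and 35 mm steps come from the description above; the number of patches per row and rows per page are my own assumptions, chosen only so that the grid fits on an A4 page, and patch_positions is a hypothetical helper name.

```python
def patch_positions(count, per_row=7, rows_per_page=7,
                    x0=25, y0=30, dx=25, dy=35):
    """Yield (page, xpos, ypos) in mm for `count` color patches.

    per_row and rows_per_page are assumed values, not taken from
    the original script.
    """
    per_page = per_row * rows_per_page
    for i in range(count):
        page, slot = divmod(i, per_page)   # which page, which slot on it
        row, col = divmod(slot, per_row)   # position within the grid
        yield page + 1, x0 + col * dx, y0 + row * dy


positions = list(patch_positions(9))
```

Each (xpos, ypos) pair would then be handed to createRect and the patch filled with setFillColor using the matching color name.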

Creating the associated label text frames is best done right after each color patch is created, since we can use its xpos and ypos values as a reference. The script also uses a variable called spacer. This was created because of the variability in the length of color names; some used only one line, some two, a few three, and rarely four. Trial and error came up with the scheme of using 18 characters as the decision point for whether or not to add an extra newline character. It's imperfect, since a proportional font is used and since line breaks depend on the length of, and spacing between, the parts of the name. The resulting label size takes care of most situations, but occasionally the height was too small. The answer came from checking for overflowing text, which needed to be done after the particular font and font size were set.
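Here is a rough sketch of both ideas. The direction of the spacer test (short names get the extra newline) is a guess on my part, as are the step size and maximum height; label_spacer and grow_until_fits are hypothetical names. The Scripter calls used are textOverflows and sizeObject, and sc stands for the scribus module.

```python
def label_spacer(name, limit=18):
    """Return the newline(s) placed between the name and the L, a, b data.

    Assumption: names up to `limit` characters get one extra newline.
    """
    return "\n\n" if len(name) <= limit else "\n"


def grow_until_fits(sc, frame, width, height, step=2.0, max_height=60.0):
    """Enlarge a text frame until its text no longer overflows.

    Call this only after the font and font size (or style) are set.
    """
    while sc.textOverflows(frame) > 0 and height < max_height:
        height += step
        sc.sizeObject(width, height, frame)
    return height
```

The max_height limit is only there so that a frame which can never fit its text does not loop forever.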

Working with Fonts and Styles
Early on in the development of this script, I had the idea of using styles. The main reason was that, with styles, the user could change fonts as desired after the script ran much more easily, just as on the main canvas. Setting up styles is relatively straightforward, but does have some particulars about it. The process is to first create a Character Style, where you assign a particular font and its size. Next you create a Paragraph Style using that Character Style. All of the settings in the Paragraph Style must have an entry if you are going to use the last one, the Character Style name. From left to right, the settings are: name for the style (string), linespacingmode (integer), linespacing (float), alignment (integer), left margin (float), gap before (float), first indent (float), hasdropcap (binary integer), dropcaplines (integer), dropcapoffset (float), character style name (string).
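As a sketch of the two-step process, the following creates a Character Style and then a Paragraph Style that refers to it. The style names and the default size are placeholders, and make_label_styles is a hypothetical helper. It also assumes that createParagraphStyle accepts the keyword names name and charstyle, as the Scripter help text suggests; if that does not hold in your version, fill in every positional value as described above.

```python
def make_label_styles(sc, bold_font, book_font, size=8.0):
    """Create two character styles and paragraph styles built on them.

    `sc` is the scribus module; font names must exist on your system.
    """
    styles = [("LabelName", bold_font), ("LabelData", book_font)]
    for style, font in styles:
        char_style = style + "Char"
        # Step 1: the Character Style carries the font and its size.
        sc.createCharStyle(char_style, font, size)
        # Step 2: the Paragraph Style points at that Character Style
        # (keyword form is an assumption, see the note above).
        sc.createParagraphStyle(name=style, charstyle=char_style)
    return [style for style, _ in styles]
```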

If you look at the labels in the image up higher on this page, you see I've used a bold font for the name, and regular or book style for the L, a, and b values. When I started out, I was only using one style for the entire label, and this is where I ran into a problem. On the main canvas, when you select a text frame, then select a Paragraph Style, it's applied to all the text in the frame. Unfortunately, on trying this in the script, only the last line received the assigned Paragraph Style, so it seemed that the newline character is a barrier to setting all the text. I even tried to deselect all objects, then select the frame, but this was of no help.

The initial fallback plan was to assign a font to the frame using setFont, then setFontSize. This was clumsier, but it did work. Eventually I did find the solution, which came from selecting all the text of the frame. To do this easily and repeatedly throughout the script I made a function, since there is no built-in command in Scribus to select all the text in a frame; selectText requires you to specify a starting point and the number of characters to select. After this, I could first apply the bold text style to the entire frame, then apply the book style to a subset. In the script, the variable data is the entire string of the L, a, b values, including the necessary newline characters, which must be considered when counting characters in a text frame.
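A sketch of such a select-all function and of the two-pass styling follows. The names select_all_text and style_label are hypothetical, as is the spacer argument. As far as I know, the style-applying command is setParagraphStyle in newer 1.5.x builds and setStyle in older ones, so the sketch takes whichever is present; treat that as an assumption. sc stands for the scribus module.

```python
def select_all_text(sc, frame):
    """Select every character in `frame` using getTextLength + selectText."""
    length = sc.getTextLength(frame)
    sc.selectText(0, length, frame)
    return length


def style_label(sc, frame, name, spacer, data, name_style, data_style):
    """Apply name_style to the whole frame, then data_style to the data."""
    # Assumption: one of these two commands exists in the running version.
    apply_style = getattr(sc, "setParagraphStyle", None) or sc.setStyle
    select_all_text(sc, frame)
    apply_style(name_style, frame)
    # The data starts after the name and the spacer; newlines count
    # as characters when working out the offset.
    sc.selectText(len(name) + len(spacer), len(data), frame)
    apply_style(data_style, frame)
```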

The Challenge of Character Styles
Something I tried and at first failed at was to see if I could manage to set kerning for a character style. There is something called "tracking" in the specifications, but that apparently isn't it. In the meantime, if you are going to try to set more variables for createCharStyle(), there is a "legal" format to follow. Just like createParagraphStyle, you have to fill in all the variables until you get to the last one you wish to set. The first 2 are strings, the third is a floating point number, next a string, then a string, floating point, string, num, num, num, num, num, then float. I took this last one to be tracking, but it has no apparent effect that I can see onscreen, although it changes something called TXTULP in the saved SLA file. In retrospect, I think this is the value for underline offset (see below).

Problem Solved!
Here is what finally worked. It turns out that the tracking (kerning) is the 19th value for this command. It's also important to set the two preceding values correctly. These are respectively the horizontal scaling and vertical scaling of the characters. We want to leave these alone, i.e., at the normal 100%, and the percentage is this value times 100, so the value to enter is 1.0. Weirder than this is that -50 is the setting to create a Kerning of -5.0% (!).
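One way to avoid counting to the 19th position is to use keyword arguments. This sketch assumes createCharStyle accepts the keyword names name, font, fontsize, scaleh, scalev and tracking, which is how I understand the Scripter help text; it is not the original listing, and make_tracked_char_style is a hypothetical helper. sc stands for the scribus module.

```python
def make_tracked_char_style(sc, name, font, size, tracking_percent):
    """Create a character style with the given tracking in percent.

    Scaling stays at 1.0 (shown as 100%), and the tracking value is
    ten times the percentage, so -5.0% is passed as -50.
    """
    tracking_value = tracking_percent * 10
    sc.createCharStyle(
        name=name,
        font=font,
        fontsize=size,
        scaleh=1.0,
        scalev=1.0,
        tracking=tracking_value,
    )
    return tracking_value
```

If the keyword form is not accepted, the positional form with all 19 values filled in, as described above, is the fallback.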

A Funny Thing Happened on the Way to the Paragraph Styles
I wasn't expecting it, but a "side-effect" of creating and using these styles instead of setting fonts and font sizes manually was a tremendous increase in the speed of the script. If you can imagine running this script on a file that might have hundreds or thousands of colors in it, you wouldn't be surprised at a long running time. In a few cases it was more than an hour. The largest file I have took 2 hrs, 21 min, and 19 sec to run, creating a document of 110 pages. After using styles this was reduced to 26 min, 59 seconds. Another file that took 5 minutes, 31 seconds, was reduced to 27 seconds. It's not quite clear to me why and where this speed-up occurs, but somehow applying a style with all of its features at once is much faster than applying its individual elements, and if you have 2000 colors, you have 2000 associated text frames.

So, You Use a Stopwatch with Scripts?
Actually, at first I was, but then I built a timing mechanism into the script, using the Python module datetime. At the top of the script you import datetime, somewhere early in the script you record the start time, and somewhere near the end you work out the elapsed time and display it. This gives you a nice little messageBox telling you how long it took the script to run, looking something like 00:02:18, which would indicate 2 minutes, 18 seconds.
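A minimal sketch of such a timer is shown here. The helper format_elapsed is a hypothetical name, and the exact formatting used in the original script is not known to me; this version simply produces the HH:MM:SS form mentioned above.

```python
import datetime


def format_elapsed(start, end):
    """Return the time between two datetimes as HH:MM:SS."""
    seconds = int((end - start).total_seconds())
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return "{:02d}:{:02d}:{:02d}".format(hours, minutes, secs)


# Somewhere early in the script:
start = datetime.datetime.now()

# ... the work of the script goes here ...

# Somewhere near the end:
elapsed = format_elapsed(start, datetime.datetime.now())
```

Inside Scribus the elapsed string would then be passed to messageBox as the message text.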

Special Characters
One of the things I wanted to do was to enter the special page number character on a Master Page I created for the script. When you edit a page in Story Editor, it shows up as a red '#', and of course in the document it shows up as the page number for the page that it is on. In a text file saved from Story Editor, it only showed up as '^^', though the color was purple. I asked one of the devs what it was, and he told me the decimal value was 30. To use this in a script you would specify chr(30). Later I found out that in hex it is 0x1e, so in Python you could alternatively write '\x1e'.
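Both spellings give the same one-character string, as this short sketch shows. The constant names and the footer_text helper are made up for illustration.

```python
# The page number placeholder, written two equivalent ways.
PAGE_NUMBER_CHAR = chr(30)   # decimal value 30
PAGE_NUMBER_HEX = "\x1e"     # the same character as a hex escape


def footer_text(prefix="Page "):
    """Build a footer string ending in the page number placeholder."""
    return prefix + PAGE_NUMBER_CHAR
```

The resulting string would be placed in a text frame on the Master Page with setText or insertText.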

Now Add Page Count
We can also use a special character for the page count, the number of pages in the document. This is chr(23), or in hex 0x17, so in Python '\x17' works as well.
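Putting the two special characters together gives the familiar "page x of y" footer. This is only a sketch; the constant names, the wording of the footer, and the page_of_pages helper are my own choices.

```python
PAGE_NUMBER_CHAR = chr(30)   # current page number (hex 0x1e)
PAGE_COUNT_CHAR = chr(23)    # total page count (hex 0x17)


def page_of_pages(template="Page {num} of {count}"):
    """Fill a footer template with the two Scribus placeholders."""
    return template.format(num=PAGE_NUMBER_CHAR, count=PAGE_COUNT_CHAR)
```

Scribus replaces each placeholder when the page is displayed or exported, so the same string works on every page that uses the Master Page.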