Advanced Graphing

This article describes a script that creates axes with tick marks, then plots data points to the scale of the graph.

This article explains the workflow that led to the script, and I imagine it will continue to suggest improvements. I wanted to accumulate information about when orders came in by time of day, then see whether patterns became apparent as the data accumulated.

Beginnings
Like many such projects, it began quite manually: making a graph structure by hand, trying out and not liking a grid, and in essence making my own grid to lay out data points. I was getting the data from a spreadsheet, collating it manually since there wasn't much at first, then making a polyline, again manually. This gave me an opportunity to fiddle with various parameters to get the look I wanted. This ugly self-created grid would not be printed or exported to the PDF.



The data collection began in mid-April and continues. Quite early on, as you can easily see in this graph from late April, there was an interesting peak at 4pm. If nothing else, this led me to continue the process to see what would happen with larger numbers.

As I began updating the graph, I realized I needed some relief from the manual drudgery. The first thing I did was to enter the information I wanted into a PostgreSQL database, so I could dissect it in various ways, such as by time of day. Later, as the amount of data rose, I wanted to be able to look at it by day of week as well. PostgreSQL helped here too: I could, for example, transform all instances of 23May to M23May, making it easier to select by day of week.

This was still quite manual, since I would export an ordered list to a text file, edit it a bit to make it easier to count how many orders came in per hour, then use Scribus (Properties > Shape > Edit) to manually change the nodes of the graph line as needed. Doable but tedious. What became especially annoying was counting lines of a text file, but of course PostgreSQL can run an external script if you write one:

select * from orders order by time;
select * from orders where time < '01:00' order by time;
select * from orders where time > '00:59' and time < '02:00' order by time;
select * from orders where time > '01:59' and time < '03:00' order by time;
select * from orders where time > '02:59' and time < '04:00' order by time;
select * from orders where time > '03:59' and time < '05:00' order by time;
select * from orders where time > '04:59' and time < '06:00' order by time;
select * from orders where time > '05:59' and time < '07:00' order by time;
select * from orders where time > '06:59' and time < '08:00' order by time;
select * from orders where time > '07:59' and time < '09:00' order by time;
select * from orders where time > '08:59' and time < '10:00' order by time;
select * from orders where time > '09:59' and time < '11:00' order by time;
select * from orders where time > '10:59' and time < '12:00' order by time;
select * from orders where time > '11:59' and time < '13:00' order by time;
select * from orders where time > '12:59' and time < '14:00' order by time;
select * from orders where time > '13:59' and time < '15:00' order by time;
select * from orders where time > '14:59' and time < '16:00' order by time;
select * from orders where time > '15:59' and time < '17:00' order by time;
select * from orders where time > '16:59' and time < '18:00' order by time;
select * from orders where time > '17:59' and time < '19:00' order by time;
select * from orders where time > '18:59' and time < '20:00' order by time;
select * from orders where time > '19:59' and time < '21:00' order by time;
select * from orders where time > '20:59' and time < '22:00' order by time;
select * from orders where time > '21:59' and time < '23:00' order by time;
select * from orders where time > '22:59' order by time;
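For comparison, the same hourly tally could be computed in Python from an exported list of times. This is only an alternative sketch, not the method used in this article: the function name orders_per_hour and the sample times are hypothetical, and it assumes each time is exported as an HH:MM string.

```python
from collections import Counter

def orders_per_hour(times):
    """Count orders in each hour of the day from 'HH:MM' strings."""
    counts = Counter(int(t.split(":")[0]) for t in times)
    # Return 24 values so that hours with no orders still appear as 0.
    return [counts.get(hour, 0) for hour in range(24)]

# Hypothetical sample data, for illustration only.
sample = ["00:15", "00:59", "01:00", "16:05", "16:40", "23:30"]
hourly = orders_per_hour(sample)
```

The resulting list has one entry per hour, which matches the one-query-per-hour layout of the SQL script.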

Finally, On to the Script
I went back and looked at Creating a Graph, Part 2, which is simple, and has a number of elements of a graph, but not the actual line itself. I decided not to have the script create headers and labels for the axes, but just focus on the graph. One of the things that happened as time went on was that the Y-axis scale needed to change periodically, so I wanted to make something a bit generic.

Although up to this point my graphs did not use tick marks, I wanted to add that feature, though with the ticks placed outside the lines rather than inside. So I left two value dialogs: one to specify the maximum value on the Y-axis, and one for the interval at which I wanted tick marks. If I was going to do this, I might as well automatically generate the X-axis, but here I would have a very constant X-axis, denoting the times of day. The times were in a text frame with custom tabs, so I moved that frame to a Master Page, once I had settled on a spacing of the tick marks.
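The arithmetic behind the two Y-axis dialogs might look something like the following. It is a minimal sketch under my own assumptions: the function name y_ticks and the top and height parameters (the page position and size of the graph area) are hypothetical, not taken from the script.

```python
def y_ticks(ymax, interval, top, height):
    """Return (value, page_y) pairs for ticks from 0 up to ymax.

    Scribus page coordinates grow downward, so value 0 sits at the
    bottom of the graph (top + height) and ymax sits at top.
    """
    ticks = []
    value = 0
    while value <= ymax:
        page_y = top + height - (float(value) / ymax) * height
        ticks.append((value, page_y))
        value += interval
    return ticks

# Hypothetical numbers: maximum of 40, a tick every 10 units.
ticks = y_ticks(ymax=40, interval=10, top=100, height=200)

# Inside Scribus, each tick could then be drawn to the left of the axis:
#   scribus.createLine(left - ticklen, page_y, left, page_y)
# where left and ticklen are further assumed layout values.
```

Because only ymax and interval change, the same code copes with a Y-axis scale that has to be adjusted from time to time.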

Why not just have the entire X-axis on the Master Page? That would be easy enough, but when I make the polyline for the graph I need those X-position values for the tick marks. Generating them on the fly for both the graph and the tick marks ensures that they line up every time.
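A sketch of that idea: compute the X positions once, then reuse the same list for the tick marks and for the graph line. The names x_positions, left, spacing, bottom and ticklen are hypothetical, as are the numbers.

```python
def x_positions(left, spacing, count):
    """Evenly spaced page X coordinates, one per data point."""
    return [left + i * spacing for i in range(count)]

# One position per hour of the day (assumed layout values).
xpos = x_positions(left=100, spacing=20, count=24)

# Tick marks below the axis use exactly the same X values as the line,
# so the two cannot drift apart.
bottom, ticklen = 300, 5
tick_segments = [(x, bottom, x, bottom + ticklen) for x in xpos]
```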

So now you see the current final version of the script. One feature of the createPolyLine command is that it wants the values to alternate – x1, y1, x2, y2, and so on. Furthermore, when it says it wants a list it means it wants a Python list, i.e., an array name. Otherwise you get an error that it can only have a maximum of 2 elements, the list name and an optional name for the Polyline. This explains the code that creates the polyvalue list in the script. I also realize that I could have combined this interleaving in the first while loop, but separating it makes understanding the code easier, in my view.
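The interleaving can be illustrated as follows. This is a sketch rather than the script's own code: the helper name interleave and the sample coordinates are made up, while polyvalue follows the list name mentioned above.

```python
def interleave(xs, ys):
    """Return one flat list [x1, y1, x2, y2, ...]."""
    if len(xs) != len(ys):
        raise ValueError("need exactly one y for every x")
    flat = []
    for x, y in zip(xs, ys):
        flat.append(x)
        flat.append(y)
    return flat

# Hypothetical page coordinates for three points.
xpos = [100, 120, 140]
ypos = [300, 250, 275]
polyvalue = interleave(xpos, ypos)

# Inside Scribus, with a document open, the whole list is passed as a
# single argument:
#   poly = scribus.createPolyLine(polyvalue)
```

Passing the coordinates as separate arguments instead of one list is what triggers the "maximum of 2 elements" error described above.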

Update!

Out of intellectual curiosity, I decided to go ahead and import the data from a file, the format of which corresponds to the data.txt file you see above. First I made a Perl script, and since I could use regexp easily in Perl, I could strip out the parts I needed. But this still left me with something I would be running outside of the script in Scribus, so I decided to write a similar function in Python.

Python can use the fileinput module to get lines from a file. We skip over the first line, since that is the total number of orders, something not graphed. Next we strip off the initial "(" with line[1:], and then we can use the split string function to break the line at the space, putting the parts in the list bits[ ]. Now bits[0] will contain the string for our first value, so we convert to integer and append to the yvalue[ ] list.
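The parsing steps just described could be sketched like this. The function name read_yvalues, the skip_first flag and the demo numbers are my own assumptions; the sketch also assumes Python 3, where fileinput.input can be used in a with statement.

```python
import fileinput
import os
import tempfile

def read_yvalues(path, skip_first=True):
    """Read counts from lines such as "(192 rows)"."""
    yvalue = []
    with fileinput.input(files=[path]) as f:
        for line in f:
            if skip_first and f.isfirstline():
                continue  # first line is the grand total, not graphed
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            bits = line[1:].split(" ")  # drop "(" then split at the space
            yvalue.append(int(bits[0]))
    return yvalue

# Throwaway demo file; the numbers are made up for illustration.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("(50 rows)\n(12 rows)\n(7 rows)\n(31 rows)\n")
yvalue = read_yvalues(tmp.name)
os.remove(tmp.name)
```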

One final nicety was to show our values in the dialog asking about the Y-Axis maximum value, and for this we need the string yvalues.
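Building that string might look like this. It is a small sketch; the sample values and the wording of the dialog message are assumptions.

```python
# Hypothetical data values, as they might come from the data file.
yvalue = [12, 7, 31]

# Join the integers into one readable string for the dialog text.
yvalues = ", ".join(str(v) for v in yvalue)
message = "Data values: " + yvalues + "\nMaximum value for the Y-axis?"

# Inside Scribus the string would be shown with something like:
#   ymax = int(scribus.valueDialog("Y-Axis", message, str(max(yvalue))))
```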



A newer addition is the capability to choose to not draw the axes. A Create Axes? dialog appears with a 'Yes' default, and changing to anything but that results in only the polyline being created. In case you've seen previous versions, I've also rearranged some lines so that only one test for 'Yes' is needed.

Some other things we might consider:
 * adding a line to select the polyline at the end of the script: scribus.selectObject(poly)
 * adding a line to the data.txt file which could be imported as a color for the polyline.
 * making the script recursive.
 * a bigger project, of course, would be to allow for manipulation of the X-axis.



Variation on a Theme: adv_graphing.py
Here is a slight variation that is less hardwired in its treatment of the X-axis: it looks at the number of data points, then creates the matching number of X-axis ticks. If nothing else, this avoids a script failure when the number of points expected does not match the number actually in the data file.

An example data file might look like this:

(192 rows)
(171 rows)
(184 rows)
(180 rows)
(180 rows)
(167 rows)
(167 rows)
(153 rows)

As before, the parsing strips off the first character, then uses the split string function to choose only the numbers from each line as our data points (Y values), converting the string to integer. This time, we're using all the lines as data, not ignoring the first.
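Using the example data above, the approach of this variation can be sketched as follows. The helper names parse_rows and x_positions_for and the left and width values are hypothetical; only the data lines come from the example.

```python
def parse_rows(lines):
    """Turn lines such as "(192 rows)" into a list of integers."""
    yvalue = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        bits = line[1:].split(" ")  # drop "(" then split at the space
        yvalue.append(int(bits[0]))
    return yvalue

def x_positions_for(count, left, width):
    """Spread count points evenly across the width of the graph."""
    if count < 2:
        return [float(left)] * count
    spacing = float(width) / (count - 1)
    return [left + i * spacing for i in range(count)]

lines = ["(192 rows)", "(171 rows)", "(184 rows)", "(180 rows)",
         "(180 rows)", "(167 rows)", "(167 rows)", "(153 rows)"]
yvalue = parse_rows(lines)  # every line is data this time

# The number of ticks follows the data, so the two always match.
xpos = x_positions_for(len(yvalue), left=100, width=350)
```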

This version also gives you a dialog for choosing the color of the line. The first step is to see what color choices are available, so we call getColorNames and then display the names in the valueDialog. I had thought the script would crash if you misspelled a name so that it matched nothing available, but for whatever reason, Black will be chosen (perhaps some default behavior of Scribus). If you were using the Scribus Gnome color set, this is what it would look like:



Even if the color name has a space in the middle of it, you must enter it exactly as seen in this dialog, and it will work (see 'Accent_Red Dark', where presumably a typo left out a second underscore).
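Rather than relying on whatever Scribus does with an unknown name, the choice could be checked explicitly. This is a sketch under assumptions: pick_color is a hypothetical helper, and the short list of names stands in for what getColorNames would return.

```python
def pick_color(choice, available, fallback="Black"):
    """Return choice if it is an exact, existing color name."""
    if choice in available:
        return choice
    return fallback  # misspelled or unknown name

# Stand-in for scribus.getColorNames(); the real list depends on the
# color set loaded in the document.
names = ["Black", "White", "Accent_Red Dark"]

exact = pick_color("Accent_Red Dark", names)  # space in the name is fine
typo = pick_color("Blak", names)              # falls back to Black

# Inside Scribus the pieces might be combined like this:
#   names = scribus.getColorNames()
#   choice = scribus.valueDialog("Line Color", ", ".join(names), "Black")
#   scribus.setLineColor(pick_color(choice, names), poly)
```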